An analysis of NYC parking violation datasets using PySpark, SparkSQL and MapReduce with Hadoop Streaming