Project author: trdtnguyen

Project description:
Post-Sale Automobile Report - Using Spark
Primary language: Python
Project URL: git://github.com/trdtnguyen/sb-miniproject6.git
Created: 2020-12-05T01:24:41Z
Project community: https://github.com/trdtnguyen/sb-miniproject6

License: GNU General Public License v3.0

sb-miniproject6

Post-Sale Automobile Report - Using Spark
This project does the same job as sb-miniproject5, but uses a Spark job instead of Hadoop's MapReduce.

The purpose of this project is to illustrate the power of Spark compared to Hadoop MapReduce. The final solution is simpler and runs faster.

Requirements

  • Hadoop and Spark are installed and configured properly.
  • The pyspark module is installed.
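
The requirements above can be checked before running the project. The following is a minimal sketch, not part of the repository: the function name check_requirements is hypothetical, and it assumes the Hadoop and Spark installations expose the hadoop and spark-submit executables on PATH. It only confirms that the tools and the pyspark module can be found, not that they are configured correctly.

```python
import importlib.util
import shutil


def check_requirements(commands=("hadoop", "spark-submit"), modules=("pyspark",)):
    """Return a dict mapping each requirement name to True if it was found.

    Hypothetical helper: the default command names are an assumption about
    how Hadoop and Spark are installed on the machine.
    """
    found = {}
    for cmd in commands:
        # shutil.which returns None when the executable is not on PATH
        found[cmd] = shutil.which(cmd) is not None
    for mod in modules:
        # find_spec returns None when a top-level module cannot be located
        found[mod] = importlib.util.find_spec(mod) is not None
    return found


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

If pyspark is reported as missing, it can usually be installed with pip install pyspark.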

Setup and Run project

Clone the project into your local working directory:

  $ git clone https://github.com/trdtnguyen/sb-miniproject6.git
  $ cd sb-miniproject6

To run the project, execute run.sh:

  $ ./run.sh