Project author: fawind

Project description:
Spark hands-on exercise for the lecture Distributed Data Analytics
Language: Scala
Project URL: git://github.com/fawind/spark-examples.git
Created: 2018-01-17T15:50:53Z
Project community: https://github.com/fawind/spark-examples

License:



Spark Examples

Spark hands-on exercise for the lecture Distributed Data Analytics.

Task

Usage

  1. Build a fat JAR using sbt assembly
  2. Run the main method with the following program arguments:
    • --path <path to folder> - Path to the folder containing the dataset CSV files. Optional, defaults to ./TPCH.
    • --paths <fileA,fileB,fileC> - Direct paths to the dataset files, separated by commas. Optional, falls back to the --path argument.
    • --cores <number of cores> - Number of local cores to use. Optional, defaults to 4.
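
The argument handling described above could be sketched roughly as follows. This is only an illustration of the documented flags and defaults, not the repository's actual code: the names Config, ArgsSketch, and parseArgs are hypothetical, and the real project may parse its arguments differently.

```scala
// Hypothetical sketch of the documented flags; not taken from the repository.
final case class Config(
    path: String = "./TPCH",            // --path, default as documented
    paths: Option[List[String]] = None, // --paths, None means "fall back to --path"
    cores: Int = 4                      // --cores, default as documented
)

object ArgsSketch {
  // Consumes the argument list two tokens at a time (flag, then value).
  @annotation.tailrec
  def parseArgs(args: List[String], config: Config = Config()): Config =
    args match {
      case "--path" :: value :: rest =>
        parseArgs(rest, config.copy(path = value))
      case "--paths" :: value :: rest =>
        // Comma-separated file list, trimmed of surrounding whitespace.
        val files = value.split(",").map(_.trim).toList
        parseArgs(rest, config.copy(paths = Some(files)))
      case "--cores" :: value :: rest =>
        parseArgs(rest, config.copy(cores = value.toInt))
      case Nil =>
        config
      case unknown :: _ =>
        // Reached for an unrecognized flag or a flag without a value.
        throw new IllegalArgumentException(s"Unknown or incomplete argument: $unknown")
    }
}
```

For example, ArgsSketch.parseArgs(List("--cores", "8")) keeps the default path ./TPCH, leaves paths unset, and sets cores to 8. In a Spark application, a core count like this is commonly passed to the local master as local[8], though whether this project does so is an assumption.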