Project author: SatLight

Project description:
Builds a prediction model for a large dataset using big data technologies such as Apache Kylin and Apache Spark.
Language: Jupyter Notebook
Repository: git://github.com/SatLight/Airline-Delay-Prediction-using-Spark-and-Kylin.git
Created: 2020-12-11T06:53:57Z
Project page: https://github.com/SatLight/Airline-Delay-Prediction-using-Spark-and-Kylin

License: GNU General Public License v3.0

Airline-Delay-Prediction-using-Spark-and-Kylin

The Mapper and Reducer detect null values and replace them with the column average.
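The null-replacement idea can be sketched in plain Python as a rough illustration, not the repository's actual scripts. It assumes each record is a list of string cells, that missing cells are encoded as an empty string, `NA`, or `null`, and that every non-null cell is numeric. The names `NULL_MARKERS`, `mapper`, `reducer`, and `fill_nulls` are hypothetical.

```python
from collections import defaultdict

# Assumption: how missing cells are encoded in the raw data.
NULL_MARKERS = {"", "NA", "null"}


def mapper(rows):
    """Emit (column_index, value) for every non-null cell."""
    for row in rows:
        for idx, cell in enumerate(row):
            if cell not in NULL_MARKERS:
                yield idx, float(cell)


def reducer(pairs):
    """Average the values emitted for each column index."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for idx, value in pairs:
        totals[idx] += value
        counts[idx] += 1
    return {idx: totals[idx] / counts[idx] for idx in totals}


def fill_nulls(rows, averages):
    """Replace each null cell with its column average.

    A column that contains only nulls has no average and is filled with None.
    """
    return [
        [averages.get(idx) if cell in NULL_MARKERS else float(cell)
         for idx, cell in enumerate(row)]
        for row in rows
    ]


# Tiny in-memory example with two columns and two missing cells.
rows = [["10", "5"], ["", "7"], ["20", "NA"]]
averages = reducer(mapper(rows))
filled = fill_nulls(rows, averages)
```

Replacement needs two passes over the data: the averages must be known before any null can be filled, so the map and reduce steps run first and the fill step runs afterwards.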

The model is built in PySpark using Decision Tree Classifiers.
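A minimal sketch of training a decision tree classifier with PySpark's `pyspark.ml` API is shown here; it is not taken from the project's notebook. The function name `train_delay_tree`, its parameters, the `max_depth` default, and the column names in the usage note are assumptions for illustration.

```python
def train_delay_tree(df, feature_cols, label_col, max_depth=5):
    """Fit a decision tree on a Spark DataFrame and return the fitted pipeline.

    df           -- Spark DataFrame with numeric feature columns and a numeric label
    feature_cols -- list of feature column names (assumed, not from the project)
    label_col    -- name of the label column (assumed, not from the project)
    """
    # Imports are deferred so this module can be loaded without Spark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import DecisionTreeClassifier
    from pyspark.ml.feature import VectorAssembler

    # Spark ML estimators expect all features packed into one vector column.
    assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
    tree = DecisionTreeClassifier(
        labelCol=label_col, featuresCol="features", maxDepth=max_depth
    )
    return Pipeline(stages=[assembler, tree]).fit(df)
```

With a DataFrame `flights` holding hypothetical columns such as `distance`, `dep_hour`, and a 0/1 `delayed` label, a call would look like `model = train_delay_tree(flights, ["distance", "dep_hour"], "delayed")`, after which `model.transform(flights)` adds a `prediction` column.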

Dataset