Author: rspai

Description:
Predictive analytics with Spark
Primary language: Jupyter Notebook
Repository: git://github.com/rspai/ML_with_Spark.git
Created: 2020-06-11T03:56:07Z
Project page: https://github.com/rspai/ML_with_Spark



ML_with_Spark

Predictive analytics with Spark

This project implements a movie genre prediction model using Apache Spark. "train.csv" contains plot summaries of around 31K movies along with their genres. "test.csv" contains only plot summaries, for which the genres are to be predicted.

Predicting the genre is a multi-label classification problem: a movie can have several genres associated with it, and the model should predict all of the genres associated with each movie.
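
One common way to frame a multi-label target is to turn each movie's genre list into a multi-hot vector and then treat every genre as its own binary label (often called binary relevance). The sketch below illustrates only that encoding step in plain Python. The genre vocabulary, the sample rows, and the helper names are hypothetical, and this is not necessarily the approach taken in the project's notebook.

```python
# Hypothetical genre vocabulary and sample rows, for illustration only;
# the actual columns and genre names in train.csv may differ.
GENRES = ["Action", "Comedy", "Drama", "Thriller"]


def encode_genres(movie_genres, vocabulary=GENRES):
    """Return a multi-hot vector: 1 where the movie has that genre, else 0."""
    present = set(movie_genres)
    return [1 if genre in present else 0 for genre in vocabulary]


def decode_genres(vector, vocabulary=GENRES):
    """Map a multi-hot vector back to the list of genre names."""
    return [genre for genre, flag in zip(vocabulary, vector) if flag == 1]


def per_genre_labels(rows, vocabulary=GENRES):
    """Split multi-label targets into one binary label column per genre."""
    encoded = [encode_genres(genres, vocabulary) for _, genres in rows]
    return {
        genre: [vector[i] for vector in encoded]
        for i, genre in enumerate(vocabulary)
    }


# Each row pairs a plot summary with its list of genres.
rows = [
    ("A detective hunts a killer.", ["Thriller", "Drama"]),
    ("Two friends open a bakery.", ["Comedy"]),
]

# One binary target column per genre, each usable by a separate binary classifier.
labels = per_genre_labels(rows)
```

With targets in this shape, one binary classifier can be trained per genre, and a movie's predicted genres are those whose classifier outputs 1.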