Project author: angeligareta

Project description:
This project aims to predict the delays on the Yellow taxi dataset, by implementing an application based on Apache Flink.
Primary language: Java
Repository: git://github.com/angeligareta/flink-overview.git
Created: 2020-01-05T23:08:29Z
Project community: https://github.com/angeligareta/flink-overview

License: MIT License


Flink Overview

Final project for the Cloud Computing and Big Data Ecosystems Design course of the EIT Digital Data Science master's programme at UPM.


Aim

This project aims to predict the delays on the Yellow taxi dataset by implementing an application based on Apache Flink. The goal is to report, for each hour and each vendorID, the trips that end at JFK airport with two or more passengers.

The output format is: vendorID, tpep_pickup_datetime, tpep_dropoff_datetime, passenger_count.
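
The selection rule and output format above can be illustrated with a minimal sketch in plain Java 8 lambdas. It deliberately uses no Flink dependency and is not the project's actual pipeline, and it leaves out the hourly grouping per vendorID. Several details are assumptions that the README does not state: the class name JfkTripFilter is hypothetical, the input is assumed to be comma-separated rows with passenger_count at column index 3 and RatecodeID at index 5, a RatecodeID of 2 is assumed to mark a JFK trip, and the two sample rows are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class JfkTripFilter {
    // Assumed column positions in a yellow taxi CSV row (not stated in the README).
    static final int PASSENGER_COUNT = 3;
    static final int RATECODE_ID = 5;
    // Assumption: RatecodeID 2 identifies a trip ending at JFK airport.
    static final String JFK_RATECODE = "2";

    // True when the row is a JFK trip carrying two or more passengers.
    static boolean isJfkTripWithTwoOrMore(String[] fields) {
        return fields.length > RATECODE_ID
                && JFK_RATECODE.equals(fields[RATECODE_ID].trim())
                && Integer.parseInt(fields[PASSENGER_COUNT].trim()) >= 2;
    }

    // Keeps only vendorID, tpep_pickup_datetime, tpep_dropoff_datetime, passenger_count.
    static String toOutput(String[] fields) {
        return String.join(",", fields[0], fields[1], fields[2], fields[3]);
    }

    // Split each line, keep the matching trips, and project them to the output format.
    static List<String> filter(List<String> lines) {
        return lines.stream()
                .map(line -> line.split(","))
                .filter(JfkTripFilter::isJfkTripWithTwoOrMore)
                .map(JfkTripFilter::toOutput)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample rows: the first matches, the second does not.
        List<String> sample = Arrays.asList(
                "1,2019-06-01 00:55:13,2019-06-01 01:30:02,2,17.2,2,N,48,132,1",
                "2,2019-06-01 00:10:00,2019-06-01 00:20:00,1,2.0,1,N,100,161,2");
        filter(sample).forEach(System.out::println);
    }
}
```

In a Flink job the same two lambdas would typically be passed to the filter and map steps of a DataStream, with the stream keyed by vendorID and windowed by hour; that wiring is omitted here so the sketch stays runnable without extra libraries.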

Tools

It is developed entirely in Java 8, using lambda expressions for the Apache Flink pipeline.

Authors