Project author: sm86

Project description:
My first analytics hackathon on Analytics Vidhya
Language: R
Repository: git://github.com/sm86/McKinsey-Analytics-Hackathon.git
Created: 2017-11-19T08:48:29Z
Project page: https://github.com/sm86/McKinsey-Analytics-Hackathon



McKinsey-Analytics-Hackathon

Problem Statement

Mission

You are working with the government to transform your city into a smart city. The vision is to convert it into a digital and intelligent city to improve the efficiency of services for the citizens. One of the problems faced by the government is traffic. You are a data scientist working to manage the traffic of the city better and to provide input on infrastructure planning for the future.

The government wants to implement a robust traffic system for the city by being prepared for traffic peaks. They want to understand the traffic patterns of the four junctions of the city. Traffic patterns on holidays, as well as on various other occasions during the year, differ from normal working days. This is important to take into account for your forecasting.

Task

To predict traffic patterns in each of these four junctions for the next 4 months.

About my approach

Language used: R
Architecture of Code

Data preparation and feature extraction: Code
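As a rough illustration of this step, the sketch below derives calendar features from a timestamp column using base R only. The column names (`DateTime`, `Junction`, `Vehicles`), the sample rows, and the helper `extract_time_features` are assumptions made for the example, not the repository's actual code.

```r
# Hypothetical sample rows; column names and values are assumptions.
train <- data.frame(
  DateTime = c("2017-06-30 23:00:00", "2017-07-01 00:00:00", "2017-07-01 08:00:00"),
  Junction = c(1, 1, 2),
  Vehicles = c(70, 62, 25),
  stringsAsFactors = FALSE
)

# Hypothetical helper: adds hour, weekday, month, year and a weekend flag.
extract_time_features <- function(df) {
  ts <- as.POSIXct(df$DateTime, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
  lt <- as.POSIXlt(ts)
  df$hour <- lt$hour              # 0..23
  df$weekday <- lt$wday           # 0 = Sunday ... 6 = Saturday
  df$month <- lt$mon + 1          # 1..12
  df$year <- lt$year + 1900
  df$is_weekend <- as.integer(lt$wday %in% c(0, 6))
  df
}

features <- extract_time_features(train)
print(features)
```

Tree-based models such as XGBoost cannot read a raw timestamp, so splitting it into numeric parts like these is a common way to expose daily and weekly patterns.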

I used an XGBoost model based on a few parameters: Code
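A minimal sketch of fitting an XGBoost regressor in R, assuming the `xgboost` package is installed (a recent version that supports the `reg:squarederror` objective). The synthetic data, the feature names (`hour`, `weekday`, `junction`), and the hyperparameter values are illustrative assumptions, not the settings used in this repository.

```r
library(xgboost)

set.seed(42)
n <- 500

# Synthetic stand-in for the prepared features (assumption, not real data).
hour <- sample(0:23, n, replace = TRUE)
weekday <- sample(0:6, n, replace = TRUE)
junction <- sample(1:4, n, replace = TRUE)
vehicles <- 10 + 5 * junction + 8 * sin(pi * hour / 12) + rnorm(n)

# xgboost expects a numeric matrix of features.
X <- cbind(hour = hour, weekday = weekday, junction = junction)
storage.mode(X) <- "double"
dtrain <- xgb.DMatrix(data = X, label = vehicles)

# Illustrative hyperparameters only.
params <- list(objective = "reg:squarederror", eta = 0.1, max_depth = 4)
model <- xgb.train(params = params, data = dtrain, nrounds = 100)

# In-sample predictions and training RMSE.
pred <- predict(model, dtrain)
train_rmse <- sqrt(mean((pred - vehicles)^2))
print(train_rmse)
```

The training RMSE printed here is optimistic; a held-out period at the end of the series would give a fairer estimate for a forecasting task.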

Then I tried creating a separate model for each junction and averaged the predictions over all the models: Code
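One way this could look is sketched below: a global model and one model per junction are fitted, and their predictions are averaged. To keep the example dependency-free, `lm` is swapped in for XGBoost. The exact averaging scheme, the synthetic data, and the helper `predict_blend` are assumptions rather than the repository's implementation.

```r
set.seed(1)
n <- 400

# Synthetic stand-in data (assumption): four junctions, hourly pattern.
train <- data.frame(
  junction = rep(1:4, each = n / 4),
  hour = rep(0:23, length.out = n)
)
train$vehicles <- 10 * train$junction + 0.5 * train$hour + rnorm(n)

# One global model that sees every junction.
global_fit <- lm(vehicles ~ hour + factor(junction), data = train)

# One model per junction, fitted only on that junction's rows.
junction_fits <- lapply(
  split(train, train$junction),
  function(d) lm(vehicles ~ hour, data = d)
)

# Hypothetical helper: average the global and per-junction predictions.
predict_blend <- function(newdata) {
  global_pred <- unname(predict(global_fit, newdata))
  local_pred <- numeric(nrow(newdata))
  for (j in names(junction_fits)) {
    idx <- newdata$junction == as.integer(j)
    if (any(idx)) {
      local_pred[idx] <- predict(junction_fits[[j]], newdata[idx, , drop = FALSE])
    }
  }
  (global_pred + local_pred) / 2
}

blend <- predict_blend(train)
blend_rmse <- sqrt(mean((blend - train$vehicles)^2))
print(blend_rmse)
```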

Possible improvements
  • ARIMA time series models
  • A public holiday detection feature
  • A season feature
  • The Prophet R package
  • Averaging over more models, as it helps stabilize predictions on unseen data.
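The holiday and season ideas above could be sketched in base R as follows. The holiday dates are placeholders (the problem statement does not name the city), the season mapping assumes northern-hemisphere meteorological seasons, and `add_calendar_features` is a hypothetical helper.

```r
# Placeholder holiday calendar (assumption); replace with the real local list.
holidays <- as.Date(c("2017-01-26", "2017-08-15", "2017-10-02"))

# Hypothetical helper: flags holidays and maps each date to a season.
add_calendar_features <- function(dates, holidays) {
  month <- as.integer(format(dates, "%m"))
  season <- ifelse(month %in% c(12, 1, 2), "winter",
            ifelse(month %in% 3:5, "spring",
            ifelse(month %in% 6:8, "summer", "autumn")))
  data.frame(
    date = dates,
    is_holiday = as.integer(dates %in% holidays),
    season = season,
    stringsAsFactors = FALSE
  )
}

dates <- as.Date(c("2017-01-26", "2017-07-04", "2017-10-02"))
calendar <- add_calendar_features(dates, holidays)
print(calendar)
```

Both columns can then be joined to the training rows by date, with `season` encoded as a factor or one-hot columns before it is passed to a model.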

Result

Leaderboard rank: 48

Evaluation metric (RMSE) = 8.035319
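For reference, RMSE (root mean squared error) is the square root of the mean squared difference between actual and predicted values. The small function below shows the calculation; the helper name `rmse` and the example vectors are made up for illustration and are unrelated to the leaderboard score.

```r
# Root mean squared error between two numeric vectors of equal length.
rmse <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  sqrt(mean((actual - predicted)^2))
}

# Hypothetical example values.
actual <- c(1, 2, 3)
predicted <- c(1, 2, 5)
score <- rmse(actual, predicted)
print(score)  # sqrt(4 / 3), approximately 1.1547
```

Because the errors are squared before averaging, a few large misses (for example on traffic peaks) raise RMSE more than many small ones.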

This was coded as part of a 24-hour hackathon hosted on the Analytics Vidhya platform on Nov 17-18, 2017. Link to hackathon