Project author: HaripriyaTV

Project description:
Will Big Mart prosper with their customers? Will Big Mart succeed with their sales? To answer these, here is "Big Mart Sales Prediction using R"
Language: R
Project URL: git://github.com/HaripriyaTV/Sales-cure-all...-or-will-they-.git


Sales-cure-all…-or-will-they-

Will Big Mart prosper with their customers? Will Big Mart succeed with their sales? To answer these, here is “Big Mart Sales Prediction using R”.

Sales prediction is a very common real-life problem that every company faces at least once in its lifetime. Done correctly, it can have a significant impact on the success and performance of that company. According to a study, companies with accurate sales predictions are 10% more likely to grow their revenue year-over-year and 7.3% more likely to hit quota.

Problem Statement

The data scientists at BigMart have collected sales data for 1559 products across 10 stores in different cities for the year 2013. Each product has certain attributes that set it apart from other products, and the same is true of each store.

The aim is to build a predictive model that estimates the sales of each product at a particular store, so that the decision makers at BigMart can identify the properties of a product or store that play a key role in increasing overall sales.

Data Source

The dataset is from the Big Mart Sales Practice Problem on Analytics Vidhya, available at https://datahack.analyticsvidhya.com/contest/practice-problem-big-mart-sales-iii/. Further details about the data can be found here.

Contents

The problem was handled in a structured way. The following table of contents was followed.

  1. Introduction
    • About data
    • Loading packages and data in R
    • Understanding the data
  2. Exploratory Data Analysis (EDA)
    • Univariate Analysis
    • Bivariate Analysis
    • Initial insights from the analyses
  3. Data Preparation
    • Missing Value Treatment
    • Feature Engineering
    • Encoding Categorical Variables
    • Preprocessing data
  4. Modelling
    • Building the following models:
      • Linear Regression
      • Regularized Linear Regression
      • Random Forest
      • Extreme Gradient Boosting
  5. Summary
    • Final insights from the prediction

All of the above gave me a leaderboard score of 1156.228 and rank #624 on Analytics Vidhya.
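
The Data Preparation and Modelling steps in the outline above can be illustrated with a minimal base-R sketch. It is not the code from this repository: it runs on synthetic data, the column names (Item_Weight, Item_MRP, Outlet_Type, Item_Outlet_Sales) and all values are assumptions made for illustration, and only the simplest of the listed models (linear regression) is shown. RMSE is used as the validation metric on the assumption that it is comparable to the leaderboard score; the text above does not state the metric.

```r
# Minimal base-R sketch of the outlined workflow on synthetic data.
# All column names and values are illustrative assumptions,
# not the actual Big Mart dataset.
set.seed(42)
n <- 200
sales_data <- data.frame(
  Item_Weight = round(runif(n, 5, 20), 1),
  Item_MRP    = round(runif(n, 30, 260), 2),
  Outlet_Type = factor(sample(c("Grocery Store", "Supermarket Type1"),
                              n, replace = TRUE))
)

# Synthetic target: sales grow with price and differ by outlet type
sales_data$Item_Outlet_Sales <- 10 * sales_data$Item_MRP +
  ifelse(sales_data$Outlet_Type == "Supermarket Type1", 800, 0) +
  rnorm(n, sd = 100)

# Blank out some weights to imitate missing values
sales_data$Item_Weight[sample(n, 20)] <- NA

# Missing value treatment: impute with the mean of the observed weights
mean_weight <- mean(sales_data$Item_Weight, na.rm = TRUE)
sales_data$Item_Weight[is.na(sales_data$Item_Weight)] <- mean_weight

# Encoding categorical variables: one indicator column per outlet type
outlet_dummies <- model.matrix(~ Outlet_Type - 1, data = sales_data)

# Hold out 20% of the rows for validation
train_idx <- sample(n, size = floor(0.8 * n))
train <- sales_data[train_idx, ]
valid <- sales_data[-train_idx, ]

# Linear regression; lm() encodes the factor internally,
# so the indicator matrix above is not needed for this model
fit <- lm(Item_Outlet_Sales ~ Item_Weight + Item_MRP + Outlet_Type,
          data = train)

# Root mean squared error on the held-out rows
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
valid_rmse <- rmse(valid$Item_Outlet_Sales, predict(fit, newdata = valid))
print(valid_rmse)
```

The explicit indicator matrix is included only to show what the "Encoding Categorical Variables" step produces; models that require a purely numeric input matrix would consume something like it, whereas lm() works directly from the factor column.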