Project author: peloyeje

Project description:
[MAP540] Cap Gemini
Primary language: Jupyter Notebook
Repository URL: git://github.com/peloyeje/map540-cg-datacamp.git
Created: 2018-01-19T12:07:31Z
Project page: https://github.com/peloyeje/map540-cg-datacamp

License:



Datacamp - CapGemini

MAP540 - group 7

Summary

This repository contains the code for our solution to the CapGemini datacamp.
The aim was to extract the main pain points expressed by smartphone owners from a broad range of structured and unstructured data sources (Twitter, Reddit, discussion boards, marketplaces). The scope was limited to the iPhone X and the Samsung Galaxy S8.

Project architecture

  • The EDA notebooks live in the notebooks folder. They cover text preprocessing, topic modeling and semi-supervised classification.
  • The scrapers folder contains the Scrapy spiders we created to acquire the data.
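
To give a rough idea of what the text preprocessing step could look like, here is a minimal sketch using only the Python standard library. It is not taken from the notebooks: the function names (preprocess, top_terms), the tiny stopword list and the sample posts are all hypothetical and made up for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "my", "and", "on", "of", "to", "it", "so", "i"}

# Matches http(s) URLs and @mentions, which carry little topical signal.
URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
# Matches runs of lowercase letters and digits (applied after lowercasing).
TOKEN = re.compile(r"[a-z0-9]+")


def preprocess(text):
    """Lowercase, drop URLs and @mentions, tokenize, and remove stopwords."""
    cleaned = URL_OR_MENTION.sub(" ", text.lower())
    return [tok for tok in TOKEN.findall(cleaned) if tok not in STOPWORDS]


def top_terms(posts, n=3):
    """Return the n most frequent tokens across all posts as (token, count) pairs."""
    counts = Counter(tok for post in posts for tok in preprocess(post))
    return counts.most_common(n)


# Made-up example posts, for illustration only (not data from the project).
posts = [
    "The battery on my iPhone X drains so fast @apple https://example.com/x",
    "Galaxy S8 battery is great, the screen cracked though",
    "Face ID fails and the battery is weak",
]

if __name__ == "__main__":
    print(preprocess(posts[0]))
    print(top_terms(posts, n=1))
```

Token lists of this kind are the usual input to topic modeling or classification; the actual cleaning rules used in the notebooks may well differ.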

Members