[MAP540] Cap Gemini
MAP540 - group 7
This repository contains all code snippets related to our solution to the datacamp.
The aim was to extract the main pain points expressed by smartphone owners from a broad range of structured and unstructured data sources (Twitter, Reddit, boards, marketplaces). The scope was limited to the iPhone X and the Samsung Galaxy S8.
The notebooks folder contains our analysis notebooks. We dealt with text preprocessing, topic modeling and semi-supervised classification.
The scrapers folder contains the Scrapy spiders we created to acquire the data.
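To give a rough idea of what the text preprocessing step can look like, here is a minimal sketch using only the Python standard library. It is an illustration, not the code from our notebooks: the `preprocess` function, the regular expressions and the tiny `STOP_WORDS` set are all assumptions made for this example.

```python
import re

# Hypothetical, deliberately tiny stop-word list (assumption for this sketch);
# a real pipeline would rely on a much fuller list.
STOP_WORDS = {"the", "a", "an", "is", "my", "and", "of", "to", "it", "on", "so"}

URL_RE = re.compile(r"https?://\S+")   # links, common in tweets and Reddit posts
MENTION_RE = re.compile(r"@\w+")       # @user mentions
TOKEN_RE = re.compile(r"[a-z0-9]+")    # lowercase alphanumeric tokens


def preprocess(text):
    """Lowercase a post, drop URLs and @mentions, and return non-stop-word tokens."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return [tok for tok in TOKEN_RE.findall(text) if tok not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess("@AppleSupport The battery of my iPhone X drains so fast! https://t.co/xyz"))
```

The resulting token lists are the kind of input that a topic model or a classifier would typically consume after vectorization.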