This project is part of the Udacity Machine Learning Nanodegree. It uses a scikit-learn (sklearn) estimator for plagiarism detection.
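As a rough illustration of the idea, the sketch below turns text overlap into numeric features and fits an sklearn estimator on them. Everything here is an assumption made for illustration: the `containment` helper, the choice of n-gram overlap as a feature, the use of `LogisticRegression`, and the toy texts and labels are hypothetical and not taken from this project's code or data.

```python
from collections import Counter

from sklearn.linear_model import LogisticRegression


def ngram_counts(text, n):
    """Count word n-grams in a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def containment(answer, source, n=1):
    """Fraction of the answer's n-grams that also appear in the source."""
    answer_counts = ngram_counts(answer, n)
    source_counts = ngram_counts(source, n)
    total = sum(answer_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, source_counts[gram])
                  for gram, count in answer_counts.items())
    return overlap / total


# Toy, made-up data: one source text and a few answers (1 = plagiarized).
source = "inheritance lets a class reuse the behavior of another class"
answers = [
    "inheritance lets a class reuse the behavior of another class",
    "inheritance lets a class reuse behavior of a parent class",
    "objects can share code when one type extends a different type",
    "polymorphism means one interface with many implementations",
]
labels = [1, 1, 0, 0]

# Two features per answer: unigram and bigram containment.
X = [[containment(a, source, 1), containment(a, source, 2)] for a in answers]

clf = LogisticRegression()
clf.fit(X, labels)
predictions = clf.predict(X)
```

Any sklearn classifier with `fit` and `predict` could be swapped in for `LogisticRegression`; with real data, predictions should be evaluated on a held-out test set rather than on the training answers as done here.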