项目作者: alfafimel

项目描述 :
IMPLEMENTATION OF KNN & NAIVE BAYES CLASSIFIERS: This week's project requires us to implement a K-nearest neighbor (kNN) classifier and a Naive Bayes classifier.
高级语言: Jupyter Notebook
项目地址: git://github.com/alfafimel/IPWK9-CORE.git
创建时间: 2020-10-07T10:37:19Z
项目社区:https://github.com/alfafimel/IPWK9-CORE

开源协议:MIT License

下载


IMPLEMENTATION OF KNN & NAIVE BAYES CLASSIFIERS

This week’s project requires us to implement a K-nearest neighbor (kNN) classifier and a Naive Bayes classifier. Once we conduct the experiments, we will calculate the resulting metrics:

Download the two datasets from the given links:

Dataset 1 Source:

[Train Dataset Source: Link (https://archive.org/download/train5_202002/train%20%285%29.csv),

Test Dataset Source: Link (https://archive.org/download/test1_202002/test%20%281%29.csv)]

Dataset 2 Source:

[Link (https://archive.ics.uci.edu/ml/datasets/Spambase)]

Experimental Procedure:

  1. Randomly partition each dataset into two parts i.e 80 - 20 sets.

  2. For dataset 1, because we don’t have the label for the test set, we will use the train set to create train and test data (i.e. splitting further), then perform K-nearest neighbor classification.

  3. For dataset 2, perform classification of the testing set samples using the Naive Bayes Classifier.

  4. Compute the accuracy (percentage of correct classification).

  5. Report the confusion matrix of each classifier.

  6. Repeat step 1 to step 3 twice, each time splitting the datasets differently i.e. 70-30, 60-40, then note the outcomes of your modeling.

  7. Suggest and apply at least one of the optimization techniques that you learned earlier this week.

  8. Provide further recommendations to improve both classifiers.

Create a notebook for each project.

License

MIT