Author: faizaladhitama

Description:
Hierarchical Multi Label Hate Speech and Abusive Language Classification
Language: Python
Repository: git://github.com/faizaladhitama/Hierarchical-Multi-Label-Classification-API.git
Created: 2019-09-16T06:40:06Z
Project page: https://github.com/faizaladhitama/Hierarchical-Multi-Label-Classification-API

License: GNU General Public License v3.0


Hierarchical-Multi-Label-Classification-API

This API takes Indonesian-language text and classifies it as hate speech and/or abusive language. It supports only Indonesian because the model was trained solely on Indonesian data. If a text is classified as hate speech, the API also returns the hate speech characteristics of that text (the paper cited below describes these as target, category, and level).
This API is a result of my paper. To cite it, copy the .bib file I uploaded to the repository or the entry below.

@INPROCEEDINGS{Prab1909:Hierarchical,
AUTHOR="Faizal Adhitama Prabowo and Muhammad Okky Ibrohim and Indra Budi",
TITLE="Hierarchical Multi-label Classification to Identify Hate Speech and Abusive
Language on Indonesian Twitter",
BOOKTITLE="2019 6th International Conference on Information Technology, Computer and
Electrical Engineering (ICITACEE) (2019 6th ICITACEE)",
ADDRESS=", Indonesia",
DAYS=25,
MONTH=sep,
YEAR=2019,
KEYWORDS="Hate Seech; Multi-Label Text Classification; Hierarchical Classification;
Machine Learning; RFDT; NB; SVM",
ABSTRACT="Hate speech is one type of speech whose spread is banned in public spaces
such as social media. Twitter is one of the social media used by some
people to broadcast hate speech. The hate speech can be specified based on
the target, category, and level. This paper discusses multi-label text
classification using a hierarchical approach to identify targets, groups,
and levels of speech hate on Indonesian-language Twitter. Identification is
completed using classification algorithms such as the Random Forest
Decision Tree (RFDT), Naïve Bayes (NB), and Support Vector Machine (SVM).
The feature extraction used for classification is the term frequency
feature such as word n-gram and character n-gram. This research conducted
five scenarios with different label hierarchy to find the highest accuracy
that can possibly be reached by hierarchical classification. The
experimental results show that the hierarchical approach with the SVM
algorithm and word uni-gram feature has an accuracy of 68.43\%. It proved
that the hierarchical algorithm can increase data transformation or flat
approach."
}

Example:

Send a JSON body to the API with a text parameter whose value is either a single text or an array of texts.
[Image: example request]
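
The request described above can be sketched in Python roughly as follows. This is a minimal sketch, not the project's own client: the endpoint URL, the route name `/predict`, the helper `build_request`, and the sample sentences are all assumptions made for illustration, since this README does not state them. Only the `text` field, holding either one string or an array of strings, comes from the description above.

```python
import json
from urllib import request

# Hypothetical endpoint: the README does not state the real URL or route.
API_URL = "http://localhost:5000/predict"


def build_request(text, url=API_URL):
    """Build a POST request whose JSON body has a single "text" field.

    `text` may be one string or a list of strings, as the README describes.
    """
    if isinstance(text, str):
        payload = {"text": text}
    else:
        payload = {"text": list(text)}
    body = json.dumps(payload).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# One text, and a batch of texts (placeholder Indonesian sentences).
single = build_request("contoh kalimat")
batch = build_request(["kalimat pertama", "kalimat kedua"])

# Sending requires a running server at API_URL, so it is left commented out:
# with request.urlopen(single) as resp:
#     print(json.loads(resp.read().decode("utf-8")))
```

The response format is not documented here, so the sketch stops at building the request and leaves parsing the result to the caller.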

Contact

If you have any questions, please email me at faizaladhitamaprabowo@gmail.com