INDONESIAN TWITTER HATE SPEECH AND ABUSIVE LANGUAGE DETECTION: METHODS AND ANALYSIS

Ana’llhaqq Suryaningprang; Arif Kurniawan; Muhammad Syaifudin Tamami; Muhamad Tuhfatur Roziqin; Anas Nasrullah; Irwan Siswanto

doi:10.2015/cyfors.31.2026

Authors

Ana’llhaqq Suryaningprang
Arif Kurniawan
Muhammad Syaifudin Tamami
Muhamad Tuhfatur Roziqin
Anas Nasrullah
Irwan Siswanto

DOI:

https://doi.org/10.2015/cyfors.31.2026

Keywords:

Hate speech, abusive language, multi-label classification, Indonesian Twitter, machine learning, feature extraction

Abstract

This study analyzes hate speech and abusive language on Indonesian Twitter using a multi-label classification approach. The dataset was cleaned and labeled to categorize various forms of hate speech and abusive language. We applied machine learning algorithms, including Support Vector Machine (SVM), Naive Bayes (NB), and Random Forest Decision Tree (RFDT), combined with Binary Relevance (BR), Label Power-set (LP), and Classifier Chains (CC) as problem transformation methods. Our results indicate that RFDT with the LP transformation achieves the highest accuracy. The paper also underscores the role of text normalization and feature extraction in improving classification performance and discusses the importance of comprehensive annotation guidelines. The findings provide a foundation for future research in hate speech detection and highlight areas for improvement in data annotation and algorithm selection.
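The three transformation methods named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the label names, the toy label matrix, and the helper functions below are hypothetical and chosen only to show how each method reshapes a multi-label target before a classifier such as SVM, NB, or RFDT is trained.

```python
from typing import Dict, List, Tuple

# Hypothetical label names and toy label matrix (one row per tweet).
# These are illustrative assumptions, not data from the study.
LABELS = ["hate_speech", "abusive"]
Y = [
    [1, 1],
    [0, 1],
    [0, 0],
    [1, 1],
]


def binary_relevance(y: List[List[int]], names: List[str]) -> Dict[str, List[int]]:
    """BR: split the label matrix into one independent binary target per label."""
    return {name: [row[j] for row in y] for j, name in enumerate(names)}


def label_powerset(y: List[List[int]]) -> Tuple[List[int], Dict[Tuple[int, ...], int]]:
    """LP: map each distinct label combination to a single multi-class id."""
    mapping: Dict[Tuple[int, ...], int] = {}
    classes: List[int] = []
    for row in y:
        key = tuple(row)
        if key not in mapping:
            mapping[key] = len(mapping)  # ids assigned in order of first appearance
        classes.append(mapping[key])
    return classes, mapping


def chain_inputs(x: List[List[float]], y: List[List[int]], step: int) -> List[List[float]]:
    """CC (training time): the classifier at position `step` sees the original
    features plus the true values of all labels earlier in the chain."""
    return [list(xi) + [float(v) for v in yi[:step]] for xi, yi in zip(x, y)]
```

In this sketch, BR yields one binary problem per label, LP yields a single multi-class problem whose classes are the label combinations observed in the data, and CC yields a sequence of binary problems in which each classifier also receives the earlier labels as extra features.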
