Predicting the victims of hate speech on microblogging platforms
Published in Heliyon, 2024
Recommended citation: Sahrish Khan, Rabeeh Abbasi, Muddassar Sindhu, Sachi Arafat, Akmal Khattak, Ali Daud, Mubashar Mushtaq, "Predicting the victims of hate speech on microblogging platforms." Heliyon, 2024. https://www.sciencedirect.com/science/article/pii/S240584402416642X
Hate speech is a major problem on microblogging platforms, and its automatic detection is a growing research area. Most existing work focuses on analyzing the content of social media posts. Our study shifts the focus to predicting which users are likely to become targets of hate speech. This paper proposes a novel Hate-speech Target Prediction Framework (HTPK) and introduces a new Hate Speech Target Dataset (HSTD), which contains tweets labeled for targets and non-targets of hate speech. Using a combination of Term Frequency-Inverse Document Frequency (TFIDF), N-grams, and Part-of-Speech (PoS) tags as features, we tested various machine learning algorithms. The Naïve Bayes (NB) classifier performed best, with an accuracy of 93%, significantly outperforming the other algorithms. This research identifies the optimal combination of features for predicting hate speech targets and compares various machine learning algorithms, providing a foundation for more proactive hate speech mitigation on social media platforms.
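The feature-and-classifier combination named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, covers word N-grams with TFIDF weighting and a multinomial Naïve Bayes classifier, and leaves out PoS tags (which would need an external tagger). The function names, the smoothed IDF variant, the toy texts, and the label strings `target` and `non-target` are all assumptions made for illustration and are not drawn from HSTD.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Return word n-grams (n = 1..n_max) of a whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def fit_idf(docs):
    """Smoothed inverse document frequency for every n-gram in the corpus."""
    df = Counter(term for doc in docs for term in set(doc))
    n_docs = len(docs)
    return {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}


def tfidf(doc, idf):
    """TFIDF weights for one document; terms unseen in training are dropped."""
    tf = Counter(t for t in doc if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}


def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes over TFIDF weights with additive smoothing."""
    class_weights = defaultdict(Counter)
    for vec, y in zip(vectors, labels):
        class_weights[y].update(vec)  # accumulate per-class term weights
    model = {}
    for y, weights in class_weights.items():
        total = sum(weights.values()) + alpha * len(vocab)
        log_prior = math.log(labels.count(y) / len(labels))
        log_like = {t: math.log((weights[t] + alpha) / total) for t in vocab}
        model[y] = (log_prior, log_like)
    return model


def predict(vec, model):
    """Return the label with the highest posterior log-score."""
    scores = {y: log_prior + sum(w * log_like[t] for t, w in vec.items())
              for y, (log_prior, log_like) in model.items()}
    return max(scores, key=scores.get)


# Hypothetical toy data, invented for this sketch (not taken from HSTD).
train_texts = [
    "you are a disgrace go away",
    "people like you ruin everything",
    "lovely weather in the park today",
    "thanks for sharing this great recipe",
]
train_labels = ["target", "target", "non-target", "non-target"]

train_docs = [ngrams(t) for t in train_texts]
idf = fit_idf(train_docs)
model = train_nb([tfidf(d, idf) for d in train_docs], train_labels, set(idf))


def classify(text):
    """Featurize a new text and predict its label with the trained model."""
    return predict(tfidf(ngrams(text), idf), model)


print(classify("you ruin everything"))
```

Adding PoS features in this sketch would amount to appending tag N-grams to the list returned by `ngrams` before weighting; which tagger and tag set to use is left open here.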