Computer Science and Information Technology Vol. 13(3), pp. 39 - 48
DOI: 10.13189/csit.2025.130301
Comparative Analysis of Word Embedding Techniques Integrated with Machine Learning for Detecting Offensive Comments on Social Media
Victor Thomas Emmah*, Chizi Michael Ajoku, Ibiere Boma Cookey
Department of Computer Science, Rivers State University, Nigeria
ABSTRACT
The rapid growth of social media has enabled global communication and greatly increased the volume of user-generated content. This, in turn, has amplified the spread of offensive, hateful, and abusive language online. The growing prevalence of offensive comments has created a need for robust, automated moderation systems that can detect such comments, a capability crucial to maintaining healthy online discourse. This paper compares word embedding-based approaches for detecting offensive content, using deep learning techniques to improve classification accuracy. It examines how effective pre-trained word embeddings such as GloVe and Word2Vec are for training deep learning models, including Simple Neural Networks (NN), Long Short-Term Memory (LSTM) networks, and Convolutional Neural Networks (CNN). The models are trained on a dataset of tweets, with preprocessing techniques such as tokenization, normalization, and data balancing applied to improve performance. Experimental results indicate that the CNN outperformed the simple NN and the LSTM on both embedding models, achieving accuracies of 65% with Word2Vec and 84.7% with GloVe, thereby demonstrating its capability to capture linguistic patterns associated with offensive language. These results show that word embeddings that capture semantic information, when integrated with deep learning models, outperform traditional vectorization techniques in identifying offensive language, with higher accuracy and recall. This highlights the significance of combining high-quality word embeddings with deep learning architectures for effective social media moderation.
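As a rough illustration of the kind of pipeline the abstract describes, the sketch below normalizes and tokenizes a tweet, looks up word vectors, and applies a single convolutional filter followed by global max pooling, which is the core operation of a CNN text classifier. It is not the authors' implementation: the three-dimensional embedding table, the function names, and the filter weights are all hypothetical, and a real system would load pre-trained GloVe or Word2Vec vectors and learn many filters during training.

```python
import re

# Hypothetical 3-dimensional embedding table for illustration only.
# A real system would load pre-trained GloVe or Word2Vec vectors instead.
EMBEDDINGS = {
    "you": [0.1, 0.0, 0.2],
    "are": [0.0, 0.1, 0.0],
    "awful": [0.9, 0.8, 0.1],
    "great": [-0.7, 0.6, 0.0],
}
UNK = [0.0, 0.0, 0.0]  # vector used for out-of-vocabulary tokens


def normalize_and_tokenize(tweet):
    """Lowercase the tweet, drop URLs and @mentions, and split into word tokens."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    return re.findall(r"[a-z']+", text)


def embed(tokens):
    """Map each token to its embedding vector, falling back to UNK."""
    return [EMBEDDINGS.get(token, UNK) for token in tokens]


def conv1d_max_pool(vectors, kernel, bias=0.0):
    """Slide one filter over windows of consecutive word vectors,
    apply ReLU to each window's response, and return the global maximum."""
    width = len(kernel)
    activations = []
    for start in range(len(vectors) - width + 1):
        total = bias
        for offset in range(width):
            weights = kernel[offset]
            vector = vectors[start + offset]
            total += sum(w * x for w, x in zip(weights, vector))
        activations.append(max(0.0, total))  # ReLU
    # Inputs shorter than the filter width yield no windows.
    return max(activations) if activations else 0.0


# Hypothetical filter spanning two words; it responds to the second word's
# first two embedding dimensions.
KERNEL = [[0.0, 0.0, 0.0], [1.0, 1.0, 0.0]]

tokens = normalize_and_tokenize("@user You are AWFUL http://t.co/x")
feature = conv1d_max_pool(embed(tokens), KERNEL)
```

In a full classifier, many such pooled features would be concatenated and passed to a dense layer with a sigmoid or softmax output; that stage, and the training loop that learns the filter weights, are omitted here.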
KEYWORDS
Word Embedding, Long Short-Term Memory, Convolutional Neural Network, Social Media
Cite This Paper in IEEE or APA Citation Styles
(a). IEEE Format:
[1] Victor Thomas Emmah, Chizi Michael Ajoku, Ibiere Boma Cookey, "Comparative Analysis of Word Embedding Techniques Integrated with Machine Learning for Detecting Offensive Comments on Social Media," Computer Science and Information Technology, Vol. 13, No. 3, pp. 39-48, 2025. DOI: 10.13189/csit.2025.130301.
(b). APA Format:
Victor Thomas Emmah, Chizi Michael Ajoku, Ibiere Boma Cookey (2025). Comparative Analysis of Word Embedding Techniques Integrated with Machine Learning for Detecting Offensive Comments on Social Media. Computer Science and Information Technology, 13(3), 39-48. DOI: 10.13189/csit.2025.130301.