A MULTIMODAL APPROACH TO EMOTION, HATE SPEECH, SARCASM, AND SLANG DETECTION IN SOCIAL MEDIA TEXT by IRJET Journal

International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 11 Issue: 04 | Apr 2024

p-ISSN: 2395-0072

www.irjet.net

Nikesh Malik1, Akash Jayaprasad Nair2, Ayush Radheshyam Prajapati3 and Sheetal Shimpikar4

1 B.Tech, Computer Engineering, Pillai College Of Engineering, New Panvel, Maharashtra, India
2 B.Tech, Computer Engineering, Pillai College Of Engineering, New Panvel, Maharashtra, India
3 B.Tech, Computer Engineering, Pillai College Of Engineering, New Panvel, Maharashtra, India
4 Assistant Professor, Department of Computer Engineering, Pillai College Of Engineering, New Panvel, Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------

Abstract - The rapid growth of social media has led to an increase in online hate speech and targeted harassment. This paper presents a hybrid approach for sentiment analysis and hate speech detection using the BERT (Bidirectional Encoder Representations from Transformers) model. The proposed system utilizes natural language processing techniques to analyze text-based social media posts and identify content containing hate speech or targeted harassment. A combination of supervised and unsupervised machine learning methods is employed, along with emotion-based analysis using the EmoBERT model. The system is trained on large datasets of labeled social media posts and evaluated using metrics such as accuracy, precision, recall, and F1 score. Results demonstrate the effectiveness of the hybrid approach, with the hate speech detection model achieving an accuracy of 88% and the emotion classification model reaching 91% accuracy. The proposed system has potential applications in online content moderation, policy enforcement, and cyberbullying prevention. Future work includes enhancing the model architecture, evaluating performance on diverse datasets, and exploring commercial deployment strategies.

Keywords: Sentiment Analysis, Hate Speech Detection, BERT, EmoBERT, Natural Language Processing, Machine Learning, Social Media, Content Moderation

1. INTRODUCTION

1.1 Background and motivation for text classification in social media

The exponential growth of social media platforms has transformed the way people communicate, share information, and express opinions online. However, this ubiquity has also led to the proliferation of harmful content, such as hate speech, targeted harassment, and offensive language. The sheer volume of user-generated content necessitates the development of automated tools to identify and moderate such content effectively. Text classification techniques, particularly those based on deep learning, have shown promising results in addressing this challenge.

1.2 Objectives and scope of the research

The primary objective of this research is to develop multimodal text classification models using BERT for detecting emotions, hate speech, sarcasm, and slang in social media content. By leveraging the power of transfer learning and the contextual understanding capabilities of BERT, we aim to create robust classifiers that can accurately categorize text into predefined classes. The scope of this research encompasses the collection and preprocessing of diverse datasets, fine-tuning of pretrained BERT models, and extensive evaluation of the developed classifiers.

1.3 Significance and contributions to the field

This research contributes to the field of natural language processing and social media analysis by demonstrating the effectiveness of BERT-based models for multimodal text classification tasks. The developed classifiers have significant implications for content moderation, sentiment analysis, and user engagement on social media platforms. By accurately identifying and classifying harmful content, these models can help create safer and more inclusive online environments. Furthermore, the ability to detect sarcasm and slang usage can enhance the understanding of user sentiment and facilitate more effective communication strategies.

2. RELATED WORK

2.1 BERT-based models for text classification tasks

BERT, introduced by Devlin et al. (2018), has revolutionized the field of natural language processing. Its bidirectional architecture and pre-training on large-scale unlabeled data have enabled it to achieve state-of-the-art performance on various text classification tasks. Numerous studies have explored the application of BERT

Impact Factor value: 8.226

ISO 9001:2008 Certified Journal
