THE MACHINE LEARNING-BASED ANALYSIS OF SENTIMENT IDENTIFICATION FOR POSTS by IRJET Journal

International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 11 Issue: 03 | Mar 2024

p-ISSN: 2395-0072

www.irjet.net

THE MACHINE LEARNING-BASED ANALYSIS OF SENTIMENT IDENTIFICATION FOR POSTS

Savita Tripathi¹, Mr. Sambhav Agarwal²

¹M.Tech, Computer Science and Engineering, SR Institute of Management & Technology, Lucknow, India

²Assistant Professor, Computer Science and Engineering, SR Institute of Management & Technology, Lucknow

---------------------------------------------------------------------***---------------------------------------------------------------------

Abstract - Sentiment analysis is a field of research that studies people's opinions about products, social and political events, and other issues. It has become increasingly popular because it helps stakeholders make better decisions based on public opinion. Opinion mining gathers this information from sources such as search engines, web blogs, Twitter, and other social networks. However, because so many tweets are available online as unstructured text, analyzing them manually is difficult. To address this problem, researchers use computational strategies that identify sentiment-bearing words in the text. Many such methods apply machine learning to representations such as Bag-of-Words (BoW). This study uses a lexicon-based approach to automatically identify sentiment in tweets collected from Twitter's public domain. It also applies three machine learning algorithms, Naive Bayes (NB), Maximum Entropy (ME), and Support Vector Machines (SVM), to determine which classifies the tweets by sentiment most effectively. The experiments showed that both NB with Laplace smoothing and SVM were effective classifiers when using features such as unigrams or Part-of-Speech (POS) tags. Overall, sentiment analysis is an important tool for understanding public opinion on various topics through user-generated content on platforms like Twitter.

Key Words: Bag-of-Words (BoW), Lexicon, Machine Learning Algorithms, Laplace Smoothing, Part-of-Speech (POS).

1. INTRODUCTION

The process of using Twitter for automatic sentiment identification involves a series of steps. The first is to gather a large dataset of tweets through Twitter's API, focusing on specific keywords, hashtags, or timelines of interest. The collected data is then preprocessed: noise such as URLs, special characters, and stopwords is removed, and the text is tokenized and normalized. Next comes feature extraction, in which relevant features such as bag-of-words representations, TF-IDF scores, or embeddings like Word2Vec are derived from the preprocessed text. The next step is to select an appropriate model for sentiment analysis. Candidates range from traditional machine learning algorithms like Naive Bayes and Support Vector Machines to deep learning models like Recurrent Neural Networks (RNNs) or Transformer-based architectures such as BERT. Training the selected model involves dividing the data into training and testing sets and then fine-tuning and optimizing the model to improve performance. Evaluation metrics such as accuracy, precision, and recall are used to determine the effectiveness of the model.

Once satisfactory performance is achieved, the model can be deployed for real-time sentiment analysis, either through API integration or as a web application. Ongoing monitoring and maintenance are essential to ensure that the model remains accurate and up to date with evolving language patterns and sentiments on Twitter. Moreover, ethical considerations must be taken into account throughout development and deployment to protect privacy and mitigate bias. With this systematic approach, Twitter becomes an invaluable resource for applications such as market research, brand monitoring, and social media analytics.

© 2024, IRJET

Impact Factor value: 8.226

Figure-1: Sentiment Identification of Social Media Post.
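As a minimal sketch of the pipeline outlined in the introduction (preprocessing, bag-of-words feature extraction, model training, and evaluation), the following Python example trains a unigram Naive Bayes classifier with Laplace (add-one) smoothing on a handful of made-up tweets. It is an illustration under stated assumptions rather than the implementation used in this study: the stopword list, the toy tweets, the "pos"/"neg" labels, and all function names are hypothetical, and only the Python standard library is used.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption; real systems use larger lists).
STOPWORDS = {"a", "an", "the", "is", "it", "this", "to", "and", "of", "i", "my"}

def preprocess(tweet):
    """Lowercase, strip URLs and special characters, tokenize, drop stopwords."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)      # remove digits, punctuation, symbols
    return [tok for tok in text.split() if tok not in STOPWORDS]

def train_naive_bayes(docs, labels):
    """Collect class counts and per-class unigram (bag-of-words) counts."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict(tokens, model):
    """Return the class with the highest Laplace-smoothed log posterior."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def evaluate(y_true, y_pred, positive="pos"):
    """Compute accuracy, plus precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical toy data standing in for tweets collected through an API.
train_tweets = [
    ("I love this phone, the camera is great! https://example.com/a", "pos"),
    ("Great battery and a great screen", "pos"),
    ("I hate this phone, terrible battery", "neg"),
    ("The screen is terrible and the camera is awful", "neg"),
]
test_tweets = [
    ("Love the great camera", "pos"),
    ("Awful phone, I hate it", "neg"),
]

model = train_naive_bayes(
    [preprocess(text) for text, _ in train_tweets],
    [label for _, label in train_tweets],
)
predictions = [predict(preprocess(text), model) for text, _ in test_tweets]
metrics = evaluate([label for _, label in test_tweets], predictions)
print(predictions, metrics)
```

Adding one to every word count keeps a single unseen word from driving a class probability to zero, which is why Laplace smoothing matters for short texts such as tweets. With a realistic corpus, the same structure would apply, with a held-out test split in place of the two hand-written test tweets.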

1.1. Purpose of Sentiment Identification

Sentiment identification is a crucial tool for analyzing Twitter data because it serves multiple purposes. Firstly, it provides insight into consumer perceptions, which companies need in order to understand customer sentiment towards their products or services. By discerning positive, negative, or neutral sentiments from tweets, businesses can tailor their strategies to meet customer needs effectively. This can lead to improved customer satisfaction and loyalty.

ISO 9001:2008 Certified Journal

Page 79