
International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 12 Issue: 04 | Apr 2025

p-ISSN: 2395-0072

www.irjet.net

Design an Approach for Prediction of Popular Movies Based on Previous Reviews by Using Bagging & Boosting Techniques

Bharti Singh1, Prof. Manish Rohila2

1 MTech Scholar, Department of Computer Science & Engineering, Technocrats Institute of Technology (Excellence), Bhopal (M.P), India, Email: bhartisingh15193@gmail.com
2 Assistant Professor, Department of CSE, Technocrats Institute of Technology (Excellence), Bhopal (M.P), India

ABSTRACT

The authors explain their views with the help of experimental events and setup. The dataset selected for this work consists of movie reviews given by, or collected from, the people concerned. The data is taken from the nltk.corpus package, which defines a collection of corpus reader classes that can be used to access the contents of a diverse set of corpora; each corpus reader class is specialized to handle a specific corpus format. The selected dataset is the one used in the earlier work that we took as the base reference for our research. It has two categories, negative and positive, and each category contains 1000 files. The files contain movie reviews by different reviewers. From this dataset we need to extract sentiment from the given text documents. We extend the work of our base papers: we apply bagging and boosting in a sentiment analysis tool, which is basically a way to figure out whether a piece of text expresses positive, negative, or neutral emotion. VADER is based on the bag-of-words approach. The authors use bagging and boosting, which come under ensemble techniques. This approach improves performance by 1% in voted algorithms.
Keywords: Sentiment Analysis, Stop Word, Tokens, Features, Training & Testing Data, Model or classifier, VADER, Ensemble Learning, Bagging, Boosting
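As a rough illustration of the bagging idea named in the abstract, the following minimal sketch trains several simple word-count learners on bootstrap samples and combines them by majority vote. It is not the authors' implementation: the toy reviews, the word-count base learner, and all function names are hypothetical stand-ins for the NLTK movie review data and the classifiers used in the paper, and boosting is not shown.

```python
import random
from collections import Counter

# Hypothetical toy data standing in for the labelled movie reviews.
TRAIN = [
    ("a wonderful and moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("wonderful cast , great fun", "pos"),
    ("a dull and boring film", "neg"),
    ("terrible acting and a boring story", "neg"),
    ("dull plot , terrible pacing", "neg"),
]


def train_word_counter(samples):
    """Base learner (assumed): count how often each word occurs per label."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in samples:
        counts[label].update(text.split())
    return counts


def predict_one(model, text):
    """Score a text by summed word counts per label; ties go to 'pos'."""
    words = text.split()
    pos = sum(model["pos"][w] for w in words)
    neg = sum(model["neg"][w] for w in words)
    return "pos" if pos >= neg else "neg"


def bagging_fit(samples, n_models=15, seed=0):
    """Train one base learner per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(train_word_counter(bootstrap))
    return models


def bagging_predict(models, text):
    """Combine the base learners by majority vote."""
    votes = Counter(predict_one(m, text) for m in models)
    return votes.most_common(1)[0][0]


models = bagging_fit(TRAIN)
print(bagging_predict(models, "wonderful great"))
```

Using an odd number of base learners avoids tied votes when there are two classes; the choice of 15 here is arbitrary.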

I INTRODUCTION

We all know that this is the digital decade: other media are far less popular than social media, and many big online companies depend on social media advertising. For example, Zomato is now a very big company in comparison with many FMCG companies. Looking more closely, Uber gives more opportunities to drivers who have received very good reviews from past customers. All of the above examples concern the sentiment or opinion of customers after they have availed services from providers in many different domains [3].

1.1 Machine Learning Raw Sketch

Figure 1: Machine Learning Raw Sketch [3] (blocks shown: Used Data Set, ML Algorithms, Models/Classifiers, New Data, Output)

In today's digital era, social media platforms play a vital role in transforming individuals' lives: people share information in the public domain and interact with other people and organizations. By collecting and analyzing this data and extracting insight from it, we can identify a person's behaviour and trends. Nowadays social media plays a critical role in analytics carried out with the help of data science. This requires a number of tools and techniques, because these tools let us find trends effectively [1]. Every company and firm wants reviews of its products to reach the market through print media and social media, because positive reviews can attract many new consumers without much marketing expenditure. Major FMCG companies depend on product feedback from the customers who purchase their products [2].

Impact Factor value: 8.315

In Figure 1, the authors illustrate the basic concept of machine learning. The first part is the dataset, where the relevant data is stored. Part of this data is then passed to the next stage, known as training, in which the data is given to an ML algorithm that creates a specific model or classifier. That model is then used to make predictions for new data samples given directly to it. Machine learning is broadly categorized as follows, based on the nature of the dataset: Labelled data: if labels are available in the dataset, any machine learning applied to that labelled data comes under supervised machine learning [4].
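The flow described above (labelled dataset, training, model, prediction on a new sample) can be sketched in a few lines. This is only an illustrative example under stated assumptions: the four labelled reviews are invented, and a small bag-of-words Naive Bayes classifier with add-one smoothing is used as a stand-in for whichever ML algorithm is chosen; it is not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical labelled dataset: each sample is (text, label).
LABELLED_DATA = [
    ("an excellent and enjoyable movie", "pos"),
    ("enjoyable story with excellent acting", "pos"),
    ("a weak and tedious movie", "neg"),
    ("tedious story with weak acting", "neg"),
]


def train(dataset):
    """Training stage: turn labelled data into a model (word and label counts)."""
    word_counts = {}
    label_counts = Counter()
    vocab = set()
    for text, label in dataset:
        label_counts[label] += 1
        words = text.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        vocab.update(words)
    return {"word_counts": word_counts, "label_counts": label_counts, "vocab": vocab}


def predict(model, text):
    """Prediction stage: return the most probable label for a new sample."""
    total_docs = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["label_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for word in text.lower().split():
            if word in model["vocab"]:  # words never seen in training are skipped
                # Add-one (Laplace) smoothed log likelihood.
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(LABELLED_DATA)                 # dataset -> ML algorithm -> model
print(predict(model, "excellent acting"))    # new data -> model -> output
```

Because every training sample carries a label, this is a supervised learning setup in the sense used above.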

| ISO 9001:2008 Certified Journal

Page 268