International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 11 Issue: 10 | Oct 2024
p-ISSN: 2395-0072
www.irjet.net
Amazon Fake or Spam Review Classification Using Short Text Processing Technique

Sunny M. Ramchandani¹, Dr. Hemant H. Patel²

¹PG Student, Department of Computer Science and Engineering, Dr. Subhash University, Junagadh, Gujarat, India.
²Associate Professor and Head, Department of Computer Science and Engineering, Dr. Subhash University, Junagadh, Gujarat, India.
Abstract - Text classification is a classical research domain utilized for various applications in education, medicine, and government. However, the traditional text classification task differs from the new-age text classification problem. In this work, text classification is investigated for classifying fake or spam reviews on an e-commerce platform. E-commerce product vendors sometimes use fake and partial reviews to boost sales of their own low-quality products. This practice wastes consumers' time and also harms the credibility of e-commerce. Therefore, identification and removal of such misleading reviews from the e-commerce platform is required. In this work, a text classification technique is used to classify spam reviews on an e-commerce platform. For this purpose, a text pre-processing technique is first used to clean the reviews; next, the word-to-vector (word2vec) technique is used to prepare the training and validation samples. The experiments use the Amazon product review dataset. This dataset contains reviews for different product categories; among them, the toys and games category is considered. To perform the classification task, two popular machine learning algorithms, namely Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM), have been used. After implementation, performance has been measured in terms of precision, recall, F-measure, and accuracy. Based on the experimental results, CNN provides 90% classification accuracy, higher than LSTM, which provides 89.31% accuracy. Additionally, the CNN is more efficient than the LSTM model in terms of training time. Therefore, the CNN model is recommended for future implementations.

Key Words: Deep Learning, E-commerce review, Machine Learning, small text classification, Text classification.

1. INTRODUCTION
Text analysis is a traditional domain of academic research. A number of applications have been developed using text analysis techniques, such as information retrieval (IR) and search engines [1]. However, with the growth of communication media, various new sources of text information have emerged, such as social media and e-commerce [2]. Both channels generate a significant amount of text, and analysis of this information can help in various real-world applications such as disaster management, social gatherings, and others [3]. However, the traditional text analysis task is different from social media or e-commerce-based text analysis. The amount of text in these types of data is smaller, but it has a high impact on social media and on e-commerce platforms [4].

In this work, sentiment analysis is performed on e-commerce platform data, specifically on reviews. The aim is to identify spam reviews that influence buyers' decisions [5], because e-commerce sellers sometimes use fake or spam reviews to lure buyers during a new product launch. Most of the time, a new buyer on an e-commerce platform relies on reviews to make a buying decision [6]. In this context, fake reviews can affect buying decisions. Therefore, it is essential to identify and remove such fake reviews using sentiment analysis. This section has given a basic overview of the proposed work. The next section includes the motivation of the proposed work.

2. PROPOSED WORK
In this section, the architecture of the proposed system is discussed. Additionally, the components of the proposed system are explained, along with the functional requirements of each component.

2.1 System overview
To understand the working of sentiment-based text classification techniques, a review was recently carried out. This review covers research articles on machine learning and text classification. The aim is to study the available techniques for classifying spam reviews on an e-commerce platform. Therefore, a detailed investigation of the collected research articles has been performed. Based on the analysis, it is recognized that traditional text classification techniques are different from sentiment-based text classification. In addition, sentiment-based text classification techniques require fewer features than traditional methods of text classification. It is therefore necessary to investigate why traditional text feature selection techniques are used in emotion classification.
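
The abstract reports precision, recall, F-measure, and accuracy for the CNN and LSTM classifiers. As a minimal sketch of how these four measures are conventionally computed for a binary spam/genuine task, the Python snippet below derives them from true and predicted labels. The function name `binary_metrics`, the `Metrics` container, and the label convention (1 = spam, 0 = genuine) are assumptions made for illustration and are not taken from the paper; the example labels are invented and are not the paper's results.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Metrics:
    """Container for the four measures named in the abstract (hypothetical helper)."""
    precision: float
    recall: float
    f_measure: float
    accuracy: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Metrics:
    """Compute precision, recall, F-measure (F1), and accuracy for binary labels.

    Assumption: `positive` marks the spam class; every other label is treated as genuine.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # spam flagged as spam
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # genuine flagged as spam
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # spam missed
    tn = sum(1 for t, p in pairs if t != positive and p != positive)  # genuine kept

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return Metrics(precision, recall, f_measure, accuracy)


# Invented toy labels, for illustration only (1 = spam, 0 = genuine).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
result = binary_metrics(y_true, y_pred)
print(result)
```

Computing the measures from the four confusion counts keeps the definitions explicit; in practice a library routine would normally be used instead, and whichever class is treated as positive should be stated alongside the reported numbers.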
© 2024, IRJET | Impact Factor value: 8.315 | ISO 9001:2008 Certified Journal | Page 473