A Novel Preprocessing Method for Web Usage Mining based on Hierarchical Clustering

International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 04 Issue: 04 | Apr-2017

p-ISSN: 2395-0072

www.irjet.net

R. Padmapriya¹

¹Research Scholar, Department of Computer Science, Rathnavel Subramaniam College of Arts & Science, Sulur

Dr. D. Maheswari²

²Assistant Professor & HoD, Research School of Computer Studies, Rathnavel Subramaniam College of Arts & Science, Sulur

---------------------------------------------------------------------***---------------------------------------------------------------------

Abstract

Web Usage Mining is the process of applying data mining techniques to extract usage patterns from Web log file data. Web usage mining has three phases: preprocessing, pattern discovery, and pattern analysis. Several preprocessing tasks must be performed before data mining algorithms can be applied to the data collected from server logs. This serves to determine the value of specific clients, cross-marketing strategies across products, the effectiveness of promotional efforts, and so on. Data preprocessing is a data mining technique that transforms raw data into an understandable format. It is important for ensuring the effectiveness of web log mining, and the result of preprocessing directly influences the choice of mining algorithm. This research discusses data preprocessing algorithms in the context of database-driven applications, such as customer relationship management, and rule-based applications. The preprocessed Web log file is then suitable for the discovery and analysis of useful information, which is referred to as web mining. This research summarizes efficient and complete preprocessing results that are needed before actual mining can be performed.

Keywords: Web usage mining, Data preprocessing, Framework, PSO with Hierarchical clustering

1. INTRODUCTION

Owing to the increasing popularity of e-commerce in our daily lives, credit card usage has increased over the years. Web usage mining, a technique of web mining, is used in various applications that examine user navigation patterns. Preprocessing performs a series of operations on the web log file, covering data cleaning, user identification, session identification, path completion, and transaction identification. This process involves logging the data, performing accuracy checks, combining data from disparate sources, transforming the data into a session file, and finally structuring the data according to the input requirements. The preprocessing phase is a set of interconnected, coherent, and integrated techniques, employed in succession to create clean and clear results.

© 2017, IRJET | Impact Factor value: 5.181

Data preprocessing is a necessary and important phase in web usage mining. The web log file is the data source for the data preprocessing method. The aim of data cleaning is to remove irrelevant items. The task of user identification is to determine who accesses the web site and which pages they access. Current research focuses on two data preprocessing methods: data cleaning and user identification. Various techniques have been proposed for data cleaning, but problems remain in data collection and in the accuracy of user identification. This paper offers a review of the algorithms and techniques used in data preprocessing for web usage mining. Data preprocessing cleans the data before it is passed to pattern discovery, where a suitable technique is used to discover users' navigational patterns; the output is then passed to pattern analysis, which retains only the relevant patterns and removes the irrelevant ones.

Web usage mining refers to the automatic discovery and analysis of patterns in clickstream and related data collected or generated as a result of user interactions with Web resources on one or more Web sites. Its main aim is to capture, model, and analyze the behavioral patterns and profiles of users interacting with a Web site. The discovered patterns are normally represented as collections of pages, objects, or resources that are frequently accessed by groups of users with common needs or interests. Following the standard data mining process, the overall Web usage mining process can be divided into three interdependent stages: data collection and preprocessing, pattern discovery, and pattern analysis.

The remainder of this paper is organized as follows. Section II presents the literature survey, Section III discusses the preprocessing methodology, and Section IV reports the experiments and the results achieved. Finally, Section V concludes the work.
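To make the data cleaning and user identification steps described above more concrete, the following is a minimal sketch in Python. It is not the method proposed in this paper. It assumes that the server writes the Apache/NCSA Combined Log Format (the text does not name a log format), that "irrelevant items" means embedded resources such as images and style sheets, failed requests, and robot traffic, and that a user can be approximated by the pair of IP address and user agent. The names `LOG_PATTERN`, `IRRELEVANT_SUFFIXES`, `ROBOT_MARKERS`, `parse_line`, `is_relevant`, `identify_users`, and `preprocess`, as well as the sample log lines, are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Assumption: entries follow the Apache/NCSA Combined Log Format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical cleaning rules; a real site would tune both lists.
IRRELEVANT_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")
ROBOT_MARKERS = ("bot", "crawler", "spider")


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it is malformed."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def is_relevant(entry):
    """Data cleaning: keep only successful page requests made by non-robots."""
    path = entry["url"].split("?", 1)[0].lower()
    if path.endswith(IRRELEVANT_SUFFIXES):
        return False  # embedded resource, not an explicit page request
    if not entry["status"].startswith("2"):
        return False  # failed or redirected request
    agent = entry["agent"].lower()
    return not any(marker in agent for marker in ROBOT_MARKERS)


def identify_users(entries):
    """User identification: treat each (IP, user agent) pair as one user."""
    pages_by_user = defaultdict(list)
    for entry in entries:
        pages_by_user[(entry["ip"], entry["agent"])].append(entry["url"])
    return dict(pages_by_user)


def preprocess(lines):
    """Parse, clean, and group raw log lines by identified user."""
    parsed = (parse_line(line) for line in lines)
    cleaned = [e for e in parsed if e is not None and is_relevant(e)]
    return identify_users(cleaned)


# Made-up sample entries, for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Apr/2017:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [10/Apr/2017:10:00:01 +0000] "GET /logo.png HTTP/1.1" 200 512 "/index.html" "Mozilla/5.0"',
    '10.0.0.1 - - [10/Apr/2017:10:01:00 +0000] "GET /products.html HTTP/1.1" 200 2048 "/index.html" "Mozilla/5.0"',
    '10.0.0.2 - - [10/Apr/2017:10:02:00 +0000] "GET /missing.html HTTP/1.1" 404 0 "-" "Mozilla/5.0"',
    '10.0.0.3 - - [10/Apr/2017:10:03:00 +0000] "GET /index.html HTTP/1.1" 200 1024 "-" "ExampleBot/1.0"',
    '10.0.0.1 - - [10/Apr/2017:10:04:00 +0000] "GET /index.html HTTP/1.1" 200 1024 "-" "Opera/9.80"',
]

users = preprocess(SAMPLE_LOG)
for (ip, agent), pages in users.items():
    print(ip, agent, pages)
```

In this sample, the image request, the 404 response, and the robot request are dropped during cleaning, and the two different user agents behind the same IP address are counted as two users. The IP-plus-agent heuristic is only one common approximation; it can still merge distinct users behind a proxy, which is one source of the user identification accuracy problem noted above.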

2. LITERATURE SURVEY

Different data mining techniques can be applied to web usage data to extract user access patterns, and this knowledge can be applied in a variety of applications such

ISO 9001:2008 Certified Journal | Page 1517
