PHISHING URL DETECTION USING MACHINE LEARNING


International Research Journal of Engineering and Technology (IRJET) Volume: 09 Issue: 11 | Nov 2022

www.irjet.net

e-ISSN: 2395-0056 p-ISSN: 2395-0072

1Prof. Pooja Parikh, 2Ketan Kokane, 3Shrikant Rathod, 4Mayur Mali, 5Harshal Pagare

ALARD COLLEGE OF ENGINEERING & MANAGEMENT (Alard Knowledge Park, Survey No. 50, Marunji, Near Rajiv Gandhi IT Park, Hinjewadi, Pune-411057) Approved by AICTE. Recognized by DTE. NAAC Accredited. Affiliated to SPPU (Pune University)

I. ABSTRACT:

For most phishing attacks, whether carried out by email or any other medium, the objective is to get the victim to follow a link that appears to lead to a legitimate web resource but actually redirects the victim to a malicious web page. The easiest way to carry out this operation is to construct a malicious URL and direct the user to the malicious page chosen by the attacker. This paper focuses on detecting phishing websites through URL detection. Previously, methods such as k-nearest neighbors, list-based approaches, fuzzy logic, mining, and classification approaches such as Phishzoo were used for detection, but as attacks grew stronger over time, more sophisticated algorithms and techniques were introduced to detect and prevent them.

Phishing is an illegal activity that uses a variety of deceptive methods to direct people to fraudulent websites. The purpose of these phishing websites is to obtain personal information and financial details for personal gain or abuse. As technology advances, the phishing approaches in use evolve as well, and there is an urgent need for stronger security and better mechanisms to prevent and detect them. The main focus of this paper is to introduce a model for detecting phishing websites using the URL detection method with a random forest algorithm. The model has three main phases: parsing, heuristic classification of data, and performance analysis. Each phase uses a different technique or algorithm to process the data for better results.
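To make the parsing phase concrete, the following is a minimal sketch of how a URL might be broken into lexical features before classification. The feature names and the function `extract_url_features` are hypothetical and chosen only for illustration; this section does not list the features the model actually uses. The sketch relies only on the Python standard library.

```python
import re
from urllib.parse import urlparse

# Matches a dotted-quad IPv4 host such as 192.168.0.1 (illustrative check only).
IPV4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def extract_url_features(url: str) -> dict:
    """Parse a URL and return a few lexical features.

    The feature set is an assumption made for this example,
    not the feature set of the model described in the paper.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "uses_https": int(parsed.scheme == "https"),
        "host_is_ip": int(bool(IPV4_RE.match(host))),
        "has_at_symbol": int("@" in url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        # Number of non-empty path segments, e.g. /a/b -> 2.
        "path_depth": len([p for p in parsed.path.split("/") if p]),
    }


if __name__ == "__main__":
    for u in (
        "https://www.example.com/login",
        "http://192.168.0.1/secure-update/@verify",
    ):
        print(u, extract_url_features(u))
```

Feature dictionaries of this kind could then be converted to numeric vectors and passed to a random forest classifier for the classification phase. Which features are kept, and how they are weighted, would depend on the dataset and is not specified here.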

II. INTRODUCTION

In earlier years there was little online business, trading, or exposure, so the online systems in place at that time faced little threat. However, in the past five years the world has experienced a boom in the IT sector, and most daily operations, from shopping to banking, have moved online. The term "phishing" was coined in 1996 by hackers who hijacked America Online (AOL) accounts by stealing passwords from unsuspecting AOL users. The word "phishing", as in the phrase "website phishing", is a variation of the word "fishing". The idea is that the attacker, like an angler, casts bait in the hope that the user will bite. In most cases the bait is an e-mail or an instant message that takes the user to a hostile phishing website. Over the years, phishing attacks have grown in both number and intensity.

Phishing attacks now target users of online banking, payment services such as PayPal, and e-commerce sites. Phishing can be carried out through different channels, which gives rise to various types such as vishing (voice phishing), smishing (phishing via SMS), whaling, mishing (mobile phishing), social engineering, and spear phishing. A typical phishing attack usually has four phases: preparation, mass broadcast, mature, and account hijack.

III. PREVIOUS WORK

Peer-reviewed journals and conferences have published a range of studies on phishing website detection. One proposed approach is multi-level classification for phishing URL filtering, in which the author presents an innovative method for extracting phishing URL features based on message content weighting [3]. Multiple classification algorithms, including SVM, AdaBoost, and Naive Bayes, are used. These algorithms are arranged in three layers using 21 fixed individual features [3]. A two-step process is then carried out using a different classification algorithm. The drawbacks are the time, complexity, and overhead involved, along with performance issues, so this was not the best method.

Another method, adopted by an author of a 2017 IEEE paper, was pattern recognition, i.e., various features are extracted from emails to obtain a model that helps distinguish between phishing and non-phishing messages [4]. The main technique in this context is to detect attacks using feature extraction and classification. The main limitation of this proposal is that it evaluates too many characteristics without considering whether they are really important for identifying phishing, which can lead to unnecessary computational cost. According to the Institute of Research Engineers and Doctors, USA,

© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 758

