DETECTION OF PHISHING WEBSITES USING MACHINE LEARNING by IRJET Journal

International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 11 Issue: 04 | Apr 2024

p-ISSN: 2395-0072

www.irjet.net

DETECTION OF PHISHING WEBSITES USING MACHINE LEARNING

R. Karthikeyan1, C. Abirami2, R. Ramya3, B. Sneha4, U. Uma Azhagu Sudha5

1,2,3,4,5Dept. of Computer Science and Engineering, Government College of Engineering Srirangam, Tamil Nadu, India

Abstract - Phishing is a widespread form of cyber crime that uses disguised websites to trick users into downloading malware or disclosing sensitive personal information to attackers. With the rapid development of artificial intelligence, researchers in the cyber security field use machine learning algorithms to classify phishing websites. To detect phishing websites, the Random Forest algorithm is used because it gives high accuracy compared with other machine learning algorithms. A dataset containing both malicious and legitimate URLs is collected to train a Random Forest classifier, which then predicts whether a website is phishing so that users can be prevented from falling victim to online scams.

1. INTRODUCTION

In today's digital world, where the internet is an integral part of our daily lives, online security has become an important concern. Phishing, a form of cybercrime, poses a significant threat to individuals, businesses, and organizations worldwide. Phishing attacks are fraudulent attempts to obtain sensitive information such as usernames, passwords, and financial data by masquerading as a trustworthy entity in electronic communication. The consequences of falling victim to phishing can be severe, ranging from financial loss and identity theft to compromised personal and corporate data. As technology has advanced, phishing techniques have become increasingly sophisticated, making it harder to distinguish legitimate websites from malicious ones. The ability to detect phishing websites accurately and efficiently is therefore crucial in safeguarding against cyber threats. This detection process involves analyzing the lexical features and characteristics of websites to identify indicators of phishing or other malicious activity.

In this paper, we examine the methodologies and techniques employed in the detection of phishing websites. By understanding the strategies used by cyber criminals and leveraging innovative detection techniques, we aim to empower users and organizations to mitigate the risks posed by phishing attacks.

2. RELATED WORKS

[1] Researchers have investigated different feature sets and selection techniques to improve the performance of phishing detection systems. Studies have explored the importance of features such as URL characteristics, domain registration information, website content analysis, SSL certificate attributes, and user-related factors in distinguishing between phishing and legitimate websites. Feature selection methods, including information gain, the chi-square test, and recursive feature elimination, have been employed to identify the most discriminating features for classification.

[2] Some research focuses on analyzing the dynamic behavior of website interactions, including mouse movements, keystrokes, and navigation patterns, to detect phishing attempts in real time. These approaches aim to capture subtle indicators of malicious intent that may not be apparent from static website features alone, making detection systems more adaptable and responsive to evolving phishing tactics.

[3] Hybrid and multi-modal detection systems integrate multiple sources of information, such as website content, network traffic, user behavior, and reputation data, to improve the comprehensiveness and effectiveness of phishing detection. These approaches combine the strengths of different detection techniques and data sources to improve detection accuracy and resilience to the evasion strategies employed by cyber criminals.

[4] With the growing sophistication of phishing attacks, researchers have explored adversarial machine learning techniques to make detection systems more robust against evasion attempts. Adversarial training, robust feature representation learning, and generative adversarial networks (GANs) are among the approaches investigated to defend against adversarial manipulation and stealthy phishing tactics.

By building upon and extending the insights gained from related works in the field, the proposed system aims to advance the state of the art in phishing website detection, addressing key challenges and limitations to enhance cybersecurity resilience in the face of evolving threats.
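As a rough illustration of the lexical URL analysis and the information-gain feature selection mentioned in this paper, the following Python sketch extracts a handful of lexical features from URL strings and ranks them by information gain on a tiny labeled sample. The feature names, the helper functions `lexical_features` and `information_gain`, and the four sample URLs are all assumptions made for illustration; they are not the feature set or dataset used in this work.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse

# Matches a dotted-quad host such as 192.168.10.5 (no range check; illustrative only)
IPV4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def lexical_features(url: str) -> dict:
    """Compute a few simple lexical features from a URL string (hypothetical set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "has_at_symbol": int("@" in url),
        "has_ip_host": int(bool(IPV4_RE.match(host))),
        "uses_https": int(parsed.scheme == "https"),
    }


def entropy(labels) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels) -> float:
    """Information gain of a discrete feature with respect to the class labels."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [y for x, y in zip(values, labels) if x == v]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


# Hypothetical toy sample: label 1 = phishing, 0 = legitimate
samples = [
    ("http://192.168.10.5/login/verify", 1),
    ("http://secure-paypa1-update.example.com/@signin", 1),
    ("https://www.example.org/about", 0),
    ("https://docs.example.net/guide/index.html", 0),
]

rows = [lexical_features(url) for url, _ in samples]
labels = [label for _, label in samples]

# Rank features by how much they reduce uncertainty about the label
gains = {name: information_gain([r[name] for r in rows], labels) for name in rows[0]}
for name, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {gain:.3f}")
```

With only four samples the ranking is not meaningful in itself. In a realistic setting, feature vectors of this kind would be computed over a full dataset of malicious and legitimate URLs and passed to a classifier such as Random Forest.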

Impact Factor value: 8.226

ISO 9001:2008 Certified Journal
