International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 12 Issue: 3 | Mar-2025

p-ISSN: 2395-0072

www.irjet.net

Mischievous URLs Detection Based on Multi-Feature Using Soft Voting Classifier

Rini Gupta1, Prof. Nagendra Kumar2, Prof. Shobha Rajak3

1Research Scholar, Department of CSE, Shri Ram Institute of Science and Technology, Jabalpur, M.P.

2Professor, Department of CSE, Shri Ram Institute of Science and Technology, Jabalpur, M.P.

3Professor, Department of CSE, Shri Ram Institute of Science and Technology, Jabalpur, M.P.

-----------------------------------------------------------------------***-------------------------------------------------------------------

Abstract – URLs allow Internet users to move from one website to another. They provide access to content stored on servers anywhere in the world. A URL is reached simply by clicking a link or image or by typing it into a browser. A favourite method of attackers and script kiddies is to deceive users on social media, because regular users still click on any link or visit any URL they find. Blocking malicious URLs is a fundamental and essential way to provide a basic level of security. With the advent of internet technology, network security is under various threats. In particular, cybercriminals can spread malicious URLs to carry out attacks such as stealing sensitive information and distributing spam. Detecting malicious URLs is vital in preventing such attacks. Cybercriminals use malicious URLs as distribution channels to spread malicious software on the web. Attackers exploit browser vulnerabilities to install malicious software so that they can access the victim's computer remotely. A malware program aims to gain access to the network, exfiltrate sensitive information, and secretly monitor targeted computer systems. In this work, we compare models to find the best predictive model for classifying spam (malicious) and ham (benign) URLs. Our proposed prediction model seeks to improve prediction accuracy by using multiple features that take into account the interaction effects of different parameters.

Keywords: Uniform resource locators, Phishing, Diversity, Machine learning, Feature Engineering, Soft Voting Classifier, Accuracy.

1. INTRODUCTION

Phishing attacks are a form of cybercrime that uses social engineering to deceive users and steal their information, such as personal identity and financial information. Masquerading as legitimate sources, attackers can reach victims by sending fraudulent messages through email (such as Gmail, Outlook, etc.) or social media platforms (like Twitter, Facebook, etc.). Users become vulnerable if they enter their information or download attached files [1]. In recent years, attacks on social media platforms have increased, since it is easy for attackers to reach many users anywhere in the world by posting a single message [2]. According to [2], the Anti-Phishing Working Group (APWG) reports that the number of phishing attacks increased by 250,000 in one month in January 2021. In addition, the number of business compromises increased 56% from the last quarter of 2020 to the first quarter of 2021. Fig. 1.1 shows that the most targeted industries in 2021 are financial institutions, social media, and web emails [2].

Figure 1.1: APWG report 2023 [2].

According to Figure 1.1, the main goal of attackers is to steal victims' financial or personal information by targeting financial markets and social media platforms, respectively. Attackers can also send malware that can lead to other network attacks, such as malware attacks, ransomware attacks, etc. Most organizations now rely on human knowledge to detect these attacks [4]. However, because legitimate and fake messages look so similar, phishing attacks are difficult to detect even for experts. Cybersecurity experts are therefore paying more attention to email features such as uniform resource locators (URLs) or email IDs to identify phishing emails. Attackers, in turn, keep refining their techniques to create phishing attacks that are harder to detect. For example, they create phishing URLs such as https://www.faceb00k.com/ and https://www.facebook, along with web pages, that imitate harmless URLs such as https://www.facebook.com/. It is therefore important to find ways to distinguish phishing URLs from harmless URLs. Researchers have proposed various solutions against phishing in recent years, such as blacklists [5], traditional machine learning [6], and deep learning (DL) [7], [8], [9].

Impact Factor value: 8.315

Below we provide a brief review of each solution.

• A blacklist is a list of website URLs that are most likely to be phishing sites. Any URL or IP address on this list is blocked. However, this approach has a drawback: the system must already know a phishing URL in order to block it, so a URL that is not on the list goes undetected.
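The drawback of the blacklist approach can be illustrated with a minimal Python sketch. It is not the implementation of any cited system: the blacklist contents, the helper names `host_of` and `is_blocked`, the variant host `facebo0k.com`, and the documentation-range IP address `203.0.113.7` are all assumptions made for illustration. Only `faceb00k.com` comes from the example given earlier in the text.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist entries, for illustration only.
BLACKLIST = {"faceb00k.com", "203.0.113.7"}


def host_of(url):
    """Return the lowercase host of a URL, without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_blocked(url, blacklist=BLACKLIST):
    """Block a URL only if its host or IP is already on the blacklist."""
    return host_of(url) in blacklist


if __name__ == "__main__":
    # A known phishing host is blocked.
    print(is_blocked("https://www.faceb00k.com/"))
    # A new look-alike host (assumed here) is not on the list, so it passes.
    print(is_blocked("https://www.facebo0k.com/"))
```

Because the lookup is an exact match on the host, any newly registered look-alike domain passes until someone adds it to the list, which is the limitation that motivates learning-based detection.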

ISO 9001:2008 Certified Journal

Page 1136