CREDIT CARD FRAUD DETECTION AND AUTHENTICATION SYSTEM USING MACHINE LEARNING


Assistant Professor, Dept. of Information Science and Engineering, Bangalore
Students, Dept. of Information Science and Engineering, Bangalore, Karnataka, India

Abstract - This project's objective is to create a reliable system that can detect fraudulent transactions and authenticate credit card users before a transaction is completed. The increasing use of credit cards has led to a rise in fraudulent activity, making it crucial to develop a system that can identify and prevent it. The system will use machine learning algorithms such as decision trees, random forests, and k-nearest neighbours (KNN) to analyze transactional data and detect suspicious activity. It will be trained on a dataset containing information on past fraudulent activities so that it can identify patterns and recognize similar fraudulent transactions. Furthermore, the system will implement several authentication methods, including face recognition and one-time passwords, to verify the legitimacy of the credit card user. The system is expected to increase the security of credit card transactions by detecting fraud and authenticating users before a fraudulent transaction takes place, leading to significant savings for consumers and financial institutions.

Key Words: Fraudulent transactions, authentication, one-time passwords, face recognition, KNN Algorithm, Random Forest, Naïve Bayes.

1. INTRODUCTION

This research work focuses on the identification of fraudulent transactions made through credit cards. The objective is to create a fraud detection algorithm that can accurately and efficiently classify transactions as either fraudulent or non-fraudulent by utilizing machine-learning-based classification algorithms. With the increasing use of online payments and the decline of cash payments, fraudsters are taking advantage of the anonymity provided by these transactions. Online payments, in particular, require only the card number, expiration date, and CVV, making it easy for data to be lost or stolen without the cardholder's knowledge. In some cases, cardholders may not even be aware that their information has been compromised through phishing techniques and used for fraudulent purchases.

This highlights the importance of keeping card details private, though there are instances where this is not possible due to the prevalence of phishing sites and cases of lost or stolen cards.

One effective way to determine the legitimacy of a transaction is to analyze the spending patterns of the cardholder using existing data and applying machine learning algorithms. This can help identify anomalies in spending that may indicate fraudulent activity. There are various types of credit card fraud, including online and offline fraud, card theft, data phishing, application fraud, and telecommunication fraud. It is essential to address these different types of fraud to prevent fraudulent transactions and protect cardholders' data.

2. LITERATURE REVIEW

Parth Roy, Prateek Rao, and Jay Gajre [1] suggested using machine learning to identify fraudulent MasterCard transactions. The strategies are used to find the best solution to fraud detection problems. Methods for reducing false alarm rates and increasing the rate at which fraud is discovered are still being proven. The card transaction data covers 284,807 transactions by European cardholders. A modified version of these methods can be applied to a bank's credit card fraud detection system to help identify and stop fraud.

Ishika Sharma, Shivjyoti Dalai, Venktesh Tiwari, Ishwari Singh, and Seema Kharb [2] presented various techniques, such as Naive Bayes, Random Forest, and Logistic Regression, that are utilized to tackle this problem. Each technique is evaluated individually, and whichever works best is carried out. The primary purpose is to detect fraud by filtering the aforementioned strategies in order to achieve a better outcome.

Anuruddha Thennakoon, Chee Bhagyani, Sasitha Premadasa, Shalitha Mihiranga, and Nuwan Kuruwitaarachchi [3] offered an evaluation that serves as a thorough guide for choosing the best algorithm for each kind of fraud, and they use an appropriate performance metric to present the evaluation. In order to determine whether a particular transaction is legitimate or fraudulent, they also made use of predictive analytics performed by the implemented machine learning models and an API module.

H. Najadat, O. Altiti, A. A. Aqouleh, and M. Younes [4] carried out a thorough experimental investigation of solutions to the imbalanced classification problem. They investigated these solutions and the machine learning fraud detection methods, pinpointed their flaws, and summarised the findings using a labelled credit card fraud dataset.

International Research Journal of Engineering and Technology (IRJET), Volume: 10 Issue: 04 | Apr 2023, www.irjet.net, e-ISSN: 2395-0056, p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 1203

S. Makki, Z. Assaghir, Y. Taher, R. Haque, M.-S. Hacid, and H. Zeineddine [5] presented various machine-learning-based classification techniques, including Naive Bayes, Random Forest, and Logistic Regression, for handling the severely skewed dataset. Accuracy, precision, recall, F1-score, confusion matrix, and ROC-AUC score are all calculated as part of the research project.
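The metrics named above can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch in plain Python, not code from any of the cited works; the function names and the toy labels are assumptions made for illustration. It also shows why accuracy alone is misleading on a skewed dataset.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels: 2 frauds among 10 transactions.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]   # one false alarm, one missed fraud
always_legit = [0] * 10                    # a model that never flags fraud

metrics = classification_metrics(y_true, y_pred)
baseline = classification_metrics(y_true, always_legit)
```

On these toy labels both predictors reach the same accuracy, but the baseline that never flags fraud has a recall of zero, which is why precision, recall, and F1-score are reported alongside accuracy for skewed data.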

D. Tanouz, R Raja Subramanian, D. Eswar, G V Parameswara Reddy, A. Ranjith kumar, and CH V N M Praneeth [6] proposed the usage of Random Forest, Gradient Boost, Support Vector Machine, and their combinations. The algorithms' efficacy varies depending on the dataset and the application. They demonstrate that, in spite of all computations, all algorithms exhibit some degree of imbalance at some point in the study. When learning curves were plotted, it was discovered that, while logistic regression had a higher accuracy, KNN was the only algorithm that learned well, while the majority of the algorithms underfit. They therefore conclude that KNN is a stronger classifier for identifying credit card fraud.

MJ Madhury, H L Gururaj, B C Soundarya, K P Vidyashree, and B Rajendra [7] provided a variety of machine-learning-based methods for credit card fraud detection, including XGBoost, Decision Tree, Random Forest, Support Vector Machine, and Extreme Learning Method. To get effective results, a comparative examination of both machine learning and deep learning algorithms was done. A thorough empirical investigation was conducted using the European card benchmark dataset for fraud detection. The dataset was first subjected to a machine learning technique, which somewhat increased the accuracy of fraud detection. The evaluation of the research effort demonstrates the enhanced outcomes obtained, with optimised values for accuracy, F1-score, precision, and AUC of 99.9%, 85.71%, 93%, and 98%, respectively. For credit card fraud detection, the suggested model performs better than cutting-edge machine learning and deep learning techniques.

Fawaz Khaled Alarfaj, Iqra Malik, Hikmat Ullah Khan, Naif Almusallam, Muhammad Ramzan, and Muzamil Ahmed [8] created 13 statistical and machine learning models for payment card fraud detection using both publicly available and actual transaction records. The outcomes from both the original features and the aggregated features are analysed and compared. A statistical hypothesis test is performed to determine whether the aggregated features produced by a genetic algorithm have greater discriminative ability than the original features in detecting fraud. The results demonstrate that employing aggregated features to tackle real-world payment card fraud detection is effective.

Manjeevan Seera, Chee Peng Lim, Ajay Kumar, Lalitha Dhamotharan, and Kim Hua Tan [9] proposed a method to determine whether transactions in the Kaggle-provided IEEE-CIS Fraud Detection dataset were genuine or fraudulent. Bidirectional Long Short-Term Memory (BiLSTM) and bidirectional Gated Recurrent Unit (BiGRU) are the foundations of the model, which is called BiLSTM-MaxPooling-BiGRU-MaxPooling. They also used the following six machine learning classifiers: Logistic Regression, Naive Bayes, Voting, AdaBoost, Random Forest, and Decision Tree. Comparing the outcomes of the machine learning classifiers with those of the proposed model shows that the model performed better, achieving a score of 91.37%.

Sailusha Ruttala; Gnaneswar V.; Ramesh R.; Rao, G. Ramakoteswara [10] used the AdaBoost and random forest algorithms. The two algorithms are compared on F1-score, accuracy, precision, recall, and other metrics. The ROC curve is plotted on the basis of the confusion matrix. When the Random Forest and AdaBoost algorithms are compared, the one with the highest accuracy, precision, recall, and F1-score is regarded as the best for spotting fraud.

Shiyang Xuan, Guanjun Liu, Zhenchuan Li, Shuo Wang, Lutao Zheng, and Changjun Jiang [11] discussed the use of two alternative random forest models to learn the behavioural characteristics of normal and anomalous transactions. They compare the two random forests, which differ in their base classifiers, and assess how well they detect credit card fraud.

3. THE OBJECTIVE OF THE PROJECT

The specific objectives of this project are to:

1. Develop a fraud detection system that can accurately identify and classify fraudulent and non-fraudulent credit card transactions using machine learning techniques.

2. Implement a reliable authentication system that uses various authentication methods, such as face recognition and one-time passwords, to verify the legitimacy of the credit card user.

3. Train the system on a dataset of past fraudulent activities to identify patterns and recognize similar fraudulent transactions, thereby increasing the accuracy of the fraud detection algorithm.

4. Increase the security of credit card transactions by detecting and preventing fraudulent activities before they occur, leading to significant savings for consumers and financial institutions.

5. Keep up with advancements in technology and adapt the system to new fraudulent activities and authentication methods to maintain its effectiveness.


4. PROPOSED METHODOLOGY


Our study suggests a novel use of sophisticated machine learning algorithms to identify outliers, that is, false or unusual events. To achieve this, we used a Kaggle dataset made up of transaction records as our training data. The dataset consists of the features V1-V28, which are PCA components (used because of confidentiality concerns), together with Time and Amount as variables, and a class label of 0 or 1, where 0 indicates no fraud and 1 indicates fraud.

The figure (Figure 4.1) shows the sequence of critical steps in the creation of the proposed model. The task of classifying a transaction is completed after steps such as data processing, decision making, face detection, and OTP authentication.
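The OTP authentication step could be sketched as follows. This is only an illustrative outline under stated assumptions, not the authors' implementation: the in-memory `store`, the 6-digit code length, the 120-second validity window, and the function names are all hypothetical, and delivery of the code to the cardholder (for example by SMS or e-mail) is left out.

```python
import hmac
import secrets
import time

OTP_TTL_SECONDS = 120  # assumed validity window, not taken from the paper


def issue_otp(store, user_id, now=None):
    """Create a random 6-digit code for the user and remember its expiry."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"
    store[user_id] = (code, now + OTP_TTL_SECONDS)
    return code


def verify_otp(store, user_id, submitted, now=None):
    """Check a submitted code; each issued code allows a single attempt."""
    now = time.time() if now is None else now
    entry = store.pop(user_id, None)   # removed on first attempt, right or wrong
    if entry is None:
        return False
    code, expires_at = entry
    if now > expires_at:
        return False
    # Constant-time comparison avoids leaking how many digits matched.
    return hmac.compare_digest(code, submitted)


store = {}
code = issue_otp(store, "cardholder-1", now=1000.0)
first_try = verify_otp(store, "cardholder-1", code, now=1010.0)
replay = verify_otp(store, "cardholder-1", code, now=1011.0)
```

Here `first_try` succeeds and `replay` fails, because the code is discarded after its first use. The `now` parameter exists only so that the expiry logic can be exercised without waiting.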

The system receives data from various sources, such as credit card transactions, user account information, and device information. The input data is pre-processed to normalize and standardize the data. This may involve converting data into a specific format, identifying missing values, and removing any irrelevant information.
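A minimal sketch of such a pre-processing step is given below, using only the Python standard library. The field names (`Amount`, `Time`, `note`), the choice of mean imputation for missing values, and the use of z-score standardization are assumptions made for illustration; the paper does not specify them.

```python
from statistics import mean, pstdev


def preprocess(rows, numeric_fields, drop_fields=()):
    """Drop irrelevant fields, fill missing values, and standardize.

    Assumes every numeric field has at least one non-missing value.
    """
    # Remove fields that carry no information for fraud detection.
    cleaned = [{k: v for k, v in row.items() if k not in drop_fields}
               for row in rows]
    for field in numeric_fields:
        present = [r[field] for r in cleaned if r.get(field) is not None]
        mu = mean(present)
        sigma = pstdev(present)
        for r in cleaned:
            value = r.get(field)
            if value is None:
                value = mu                      # mean imputation (assumed)
            # z-score; a constant column is mapped to 0.0
            r[field] = (value - mu) / sigma if sigma else 0.0
    return cleaned


# Hypothetical raw records with one missing amount and a free-text field.
raw = [
    {"Amount": 10.0, "Time": 0.0, "note": "x"},
    {"Amount": None, "Time": 1.0, "note": "y"},
    {"Amount": 30.0, "Time": 2.0, "note": "z"},
]
prepared = preprocess(raw, numeric_fields=["Amount", "Time"],
                      drop_fields=["note"])
```

After this step each numeric column has zero mean and comparable scale, which matters for distance-based classifiers such as KNN.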

In order to find patterns and anomalies in the data, the pertinent characteristics are retrieved from the pre-processed data. To find potential fraudulent transactions, the collected features are analyzed using a variety of methods, including rule-based systems, anomaly detection, and machine learning algorithms. The system authenticates legitimate users using various techniques such as face detection, two-factor authentication, and device recognition. Then, based on the fraud detection and authentication results, the system makes a decision on whether to approve or reject a transaction.
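One way the final approve-or-reject decision could combine the two results is sketched below. The policy, the two thresholds, and the names `AuthResult` and `decide` are hypothetical and chosen only to illustrate the idea; the paper does not state the exact rule.

```python
from dataclasses import dataclass


@dataclass
class AuthResult:
    face_verified: bool   # outcome of the face check
    otp_verified: bool    # outcome of the one-time password check


def decide(fraud_probability, auth, review_threshold=0.5, reject_threshold=0.9):
    """Combine the model's fraud score with the authentication outcome."""
    if fraud_probability >= reject_threshold:
        # Very likely fraud: reject regardless of authentication.
        return "reject"
    if fraud_probability >= review_threshold:
        # Suspicious: require both authentication factors.
        return "approve" if auth.face_verified and auth.otp_verified else "reject"
    # Low risk: a single successful factor is enough.
    return "approve" if auth.face_verified or auth.otp_verified else "reject"


suspicious_one_factor = decide(0.6, AuthResult(face_verified=True, otp_verified=False))
suspicious_two_factor = decide(0.6, AuthResult(face_verified=True, otp_verified=True))
```

With these assumed thresholds, a suspicious transaction backed by only one factor is rejected, while the same transaction is approved once both factors succeed.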

The system informs the user and the company of the transaction's status, including whether it was accepted or refused. In order to increase the accuracy and efficiency of the fraud detection and authentication system, the system generates reports and analyses the data to find trends and patterns.

5. ALGORITHMS USED TO DETECT THE FRAUDULENT TRANSACTIONS

In order to extract the data and find trends, such as fraudulent transactions, we train our system using a variety of machine learning approaches after building the heatmap. To predict a fraudulent transaction in the future, we leverage the patterns that distinguish the legitimate transactions from the fraudulent transactions currently present in the dataset. We then employed a variety of algorithms, including those listed below, to construct and train a model aimed at identifying fraudulent activities in credit card transactions:

Figure 4.1. Flowchart of the system
Figure 4.2. The dataset used for the research
Figure 4.3. The heatmap drawn for finding correlation
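A correlation heatmap such as the one in Figure 4.3 is a colour-coded view of a matrix of pairwise correlation coefficients. The sketch below computes such a matrix with Pearson's coefficient in plain Python; the choice of Pearson correlation, the helper names, and the tiny example columns are assumptions for illustration, and the plotting step itself is omitted.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0          # a constant column has no defined correlation
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map every pair of column names to their correlation."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical miniature dataset with one PCA feature, Amount and Class.
columns = {
    "V1": [4.0, 3.0, 2.0, 1.0],
    "Amount": [1.0, 2.0, 3.0, 4.0],
    "Class": [0, 0, 1, 1],
}
corr = correlation_matrix(columns)
```

Entries of `corr` close to +1 or -1 in the `Class` row point to features that move together with the fraud label in this toy data.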

5.1. Random Forest classifier:

A classification system called Random Forest uses a number of decision trees, each of which is constructed using a different subset of the dataset. The algorithm then aggregates the outcomes of each decision tree's predictions to produce a final outcome. When compared to employing a single decision tree, this method is intended to increase the predictability of results. The final output is predicted by the algorithm using majority voting, which means that the prediction that received the greatest support from the decision trees is chosen as the result. More trees in the forest can be used to improve accuracy and lessen the problem of overfitting.
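The two ideas above, training each tree on a different subset and combining the trees by majority vote, can be illustrated with a deliberately simplified sketch. It uses one-split trees (stumps) trained on bootstrap samples; a full random forest grows deeper trees and also samples features at each split. The data, the `[amount, hour]` feature layout, and the function names are hypothetical.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a one-split tree: pick the feature/threshold with fewest errors."""
    best = None
    n_features = len(samples[0][0])
    for f in range(n_features):
        for threshold in sorted({x[f] for x, _ in samples}):
            left = [y for x, y in samples if x[f] <= threshold]
            right = [y for x, y in samples if x[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = (Counter(right).most_common(1)[0][0]
                           if right else left_label)
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    _, f, threshold, left_label, right_label = best
    return lambda x: left_label if x[f] <= threshold else right_label


def train_forest(samples, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(samples) for _ in samples]
        forest.append(train_stump(bootstrap))
    return forest


def predict(forest, x):
    """Majority vote over the predictions of all trees."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


# Hypothetical transactions: ([amount, hour of day], label), 1 = fraud.
samples = [
    ([20.0, 10], 0), ([35.0, 14], 0), ([15.0, 9], 0), ([50.0, 18], 0),
    ([40.0, 12], 0), ([900.0, 3], 1), ([1200.0, 2], 1), ([1000.0, 4], 1),
]
forest = train_forest(samples)
large_night_purchase = predict(forest, [1100.0, 3])
small_day_purchase = predict(forest, [10.0, 11])
```

Because every tree sees a slightly different sample, individual trees may disagree; the vote in `predict` is what turns them into a single, more stable decision.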

5.2. Naïve Bayes algorithm:

The theory behind this algorithm relies on two fundamental assumptions. Firstly, it assumes that each feature in a given input carries equal weight towards the objective of classification. Secondly, it assumes that all attributes provided are statistically independent, meaning that the values of each attribute do not provide any information about the values of other attributes. While these assumptions may hold true in some scenarios, they may not always be applicable in practice. To address such cases, the Bayes rule is employed to determine whether a given input is fraudulent or legitimate. The rule computes the probability of an input belonging to each possible class and assigns the predicted class based on the one with the highest probability. By utilizing this approach, the system can make more accurate predictions even when the independence assumption between attributes does not hold.

5.3. Decision tree algorithm:

A decision tree is a hierarchical structure that resembles a tree, where every internal node, including the root node, represents a "test" on an attribute of the dataset's instances. The results of each test are represented by the corresponding branches, which connect to further internal nodes or to leaf nodes, which stand for the class labels. To build this kind of tree, the following three inputs are necessary: Target Attribute is the attribute that represents the class label. Instances is a collection of instances for which the class label is already known. Attributes List is a collection of predictor attributes used to create the decision tree.

5.4. Logistic Regression:

A categorical dependent variable is analyzed in relation to one or more independent variables using the statistical procedure known as logistic regression. It is a common use of regression analysis in machine learning and predictive modelling. The dependent variable in logistic regression is a binary or dichotomous variable, meaning it can only have two possible outcomes, such as yes or no, true or false, success or failure, etc. The independent variables may be categorical or continuous. Based on the values of the independent variables, logistic regression seeks to determine the likelihood that the dependent variable will fall into one of the two groups. A probability value between 0 and 1 is the result of logistic regression, which can be transformed into one of the two class labels.
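To make the logistic regression model of this section concrete, the sketch below fits it with plain batch gradient descent and turns the resulting probability into a class label with a 0.5 cut-off. The learning rate, the number of epochs, the cut-off, and the single standardized feature are assumptions for illustration, not settings reported in this work.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function, mapping any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Probability that input x belongs to class 1 (fraud)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi   # derivative of log-loss wrt z
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical standardized amounts; label 1 = fraud.
X = [[-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5]]
y = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(X, y)

p_high = predict_proba(w, b, [1.2])
p_low = predict_proba(w, b, [-1.2])
label_high = 1 if p_high >= 0.5 else 0   # 0.5 cut-off is an assumed choice
label_low = 1 if p_low >= 0.5 else 0
```

In a fraud setting the cut-off would typically be tuned rather than fixed at 0.5, since missing a fraud and raising a false alarm carry different costs.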

6. ADVANTAGES

• Enhanced Security: A credit card fraud detection and authentication system can provide an extra layer of security to protect against fraudulent transactions. It can use various techniques such as machine learning, artificial intelligence, and facial recognition to identify and authenticate legitimate users while blocking unauthorized access.

• Real-time Monitoring: Such systems can monitor credit card transactions in real time, which enables quick identification and prevention of fraudulent activities. This is especially important for e-commerce transactions, where speed is critical.

• Lessened Financial Losses: Both consumers and businesses can suffer large financial losses as a result of credit card fraud. By recognising and preventing fraudulent transactions from taking place, an efficient fraud detection and authentication system can aid in the prevention of these losses.

• Improved Customer Trust: When customers know that their credit card information is secure and protected, they are more likely to trust the company they are dealing with. This can result in improved customer loyalty and repeat business.

• Regulation Compliance: Credit card fraud detection and authentication tools can assist companies in meeting security standards like the Payment Card Industry Data Security Standard (PCI DSS) and other legal obligations. Penalties and other legal problems may be avoided in this way.


7. CONCLUSION

To sum up, the goal of this project is to develop a reliable system that can identify fraudulent transactions and authenticate credit card users. The need for a system that can identify such activities and stop them in their tracks has increased due to the rising prevalence of credit card theft. The suggested system will examine transactional data and look for suspicious behaviour using a variety of machine learning algorithms, including decision trees, random forests, and more. To identify trends and recognize similar fraudulent transactions, it will be trained on a dataset of prior fraudulent acts. In order to confirm the legitimacy of the credit card user, the system will also use a variety of authentication techniques, such as face detection, one-time passwords, and security questions. The introduction of this system is anticipated to enhance the security of credit card transactions by identifying fraud and authenticating users beforehand, while also resulting in significant cost savings for both customers and financial institutions.

8. FUTURE ENHANCEMENTS

In the future, the proposed system could be enhanced by incorporating advanced machine learning algorithms and techniques, such as deep learning and natural language processing, to better analyze and interpret transactional data. Additionally, the system could be integrated with real-time fraud detection mechanisms that monitor transactions as they occur and identify suspicious activities immediately. Furthermore, the system could leverage blockchain technology to enhance security and prevent fraudulent activities by creating an immutable and transparent record of all transactions. Finally, the system could be extended to support more diverse payment methods and authentication mechanisms to accommodate the evolving needs of consumers and businesses. These enhancements are expected to improve the effectiveness and efficiency of the system, leading to even greater savings for consumers and financial institutions while further reducing the risk of credit card fraud.

9. REFERENCES

[1] Parth Roy, Prateek Rao, Jay Gajre, "Comprehensive Analysis for Fraud Detection of Credit Card through Machine Learning", 2021 International Conference on Emerging Smart Computing and Informatics (ESCI), March 2021.

[2] Ishika Sharma, Shivjyoti Dalai, Venktesh Tiwari, Ishwari Singh, Seema Kharb, "Credit Card Fraud Detection Using Machine Learning & Data Science", International Research Journal of Engineering and Technology (IRJET), Vol. 09, Issue 06, Jun 2022.

[3] Anuruddha Thennakoon, Chee Bhagyani, Sasitha Premadasa, Shalitha Mihiranga, Nuwan Kuruwitaarachchi, "Real-time Credit Card Fraud Detection Using Machine Learning", IEEE 9th International Conference on Cloud Computing, Data Science & Engineering, 2019.

[4] H. Najadat, O. Altiti, A. A. Aqouleh, and M. Younes, "Credit card fraud detection based on machine and deep learning", in Proc. 11th Int. Conf. Inf. Commun. Syst. (ICICS), Apr. 2020.

[5] S. Makki, Z. Assaghir, Y. Taher, R. Haque, M.-S. Hacid, and H. Zeineddine, "An experimental study with imbalanced classification approaches for credit card fraud detection", IEEE Access.

[6] D. Tanouz, R Raja Subramanian, D. Eswar, G V Parameswara Reddy, A. Ranjith kumar, CH V N M Praneeth, "Credit Card Fraud Detection Using Machine Learning", Fifth International Conference on Intelligent Computing and Control Systems, 2021.

[7] MJ Madhury, H L Gururaj, B C Soundarya, K P Vidyashree, B Rajendra, "Exploratory analysis of credit card fraud detection using machine learning techniques", Elsevier B.V., 2022.

[8] Fawaz Khaled Alarfaj, Iqra Malik, Hikmat Ullah Khan, Naif Almusallam, Muhammad Ramzan, and Muzamil Ahmed, "Credit Card Fraud Detection Using State-of-the-Art Machine Learning and Deep Learning Algorithms", Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University through the Research, 2022.

[9] Manjeevan Seera, Chee Peng Lim, Ajay Kumar, Lalitha Dhamotharan, Kim Hua Tan, "An intelligent payment card fraud detection system", Springer Science+Business Media, LLC, part of Springer Nature, 2021.

[10] Sailusha Ruttala; Gnaneswar V.; Ramesh R.; Rao, G. Ramakoteswara, "Credit Card Fraud Detection Using Machine Learning", International Conference on Intelligent Computing and Control Systems, IEEE Xplore, 2020.

[11] Shiyang Xuan, Guanjun Liu, Zhenchuan Li, Shuo Wang, Lutao Zheng, Changjun Jiang, "Random forest for credit card fraud detection", IEEE, 2018.
