CREDIT CARD FRAUD DETECTION USING ARTIFICIAL NEURAL NETWORK (ANN) ALGORITHM

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072

1,2 Department of Computer Science and Engineering, Prince Shri Venkateshwara Padmavathy Engineering College, Ponmar, Chennai

3 Assistant Professor, Department of Computer Science and Engineering, Prince Shri Venkateshwara Padmavathy Engineering College, Ponmar, Chennai

4 Associate Professor, Department of Computer Science and Engineering, Prince Shri Venkateshwara Padmavathy Engineering College, Ponmar, Chennai

Abstract - Due to the rise and rapid growth of e-commerce, the use of credit cards for online purchases has increased dramatically, and this has caused an explosion in credit card fraud. In our project we focus on determining whether a transaction is fraudulent or genuine. For this we use features based only on time, i.e. the transaction data that occurred within a particular period of time. The extracted features are given as input to an Artificial Neural Network (Deep Learning) and a Support Vector Machine (Supervised Learning); the accuracy of each model is measured individually and the two are compared. In this comparison the ANN provides the better accuracy, above 95%, while the SVM provides only 93%. Finally, using the confusion matrix, the total number of transactions, the number of genuine transactions and the number of fraud transactions are displayed as output based on the data given as input.

Key Words: Feature Extraction, Fraud Detection, Online Payment, Credit Card, Deep Learning, Artificial Neural Network (ANN), Machine Learning, Support Vector Machine (SVM).

1. INTRODUCTION

Credit card fraud occurs when a thief uses another person's card without that person's authorization, by stealing necessary information such as the PIN, password and other credentials, with or without the physical card. Using a fraud detection module involving machine learning and deep learning, we can find out whether an incoming transaction is fraudulent or legitimate.

Machine learning is a trending and widely used technology because of its various applications, low time consumption and accurate results. Machine learning is a technology that deals with algorithms which give the computer the capability to learn and improve through experience without being explicitly programmed. Machine learning has applications in multiple fields, for example medical diagnosis and regression.

Machine learning involves the combination of algorithms and statistical models which allow a computer to perform a task without hard coding. A model is built from training data, and the trained model is then tested.

Deep learning is a part of machine learning that makes use of neural networks. Some of the methods that come under deep learning are artificial neural networks, convolutional neural networks, autoencoders, recurrent neural networks, restricted Boltzmann machines, etc.

Deep learning makes use of neural networks, which resemble the human brain in processing data and making decisions. Here we used both deep learning and machine learning techniques, and the deep learning algorithm outperformed the machine learning algorithm in accuracy. For the implementation we used the Python programming language, since it is simple and easy to read, learn and write. We also used the Python libraries and packages NUMPY, Pandas, SCIKIT-LEARN, KERAS, MySQL and TKINTER for data analysis, data manipulation, linear algebraic operations, storage and the Graphical User Interface (GUI).

2. TYPES OF CREDIT CARD FRAUDS

There are different kinds of fraud seen on e-commerce sites. Offline theft and robbery occur near ATMs, while online theft can occur over the internet and mobile phones.

1 Application fraud: The customer's credentials are stolen by the fraudster, who then creates a fake account through which the transactions take place.

2 Electronic or manual card imprints: The fraudster skims the information present on the card and uses the credentials to make fraudulent transactions.

3 Card not present: This is the type where the actual physical card is not present during the transaction.

4 Counterfeit card fraud: All the data from the magnetic strip is copied by the fraudster onto a fake card that looks like the original card, and this card is then used for fraud.

© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page2565

5 Lost/stolen card: This happens when the cardholder loses the card or when the card is stolen from the cardholder.

6 Card ID theft: This happens when the ID of the cardholder is stolen and used to commit fraud.

7 Mail non-received card fraud: When a credit card is issued, a mail is sent to the recipient; fraud can occur at this stage by intercepting the mail or by phishing.

8 Account takeover: The fraudster takes complete control of the account holder's account and commits fraud.

9 Fake fraud in website: The fraudster introduces malicious code into the website, which carries out the fraud on their behalf.

10 Merchant collusion: The merchant shares the details of the cardholder with a third party or the fraudster without the cardholder's authorization.

3. PROBLEM STATEMENT

Nowadays, most people use credit cards to buy goods that they need but cannot afford at the moment. As credit cards are used more to meet these needs, the fraud associated with them is also increasing. So, there is a need to create and implement a model that fits well and predicts with high accuracy.

4. OBJECTIVES

• The main objective of the research is to find fraudulent transactions among credit card transactions.

• Comparison between supervised learning and deep learning; the deep learning algorithm outperformed based on accuracy.

5. EXISTING SYSTEM

The existing systems rely on machine learning algorithms such as Support Vector Machine, Naïve Bayes, k-Nearest Neighbor and so on, and some of them used a random dataset. Very few have used an artificial neural network for credit card fraud detection.

DISADVANTAGES:

• It produces a lot of tables with a relatively small number of columns.

• Data restrictions.

• Requires huge processing time.

6. PROPOSED SYSTEM

In the proposed system we use the Artificial Neural Network to find fraud in credit card transactions. Performance is measured and accuracy is calculated based on the predictions. A classification algorithm, the Support Vector Machine, is also used to build a credit card fraud detection model. We compared both algorithms and concluded that the artificial neural network predicts better than the support vector machine. The model gives the outcome of each transaction as either 0 (normal) or 1 (fraud).

ADVANTAGES:

• Provides high accuracy.

• Corrects the problem of overfitting.

• Added layer of safety.

7. ARCHITECTURE DIAGRAM

Fig 1

8. MODULES

8.1. DATA PREPROCESSING:

This section explains the implementation of the algorithms used in the proposed system. In this paper, the implementation starts with the collection of data (Data Collection). Data pre-processing is then carried out, which includes data cleaning (filling any missing values in the transactions using mean, median or standard deviation techniques) and normalizing the data. The dataset is split into two datasets, train data and test data, and the model is trained and tested to measure the accuracy. Finally, the system predicts whether a transaction is fraud or non-fraud.
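The pre-processing steps described above (filling missing values, normalizing, splitting into train and test data) can be sketched in plain Python. This is only an illustrative sketch using the standard library; the function names, the sample amounts and the 80/20 split are assumptions and are not taken from the paper, whose pseudocode refers to scikit-learn's StandardScaler() and train test split instead.

```python
import random
import statistics


def fill_missing_with_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in column]


def standardize(column):
    """Scale a column to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        return [0.0 for _ in column]
    return [(v - mean) / std for v in column]


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split it into train and test lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical transaction amounts with one missing value.
amounts = [10.0, None, 30.0, 20.0, 40.0]
filled = fill_missing_with_mean(amounts)  # None becomes the mean of 10, 30, 20, 40
scaled = standardize(filled)
train, test = split_train_test(scaled, test_fraction=0.2, seed=0)
```

Fixing the seed makes the split reproducible, which helps when the accuracy of two models is compared on the same test data.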

Fig 2

8.2. FEATURE EXTRACTION

• It is the method of reducing the input variables to our model by using only relevant data and getting rid of noise in the data.

• Here we use features based only on time, i.e. the transactional data that occurred within a particular period of time.

8.3. IMPLEMENTATION:

i) SOFTWARE:

Anaconda Navigator is a desktop graphical user interface (GUI) included in the Anaconda® distribution that allows you to launch applications and easily manage CONDA packages, environments and channels without using command-line commands. Navigator can search for packages on Anaconda Cloud or in a local Anaconda Repository.

ii) PACKAGES AND LIBRARIES USED:

Some of the Python libraries and packages used in the proposed system are as follows:

1. NUMPY

NUMPY is a Python library. NUMPY is short for Numerical Python. The NUMPY package is used for multidimensional arrays and linear algebraic operations.

2. Pandas

Pandas is a Python library used as a data analysis and data manipulation tool. It is used to read and load the dataset. It is fast and flexible when working with data.

3. SCIKIT-LEARN

A Python package suitable for statistical models and machine learning models. It is one of the best-suited Python packages for machine learning modeling.

4. KERAS

KERAS is a high-level neural network application programming interface (API). It is capable of running on top of TensorFlow. KERAS is mainly used for implementing deep learning algorithms such as CNN and RNN because of its user-friendliness, modularity and easy extensibility. It runs on both CPU and GPU. In the experiment of finding fraud or non-fraud credit card transactions we used KERAS with a TensorFlow backend. KERAS with a TensorFlow backend makes an excellent choice for training neural network architectures.

5. MySQL

MySQL is a database used for storage. In the experiment of fraud identification in card transactions we used MySQL to store the user details, namely user name, password, email ID and phone number. When entering the application, the user needs to register by providing these credentials, which are stored in the database. Thereafter, the user logs in by giving the username and password. The application validates the login against the registered information, and the user is then moved to the next window.
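The register-then-login flow described above can be sketched as follows. To keep the sketch self-contained it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL. The table layout, the function names and the sample user are hypothetical. Storing a salted hash instead of the plain password is an assumption added here; the paper does not say how the password is stored.

```python
import hashlib
import hmac
import os
import sqlite3


def hash_password(password, salt):
    """Derive a salted hash so the plain password is never stored (assumption)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(conn, username, password, email, phone):
    """Store the user details captured at registration."""
    salt = os.urandom(16)
    conn.execute(
        "INSERT INTO users (username, salt, pw_hash, email, phone) "
        "VALUES (?, ?, ?, ?, ?)",
        (username, salt, hash_password(password, salt), email, phone),
    )
    conn.commit()


def validate_login(conn, username, password):
    """Return True only if the username exists and the password matches."""
    row = conn.execute(
        "SELECT salt, pw_hash FROM users WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return False
    salt, stored_hash = row
    return hmac.compare_digest(stored_hash, hash_password(password, salt))


# In-memory SQLite database standing in for the MySQL instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (username TEXT PRIMARY KEY, salt BLOB, "
    "pw_hash BLOB, email TEXT, phone TEXT)"
)
register(conn, "alice", "s3cret", "alice@example.com", "0000000000")
```

With a real MySQL server the same parameterized queries would apply, though the connection call and placeholder syntax depend on the driver used.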

6. TKINTER

TKINTER is a Python library used for building a Graphical User Interface (GUI). It can be used on both Unix and Windows platforms. To create a GUI, we import the TKINTER module, create the main window, add one or more widgets, and finally call the main event loop.

iii) PROGRAMMING LANGUAGE USED:

We have used Python as the programming language. Python is a beginner-friendly language with a wide range of applications. In recent years Python has set a new trend because it is an easy-to-use, interpreted, object-oriented, high-level scripting language. It provides rich packages and libraries that are used in machine learning.

iv) CLASSIFICATION TECHNIQUES:

The following algorithms are used in the implementation:

1. Support Vector Machine

2. Artificial Neural Network

Support Vector Machine (SVM):

Support Vector Machine (SVM) is a supervised machine learning algorithm used for both classification and regression. Although it can be applied to regression problems as well, it is best suited for classification.

Fig 3

Pseudocode:

• Importing the necessary packages

Example: import pandas as pd

• def SVM

Step 1: Start

Step 2: Reading the dataset. pd.read_csv(file name) # reads the dataset file

Step 3: Data cleaning and data preprocessing

• Resampling the data as normal and fraud class, i.e. normal = 0 and fraud = 1

• Undersampling of data is done

• Data is scaled (any null values are eliminated) and normalized.

• Dataset is split into two sets, train data and test data, using train_test_split()

Step 4: Training the data using the SVM algorithm

• The SVM classifier is called as classifier.predict() # predicts whether the transaction is fraud or not.

Step 5: Calculating the fraud transactions and valid transactions, then calculating the recall, precision and accuracy

Step 6: STOP

Artificial Neural Network (ANN):

An Artificial Neural Network (ANN) is an efficient computing system whose central theme is borrowed from the analogy of biological neural networks. ANNs are also named "artificial neural systems," "parallel distributed processing systems," or "connectionist systems." An ANN comprises a large collection of units that are interconnected in some pattern to allow communication between the units. These units, also referred to as nodes or neurons, are simple processors which operate in parallel.
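To make the idea of interconnected units concrete, the following is a minimal plain-Python sketch of what one fully connected layer of units computes: each unit takes a weighted sum of its inputs, adds a bias and applies an activation function. The weights, the layer sizes, the sigmoid activation and the 0.5 threshold are illustrative assumptions; they are not the network or parameters used in the paper.

```python
import math


def sigmoid(z):
    """Logistic activation: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases, activation=sigmoid):
    """Compute the output of one fully connected layer.

    weights[j] holds the incoming connection weights of unit j, so each
    unit computes activation(sum_i weights[j][i] * inputs[i] + biases[j]).
    """
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs


# Hypothetical example: 3 input features -> 2 hidden units -> 1 output unit.
features = [0.5, -1.0, 2.0]
hidden = dense_layer(
    features,
    weights=[[0.1, 0.2, 0.3], [-0.3, 0.1, 0.0]],
    biases=[0.0, 0.1],
)
fraud_score = dense_layer(hidden, weights=[[1.0, -1.0]], biases=[0.0])[0]
predicted_class = 1 if fraud_score >= 0.5 else 0  # 1 = fraud, 0 = normal
```

In a trained network the weights and biases are learned from the training data rather than written by hand, and libraries such as KERAS perform the same computation on whole batches at once.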

Pseudocode:

Fig 4

This algorithm has two parts, namely the training part and the testing part.

Training part:

def ANN:

Step 1: Start

Step 2: Loading and observing the dataset

• pd.read_csv(.csv) # reads the dataset

• Resampling of data

• StandardScaler() # scaling and normalization of data

Step 3: Data pre-processing

• train_test_split() # splitting of data

Step 4: Training the model

• Dense() # adds a fully connected layer with its activation function

Step 5: Analyzing the model

• Prediction of fraud is made and the trained model is stored so that it can be used for testing (training the model takes a long time, so it is stored).

Step 6: Stop

Testing part:

def ANN

It is carried out in a similar way; the only difference is that the stored trained model is used to test the data and classify it.
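Both pseudocode listings resample the data before training because fraud transactions are far rarer than normal ones. One common way to do this is random undersampling of the majority class, sketched below in plain Python. The helper name, the toy data and the 1:1 target ratio are assumptions for illustration; the paper does not state which ratio or routine it used.

```python
import random


def undersample(rows, label_of, seed=0):
    """Randomly drop majority-class rows until both classes have equal size."""
    fraud = [r for r in rows if label_of(r) == 1]
    normal = [r for r in rows if label_of(r) == 0]
    # The shorter list is the minority class, the longer one the majority.
    minority, majority = sorted((fraud, normal), key=len)
    rng = random.Random(seed)
    balanced = minority + rng.sample(majority, len(minority))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: (amount, label) pairs with 3 fraud rows out of 20.
transactions = [(float(i), 1 if i % 7 == 0 else 0) for i in range(20)]
balanced = undersample(transactions, label_of=lambda row: row[1])
```

Undersampling discards information from the majority class, so evaluation is usually still carried out on data with the original class proportions.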

EVALUATION MEASURE:

The end result is evaluated based on the confusion matrix, from which precision, recall and accuracy are calculated. The confusion matrix contains two classes: actual class and predicted class. It depends on these features:

True Positive: both the actual and predicted values are positive, that is 1.

True Negative: both the actual and predicted values are negative, that is 0.

False Positive: the case where the actual class is 0 and the predicted class is 1.

False Negative: the case where the actual class is 1 and the predicted class is 0.

• Precision is defined as follows:

Precision = true positive / predicted positives

Precision = true positive / (true positive + false positive)

• Recall is defined as follows:

Recall = true positive / actual positives

Recall = true positive / (true positive + false negative)

• Accuracy is defined as:

Accuracy = (true positive + true negative) / total
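The definitions above can be checked on a small example. The sketch below counts the four confusion-matrix cells from lists of actual and predicted labels and then applies the three formulas. The label lists are made up for illustration and are not results from the paper.

```python
def confusion_counts(actual, predicted):
    """Return (TP, TN, FP, FN) for binary labels where 1 = fraud, 0 = normal."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp, tn, fp, fn


def precision(tp, fp):
    """Share of predicted frauds that are really fraud."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of actual frauds that the model caught."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(tp, tn, fp, fn):
    """Share of all transactions that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical labels for ten transactions.
actual = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
tp, tn, fp, fn = confusion_counts(actual, predicted)
```

For these labels the counts are TP = 2, TN = 6, FP = 1 and FN = 1, giving a precision of 2/3, a recall of 2/3 and an accuracy of 0.8.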

RESULT:

Thus the accuracy of our model is increased by using the Artificial Neural Network (ANN) algorithm. In our experiments this algorithm gave higher accuracy than the other machine learning algorithm we evaluated, the Support Vector Machine.

9. OUTPUT

FOR SVM,

FOR ANN,

Total no. of transactions: 284807

No. of genuine transactions: 284315

No. of fraud transactions: 492

Ratio of fraudulent to genuine transactions: 0.00173047500

Accuracy: 99%

Precision score: 0.8115942028985508

Recall score: 0.9992860737567735
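The transaction counts above can be cross-checked with a few lines of arithmetic. The sketch below uses only the three counts reported in the output; the variable names are illustrative. It also computes the accuracy a trivial model would reach by labelling every transaction as genuine, which is a reference point added here and not a figure from the paper.

```python
total = 284807
genuine = 284315
fraud = 492

# The reported ratio corresponds to fraud divided by genuine transactions.
fraud_to_genuine = fraud / genuine  # about 0.001730475

# Share of fraud among all transactions, which is slightly smaller.
fraud_share = fraud / total  # about 0.0017275

# Accuracy of always predicting "genuine" on this data (reference point only).
baseline_accuracy = genuine / total  # about 0.99827
```

Because such a baseline already exceeds 99% accuracy on data this imbalanced, precision and recall are more informative than accuracy alone when judging the fraud class.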

10. CONCLUSION

In this research, we have proposed a method to detect fraud in credit card transactions based on deep learning. We first evaluated a machine learning algorithm, the Support Vector Machine, and then the Artificial Neural Network, which fits well as a model for detecting fraud in credit card transactions. In our model, the artificial neural network (ANN) gives an accuracy above 95% and is best suited for credit card fraud detection. In this research work, data pre-processing, normalization and under-sampling were carried out to overcome the problems posed by an imbalanced dataset.

11. FUTURE ENHANCEMENT

This model can further be improved with the addition of more algorithms into it. However, the output of these algorithms needs to be in the same format as the others. Once the condition is satisfied, the modules are easy to add as done in the code. This provides a great degree of modularity and versatility to the project.
