International Research Journal of Engineering and Technology (IRJET) e ISSN: 2395 0056
Volume: 09 Issue: 07 | July 2022 www.irjet.net p ISSN: 2395 0072
1,2 Department of Computer Science and Engineering, Prince Shri Venkateshwara Padmavathy Engineering College, Ponmar, Chennai
3 Assistant Professor, Department of Computer Science and Engineering, Prince Shri Venkateshwara Padmavathy Engineering College, Ponmar, Chennai
4 Associate Professor, Department of Computer Science and Engineering, Prince Shri Venkateshwara Padmavathy Engineering College, Ponmar, Chennai
Abstract - Due to the rise and rapid growth of e-commerce, the use of credit cards for online purchases has increased dramatically, and this has caused an explosion in credit card fraud. In our project we focus on determining whether a transaction is fraudulent or genuine. For that we use a feature based only on time, i.e. the transaction data that occurred within a particular period of time. The extracted feature is given as input to an Artificial Neural Network (Deep Learning) and a Support Vector Machine (Supervised Learning); the accuracy of each is collected individually and the two are compared. In the comparison, ANN provides the better accuracy of above 95% while SVM provides only 93%. Finally, using a confusion matrix, the total number of transactions, the number of genuine transactions and the number of fraud transactions are displayed as output based on the data given as input.
Key Words: Feature Extraction, Fraud Detection, Online Payment, Credit Card, Deep Learning, Artificial Neural Network (ANN), Machine Learning, Support Vector Machine (SVM).
Fraud in a credit card transaction occurs when a thief uses another person's card without that person's authorization, by stealing necessary information such as the PIN, password and other credentials, with or without the physical card. Using a fraud detection module involving machine learning and deep learning, we can determine whether an upcoming transaction is fraudulent or legitimate.
Machine learning is a trending and widely used technology because of its many applications, lower time consumption and more accurate results. Machine learning is a technology that deals with algorithms which give the computer the capability to learn and improve through experience without being explicitly programmed. Machine learning has applications in multiple fields, for example medical diagnosis, regression etc.
Machine learning involves a combination of algorithms and statistical models which allow the computer to perform a task without hard coding. A model is built from training data, and the trained model is then tested.
Deep learning is a part of machine learning techniques that makes use of neural networks. Some of the methods that come under deep learning are artificial neural networks, convolutional neural networks, autoencoders, recurrent neural networks, restricted Boltzmann machines etc.
Deep learning makes use of neural networks, which resemble the human brain in processing data and making decisions. Here we used both deep learning and machine learning techniques, and the deep learning algorithm outperformed on accuracy. For the implementation we used the Python programming language since it is simple and easy to read, learn and write. We also used some Python libraries and packages, namely NUMPY, Pandas, SCIKIT LEARN, KERAS, MySQL and TKINTER, for data analysis, data manipulation, linear algebraic operations, storage and the graphical user interface (GUI).
There are different kinds of fraud seen on e-commerce sites. Offline theft and robbery occur near ATMs, while online theft can occur over the internet and mobile phones.
1 Application fraud: The customer's credentials are stolen by the fraudster, who then creates a fake account through which the transactions take place.
2 Electronic or manual card imprints: The fraudster skims the information present on the card and uses the credentials to make a fraudulent transaction.
3 Card not present: In this type, the actual physical card is not present during the transaction.
4 Counterfeit card fraud: All the data from the magnetic strip is copied by the fraudster onto a fake card that looks like the original card, and this card can then be used for fraud.
5 Lost/stolen card: This happens when the cardholder loses the card or the card is stolen from the cardholder.
6 Card id theft: This happens when the id of the cardholder is stolen and fraud takes place.
7 Mail non received card fraud: When a credit card is issued, there is a process of sending mail to the recipient; fraud can occur here by intercepting the mail or by phishing.
8 Account takeover: The fraudster takes complete control of the cardholder's account and commits fraud.
9 Fake fraud in website: The fraudster introduces malicious code that does its work in the website.
10 Merchant collusion: The details of the cardholder are shared by merchants with a third party or the fraudster without the cardholder's authorization.
Nowadays, most people use credit cards to buy goods that they need but cannot afford at the moment. As credit cards are used to meet these needs, the fraud associated with them is also increasing. So, there is a need to create and implement a model that fits well and predicts with higher accuracy.
The main objective of the research is to find fraudulent transactions among credit card transactions.
In the proposed system we use an Artificial Neural Network to find fraud in credit card transactions. Performance is measured and accuracy is calculated based on the predictions. A classification algorithm, the Support Vector Machine, is also used to build a credit card fraud detection model. We compared both algorithms and concluded that the artificial neural network predicts better than the support vector machine. The model gives the outcome of a transaction as either 0 or 1.
ADVANTAGES:
• Provides high accuracy.
• Corrects the problem of overfitting.
• Added layer of safety.
A comparison between supervised learning and deep learning showed that the deep learning algorithm outperformed on accuracy.
Existing systems are built on machine learning algorithms such as Support Vector Machine, Naïve Bayes, k-Nearest Neighbor and so on, and some of them used random datasets. Very few have used an artificial neural network for credit card fraud detection.
• It produces a lot of tables with a relatively small number of columns.
• Data restrictions.
• Requires huge processing time.
Fig 1
This section explains the implementation of the algorithms used for the proposed system. In this paper, the implementation starts from the collection of data (Data Collection). Then data pre-processing is carried out, which includes data cleaning (filling any missing values in the transactions by using mean, median and standard deviation techniques) and normalizing the data. The dataset is split into two sets, train data and test data, and the model is trained and tested to measure the accuracy. Finally, the system predicts whether a transaction is fraud or non-fraud.
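The pre-processing steps described above can be sketched in a few lines. This is only a minimal illustration that uses the Python standard library so that it runs on its own; the sample amounts, the helper names (`fill_missing_with_mean`, `standardize`, `split_train_test`) and the 25% test fraction are assumptions and are not taken from the paper. In practice, scikit-learn's `StandardScaler` and `train_test_split` cover the scaling and splitting steps.

```python
import random
import statistics


def fill_missing_with_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in column]


def standardize(column):
    """Scale a column to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        return [0.0 for _ in column]
    return [(v - mean) / std for v in column]


def split_train_test(rows, labels, test_fraction=0.25, seed=0):
    """Shuffle the row indices and split them into train and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(rows) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [rows[i] for i in train_idx]
    x_test = [rows[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Hypothetical transaction amounts; None marks a missing value.
amounts = [10.0, None, 30.0, 20.0, None, 40.0, 50.0, 30.0]
labels = [0, 0, 0, 0, 1, 0, 1, 0]

cleaned = fill_missing_with_mean(amounts)   # missing values become the mean
scaled = standardize(cleaned)               # zero mean, unit variance
x_train, x_test, y_train, y_test = split_train_test(scaled, labels)
print(cleaned)
print(len(x_train), len(x_test))
```

The scaling statistics are computed on the whole column here for brevity; computing them on the training part only and reusing them on the test part would avoid leaking test information into training.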
KERAS is mainly used for implementing deep learning algorithms such as CNN and RNN because of its user-friendliness, modularity and easy extensibility. It runs on both CPU and GPU. In the experiment of finding fraud or non-fraud credit card transactions we used KERAS with a TensorFlow backend. KERAS with the TensorFlow backend is an excellent choice for training neural network architectures.
Fig 2
• Feature extraction is the method of reducing the input variables to our model by using only relevant data and getting rid of noise in the data.
• Here we use features based only on time, i.e. the transactional data that occurred within a particular period of time.
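The paper does not spell out the exact time-based feature, so the following is only one possible reading: for each transaction, count how many earlier transactions fall inside a fixed preceding time window. The function name `count_in_preceding_window`, the 60 second window and the sample timestamps are all assumptions made for illustration.

```python
from bisect import bisect_left


def count_in_preceding_window(timestamps, window_seconds):
    """For each transaction, count earlier transactions within the window.

    `timestamps` must be sorted in ascending order (seconds).
    """
    counts = []
    for i, t in enumerate(timestamps):
        # Index of the first earlier transaction at or after (t - window_seconds).
        start = bisect_left(timestamps, t - window_seconds, 0, i)
        counts.append(i - start)
    return counts


# Hypothetical transaction times in seconds for one card.
times = [0, 30, 45, 50, 400, 410]
window_counts = count_in_preceding_window(times, window_seconds=60)
print(window_counts)
```

A burst of transactions in a short period produces high counts, which is the kind of signal a time-only feature can expose to the classifier.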
Anaconda Navigator is a desktop graphical user interface (GUI) included in the Anaconda® distribution that allows you to launch applications and easily manage CONDA packages, environments and channels without using command line commands. Navigator can search for packages on Anaconda Cloud or in a local Anaconda Repository.
Some of the Python libraries and packages used in the proposed system are as follows:
NUMPY is a Python library. NUMPY stands for Numerical Python. The NUMPY package is used for multidimensional arrays and linear algebraic operations.
Pandas is a Python library used as a data analysis and data manipulation tool. It is used to read and load the dataset. It is fast and flexible when working with data.
SCIKIT LEARN is a Python package which is suitable for statistical models and machine learning models. It is a well suited Python package for machine learning modeling.
KERAS is a high-level neural network application programming interface (API). It is able to run on top of TensorFlow.
MySQL is a database which is used for storage. In the fraud identification experiment we used MySQL to store the user details, namely user name, password, email id and phone number. On entering the application, the user needs to register by providing these credentials, which are stored in the database. Thereafter, the user needs to log in by giving the username and password. The application validates the login against the registered information, and then the user is moved to the next window.
TKINTER is a Python library used for the graphical user interface (GUI). It can be used on both Unix and Windows platforms. A GUI is created by importing the TKINTER module, creating the main window, adding one or more widgets and, finally, calling the main event loop.
We have used Python as the programming language. Python is a beginner-friendly language with a wide range of applications. In recent years, Python has set a new trend because it is an easy to use, interpreted, object oriented, high level scripting language. It provides rich packages and libraries that are used in machine learning.
iv) CLASSIFICATION TECHNIQUES:
In an artificial neural network, the interconnected units, also referred to as nodes or neurons, are simple processors which operate in parallel and communicate with one another.
Fig 3
Pseudocode:
• Importing the necessary packages
Example: import pandas as pd
• def SVM
Step 1: Start
Step 2: Reading the dataset. pd.read_csv(file name) # reads the dataset file
Step 3: Data cleaning and data pre-processing
• Resampling the data into normal and fraud classes, i.e. normal = 0 and fraud = 1
• Undersampling of the data is done
• Data is scaled (any null values are eliminated) and normalized.
• Dataset is split into two sets, train data and test data, using split()
Step 4: Training the data using the SVM algorithm
• The SVM classifier is called as classifier.predict() # predicts whether the transaction is fraud or not
Step 5: Calculating the fraud transactions and valid transactions, then calculating the recall, precision and accuracy
Step 6: Stop
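Step 4 of the pseudocode can be illustrated with a small self-contained sketch. The paper does not state which kernel or solver was used, so this example assumes a linear soft-margin SVM trained by stochastic sub-gradient descent on the hinge loss; the function names, hyperparameters and the eight sample points are hypothetical. With scikit-learn, a classifier such as `sklearn.svm.SVC` would normally take the place of the hand-written training loop.

```python
def train_linear_svm(features, labels, reg=0.01, lr=0.1, epochs=200):
    """Fit a linear soft-margin SVM with stochastic sub-gradient descent.

    Labels are 0 (normal) or 1 (fraud); internally they become -1 / +1.
    """
    weights = [0.0] * len(features[0])
    bias = 0.0
    signs = [1.0 if y == 1 else -1.0 for y in labels]
    for _ in range(epochs):
        for x, s in zip(features, signs):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if s * score < 1.0:
                # Inside the margin: hinge-loss gradient plus regularisation.
                weights = [w - lr * (reg * w - s * xi)
                           for w, xi in zip(weights, x)]
                bias += lr * s
            else:
                # Outside the margin: only the regularisation term shrinks w.
                weights = [w - lr * reg * w for w in weights]
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (fraud) if the point lies on the positive side, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else 0


# Hypothetical, already-scaled two-feature transactions.
features = [[-2.0, -1.0], [-1.0, -2.0], [-1.5, -1.5], [-2.0, -2.0],
            [2.0, 1.0], [1.0, 2.0], [1.5, 1.5], [2.0, 2.0]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = train_linear_svm(features, labels)
predictions = [predict(weights, bias, x) for x in features]
print(predictions)
```

The toy data is linearly separable, so the learned hyperplane classifies every training point correctly; real transaction data would need the scaling and undersampling from Step 3 first.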
Artificial Neural Network (ANN) is an efficient computing system whose central theme is borrowed from the analogy of biological neural networks. ANNs are also called "artificial neural systems", "parallel distributed processing systems" or "connectionist systems". An ANN comprises a large collection of units that are interconnected in some pattern to allow communication between the units.
Pseudocode:
Fig 4
This algorithm has two parts, namely the training part and the testing part.
Training part:
def ANN:
Step 1: Start
Step 2: Loading and observing the dataset
• pd.read_csv(.csv) # reads the dataset
• Resampling of data
• StandardScaler() # scaling and normalization of data
Step 3: Data pre-processing
• train_test_split() # splitting of data
Step 4: Training the model
• Dense() # adds a fully connected layer with an activation function
Step 5: Analyzing the model
• Prediction of fraud is made and the trained model is stored. It can be used for testing (training the model takes longer, so it is stored).
Step 6: Stop
Testing part:
def ANN
It is carried out in a similar way; the only difference is that the stored trained model is used to test the data and classify it.
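To make the role of the dense layers and of the stored model more concrete, the sketch below computes what a fully connected layer does, activation(W x + b), and shows a model being stored and reloaded before it is used for testing. The weights are hand-picked placeholders rather than trained values, the layer sizes are arbitrary, and JSON is only an assumed storage format; the paper does not say how the trained model was saved.

```python
import json
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b) for each output unit."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(model, features):
    """Hidden ReLU layer followed by a single sigmoid output unit."""
    hidden = dense(features, model["w1"], model["b1"], relu)
    output = dense(hidden, model["w2"], model["b2"], sigmoid)
    return output[0]


# Hypothetical, hand-picked weights standing in for a trained model.
model = {
    "w1": [[1.0, -1.0], [-1.0, 1.0]],
    "b1": [0.0, 0.0],
    "w2": [[2.0, 2.0]],
    "b2": [-1.0],
}

# Training part: the model is stored once training is finished.
stored = json.dumps(model)

# Testing part: the stored model is reloaded and used to classify.
restored = json.loads(stored)
sample = [1.5, 0.5]
probability = forward(restored, sample)
label = 1 if probability >= 0.5 else 0   # 1 = fraud, 0 = genuine
print(label)
```

Thresholding the sigmoid output at 0.5 gives the 0 or 1 outcome mentioned for the proposed system; the threshold itself is an assumption here.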
The end result is evaluated based on the confusion matrix, and precision, recall and accuracy are calculated. The confusion matrix contains two classes: actual class and predicted class. It depends on these features:
True Positive: both the actual class and the predicted class are positive, that is 1.
True Negative: both the actual class and the predicted class are negative, that is 0.
False Positive: this is the case where the actual class is 0 and the predicted class is 1.
False Negative: this is the case where the actual class is 1 and the predicted class is 0.
• Precision is defined as follows:
Precision = true positive / predicted positives
Precision = true positive / (true positive + false positive)
• Recall is defined as follows:
Recall = true positive / actual positives
Recall = true positive / (true positive + false negative)
• Accuracy is defined as:
Accuracy = (true positive + true negative) / total
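The definitions above translate directly into code. The following sketch counts the four confusion matrix cells from two label lists and derives the three metrics; the ten sample labels are made up for illustration, and the zero-division guards are a design choice of this sketch.

```python
def confusion_counts(actual, predicted):
    """Return (tp, tn, fp, fn) for binary labels where 1 means fraud."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


def precision_recall_accuracy(tp, tn, fp, fn):
    """Compute the three metrics; empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, accuracy


# Hypothetical labels: three fraud cases among ten transactions.
actual = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
precision, recall, accuracy = precision_recall_accuracy(tp, tn, fp, fn)
print(tp, tn, fp, fn)
print(precision, recall, accuracy)
```

In this example one fraud case is missed and one genuine transaction is flagged, so precision and recall are both 2/3 while accuracy is 0.8.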
RESULT:
Thus the accuracy of our model is increased by using the Artificial Neural Network (ANN) algorithm. In our experiments this algorithm gave higher accuracy than the Support Vector Machine.
FOR SVM,
FOR ANN,
Total no. of transactions: 284807
No. of genuine transactions: 284315
No. of fraud transactions: 492
Ratio of fraudulent to genuine transactions: 0.00173047500
Accuracy: 99%
Precision score: 0.8115942028985508
Recall score: 0.9992860737567735
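The reported counts can be checked with a few lines of arithmetic. The reported ratio 0.00173047500 corresponds to 492 / 284315, that is, fraud transactions relative to genuine ones. The same counts also show why accuracy alone says little on such an imbalanced dataset: a hypothetical classifier that labels every transaction as genuine (introduced here only as a baseline, it is not part of the paper) would already be right about 99.83% of the time.

```python
total = 284807
genuine = 284315
fraud = 492

# The two classes account for every transaction.
assert genuine + fraud == total

fraud_to_genuine = fraud / genuine    # the ratio reported in the results
fraud_share = fraud / total           # fraction of all transactions that are fraud
baseline_accuracy = genuine / total   # accuracy of "always predict genuine"

print(f"{fraud_to_genuine:.11f}")
print(f"{fraud_share:.6f}")
print(f"{baseline_accuracy:.4%}")
```

This is why precision and recall are reported alongside accuracy in the results.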
In this research, we have proposed a method to detect fraud in credit card transactions based on deep learning. We first compared a machine learning algorithm, the Support Vector Machine, with the Artificial Neural Network, and found that the ANN fits the task of detecting fraud in credit card transactions well. In our model, the artificial neural network (ANN) gives an accuracy above 95% and is best suited for credit card fraud detection. In this research work, data pre-processing, normalization and under-sampling were carried out to overcome the problems faced when using an imbalanced dataset.
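Under-sampling can be sketched as follows. The paper does not state the class ratio used after resampling, so this example assumes a 1:1 balance obtained by randomly dropping majority-class rows; the function name `undersample`, the seed and the toy data are hypothetical.

```python
import random


def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal size."""
    rng = random.Random(seed)
    fraud_idx = [i for i, y in enumerate(labels) if y == 1]
    normal_idx = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = sorted([fraud_idx, normal_idx], key=len)
    # Keep every minority row and an equally sized random majority sample.
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Hypothetical imbalanced data: 95 normal rows and 5 fraud rows.
labels = [0] * 95 + [1] * 5
rows = [[float(i)] for i in range(100)]

balanced_rows, balanced_labels = undersample(rows, labels)
print(len(balanced_labels), sum(balanced_labels))
```

All fraud rows are kept and only normal rows are discarded, which trades a smaller training set for balanced classes.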
This model can be further improved by adding more algorithms to it. However, the output of these algorithms needs to be in the same format as the others. Once that condition is satisfied, the modules are easy to add, as done in the code. This provides a great degree of modularity and versatility to the project.