Application of neural network and PSO-SVM in intrusion detection of network


1Dept. of Electronics and Communication Engineering, Mahaguru Institute of Technology, Kerala

2Asst. Professor, Dept. of Electronics and Communication Engineering, Mahaguru Institute of Technology, Kerala

Abstract – Imbalanced network traffic can often be a gateway for malicious cyber-attacks to penetrate networks and go undetected. In these situations, it is challenging for a Network Intrusion Detection System (NIDS) to find the attacker, since malicious traffic can blend in with a large volume of normal data. An intrusion detection system (IDS) monitors network traffic for suspicious activity and immediately issues notifications when it detects any. The IDS looks for activity that might be a sign of an attack or intrusion by comparing network activity against a set of predetermined rules and patterns. Even the most sophisticated NIDS may have trouble identifying this type of attack because of its high degree of stealth and obfuscation in cyberspace. This paper proposes a new approach to intrusion detection based on deep learning and machine learning, using the NSL-KDD dataset. The proposed approach uses a 1-Dimensional Convolutional Neural Network for feature extraction and an SVM classifier for the attack classification task.

Key Words: Machine learning, Deep learning, Convolutional Neural Network (CNN), Support Vector Machine (SVM), Particle Swarm Optimization (PSO)

1. INTRODUCTION

Cybersecurity faces tremendous risks as a result of the rapid advancement of technologies like 5G, IoT, cloud computing, and others that have increased network scale, real-time traffic, and cyberattack complexity and diversity [1][2]. Security breaches can hide within large volumes of regular traffic. As a result, attacks are easily misclassified, because a machine learning algorithm cannot fully learn the distribution of under-represented categories. Most newly generated cyber-attacks are created by subtly altering already known ones, and are therefore typically treated as regular traffic on the IoT network [3].

To find unusual or hostile activity in the network, a system called a Network Intrusion Detection System (NIDS) is used. An IDS continuously monitors network traffic for harmful activity and network intrusions, and there are numerous ways to identify suspicious activity in network communications. A recent trend in many security applications is to combine deep learning methodologies with cybersecurity because of their excellent performance. For analysis, the system needs a dataset of past traffic data. The most widely used dataset is the publicly accessible NSL-KDD network dataset, which describes network traffic with 41 traffic features. A new deep learning approach for intrusion detection based on the NSL-KDD dataset is presented in this paper. The proposed work builds on both deep learning and machine learning: it applies a Convolutional Neural Network (CNN) for feature extraction, the Support Vector Machine (SVM) algorithm for classification, and Particle Swarm Optimization (PSO) to optimize the SVM. Section 3 describes the system. The experimental results are presented in Section 4, and Section 5 concludes the work.

2. LITERATURE REVIEW

A network intrusion detection system based on Naive Bayes has been suggested in [4]. Across datasets that have been labeled by service, the framework builds network service patterns. The naive Bayes classifier, together with the built patterns, allows the framework to identify attacks in the datasets. Compared with the neural network-based approach, this method has a higher detection rate, requires less time to complete, and is less expensive. However, it produces more false positives than true ones.

When it comes to meeting the demands of contemporary networks, there are questions about the viability and sustainability of current systems. These concerns relate most directly to declining detection accuracy and the rising level of human intervention required. To address them, a deep learning-based NIDS approach was proposed in [5]. This deep learning classification model was developed using stacked non-symmetric deep auto-encoders (NDAEs).

To address the problem of designing model architectures for the network traffic domain, a neural architecture search (NAS) algorithm for network traffic, combined with a surrogate model, has been suggested in [6]. Given a specified optimization target, NAS can automatically search for the model's architecture. The surrogate model was used in the architecture search task to predict how candidate architectures would perform. This approach increases the efficiency of the architecture search and, to a certain extent, mitigates the search algorithm's need for large computing resources and significant time.

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 04 | Apr 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page1287

An effective IDS is introduced in [7] using architectures such as Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), Recurrent Neural Networks (RNN), and Gated Recurrent Units (GRU). The system uses malicious traffic records made up of sequential data over a predetermined time period to construct the IDS. The network activity records are divided into benign and malicious categories. To demonstrate the effectiveness of DL techniques, three separate benchmark datasets, UNSW-NB15, KDDCup'99, and NSL-KDD, have been used. It has been found that DL methods are well suited to the network traffic time-sequence data contained in TCP/IP packet headers.

In contrast to supervised and unsupervised learning, a new approach based on reinforcement learning has been proposed in [8]. This approach combines the observational power of deep learning with the decision-making power of reinforcement learning to enable the effective detection of various cyberattacks on the Industrial IoT. The approach is designed around GBM's feature selection algorithm, which extracts the most important feature set from Industrial Internet of Things data. Following that, the PPO2 algorithm uses the hidden layer of the multilayer perceptron network as the shared network structure for the value network and the strategic network, in combination with the deep learning algorithm. The intrusion detection model is built using the PPO2 algorithm and ReLU. The proposed IDS detects 99 percent of the various network attacks on the Industrial Internet of Things.

3. METHODOLOGY

The proposed intrusion detection methodology uses a CNN for feature extraction and an SVM for classification. A network intrusion detection model based on neural network feature extraction, with particle swarm optimization used to optimize the SVM, was created to address the difficulty of extracting subtle intrusion attributes during the intrusion detection process.

3.1 Data Collection

Data collection is the practice of acquiring and measuring data from various sources. The proposed model uses the NSL-KDD dataset, which was obtained from Kaggle as a CSV file. Figure 1 displays a sample of the NSL-KDD dataset. The dataset has a total of 42 columns: forty-one correspond to the input features and one holds the output label. The 41 features consist of various network parameters, including protocol type, service, flag, source bytes, etc. The label column of the NSL-KDD training set takes 23 distinct values (normal traffic plus the individual attack types). Because redundant records are excluded from the NSL-KDD training set, classifiers will not be biased toward more frequent records.

3.2 Data Preprocessing

Pre-processing prepares the data for the downstream processing tasks. Non-numeric attributes are converted to numeric attributes using label encoding: the symbolic values of the protocol type, service, and flag columns are translated into numerical values. Because the target column has 23 distinct values, they are grouped into 5 classes: DOS, PROBE, U2R, R2L, and NORMAL.
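The two pre-processing steps can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the helper names (`fit_label_encoder`, `preprocess`), the column keys, and the three sample records are hypothetical, and the attack-to-class grouping follows the convention commonly used for the NSL-KDD training labels rather than a table given in this paper.

```python
# Assumed grouping of the 23 NSL-KDD training labels into 5 classes
# (common convention; the paper does not list the mapping explicitly).
ATTACK_CLASS = {
    "normal": "NORMAL",
    # Denial of Service
    "back": "DOS", "land": "DOS", "neptune": "DOS",
    "pod": "DOS", "smurf": "DOS", "teardrop": "DOS",
    # Probing
    "ipsweep": "PROBE", "nmap": "PROBE", "portsweep": "PROBE", "satan": "PROBE",
    # Remote to Local
    "ftp_write": "R2L", "guess_passwd": "R2L", "imap": "R2L", "multihop": "R2L",
    "phf": "R2L", "spy": "R2L", "warezclient": "R2L", "warezmaster": "R2L",
    # User to Root
    "buffer_overflow": "U2R", "loadmodule": "U2R", "perl": "U2R", "rootkit": "U2R",
}


def fit_label_encoder(values):
    """Map each distinct symbolic value to an integer (sorted for determinism)."""
    return {value: index for index, value in enumerate(sorted(set(values)))}


def preprocess(records, symbolic_columns=("protocol_type", "service", "flag")):
    """Return copies of the records with encoded symbolic columns and a 5-class label."""
    encoders = {
        col: fit_label_encoder(r[col] for r in records) for col in symbolic_columns
    }
    processed = []
    for record in records:
        row = dict(record)
        for col in symbolic_columns:
            row[col] = encoders[col][record[col]]
        row["label"] = ATTACK_CLASS[record["label"]]
        processed.append(row)
    return processed, encoders


# Hypothetical records with only the symbolic columns and the label shown.
sample = [
    {"protocol_type": "tcp", "service": "http", "flag": "SF", "label": "normal"},
    {"protocol_type": "tcp", "service": "private", "flag": "S0", "label": "neptune"},
    {"protocol_type": "icmp", "service": "eco_i", "flag": "SF", "label": "ipsweep"},
]
rows, encoders = preprocess(sample)
```

Label encoding assigns arbitrary integers, so the encoded values carry no ordering meaning; the encoders fitted on the training data would have to be reused unchanged on the test data.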

3.3 Feature selection

Feature selection is a technique for reducing the number of input variables to the model by keeping only pertinent data and eliminating noise. It is necessary to identify the essential characteristics among all features before performing feature extraction. The correlation coefficient is employed for this. Correlation is a statistical concept frequently used to describe how close to linear the relationship between two variables is. The Pearson coefficient is the most often applied correlation coefficient; it ranges from -1.0 to +1.0 and represents the degree to which two variables are linearly related. Correlation falls into three main types: positive correlation, negative correlation, and zero correlation.
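As an illustration of correlation-based selection, the sketch below computes the Pearson coefficient r = cov(x, y) / (std(x) · std(y)) and keeps the features whose absolute correlation with the target reaches a threshold. The paper does not state the selection rule or threshold it used, so the `select_features` helper, the 0.5 threshold, and the toy columns are assumptions made only for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: r is undefined, treated as 0 in this sketch
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, threshold=0.1):
    """Keep the names of columns whose |r| with the target reaches the threshold."""
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Toy data: two informative columns and one constant (uninformative) column.
columns = {
    "src_bytes": [10, 20, 30, 40],
    "duration": [5, 5, 5, 5],
    "count": [4, 3, 2, 1],
}
target = [0, 0, 1, 1]
selected = select_features(columns, target, threshold=0.5)
```

Here `src_bytes` is positively correlated with the target, `count` is negatively correlated, and the constant `duration` column shows zero correlation and is dropped.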

3.4 Data Balancing

Figure 1: Sample NSL-KDD dataset

Following pre-processing and feature selection, the dataset is divided into two groups, training and testing, in an 80:20 ratio. To balance the training set, SMOTE (Synthetic Minority Oversampling Technique) is applied, so that each class receives an equal amount of data. SMOTE is an oversampling method in which artificial samples are produced for the minority class. This helps to resolve the overfitting caused by random oversampling. To generate the artificial data, SMOTE uses the k-nearest neighbor technique. In the end, a classification model is trained on the processed training set.
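The split and the core idea of SMOTE can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the implementation used in the experiments: the function names, the value k=3, the seeds, and the toy minority points are all hypothetical. Each synthetic sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority-class neighbours.

```python
import random


def train_test_split(rows, test_ratio=0.2, seed=0):
    """Shuffle the rows and split them into (train, test) with the given ratio."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating towards k-nearest neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` inside the minority class (excluding itself)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=6)
train, test = train_test_split(list(range(100)))
```

Because every synthetic point lies between two real minority samples, the new data stays inside the region already occupied by that class. Oversampling is applied to the training split only, so the test split keeps the original class distribution.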

3.5 Feature extraction

The important features need to be derived from the balanced dataset. Feature extraction is the technique of turning raw data into quantifiable features that can be processed while preserving the information in the original dataset. Convolutional Neural Networks (CNN) are employed in this implementation to extract features. The foundation of a CNN is the convolutional layer. It has a number of filters (or kernels) whose parameters are learned over the course of training. Pooling layers offer a method for down-sampling feature maps by summarizing the presence of features in individual feature maps. In a fully connected layer, the input is multiplied by a weight matrix and a bias vector is then added.
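To make the three layer types concrete, the following sketch runs a single toy record through a 1-D convolution, a ReLU activation, max pooling, and a fully connected layer. It is a hand-written illustration rather than the network used in the paper: the kernel, weights, and input values are made up, and in a real CNN the kernel and weights would be learned during training.

```python
def conv1d(signal, kernel, bias=0.0):
    """Valid 1-D convolution (cross-correlation, as in most deep learning libraries)."""
    width = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(width)) + bias
        for i in range(len(signal) - width + 1)
    ]


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def dense(inputs, weights, biases):
    """Fully connected layer: multiply by the weight matrix, then add the bias vector."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


record = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0]  # toy stand-in for one feature vector
kernel = [-1.0, 0.0, 1.0]                # illustrative filter, not a learned one
feature_map = relu(conv1d(record, kernel))   # [3.0, 1.0, 2.0, 2.0]
pooled = max_pool1d(feature_map, size=2)     # [3.0, 2.0]
features = dense(pooled, weights=[[1.0, 1.0], [0.5, -0.5]], biases=[0.0, 1.0])
```

The output of the last layer, `features`, plays the role of the extracted feature vector that is handed to the classifier.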

3.6 Model development

A network intrusion is an unauthorized infiltration into a computer in an organization or an address in its designated domain. An intrusion may be passive (when access is achieved covertly and unnoticed) or active. Traffic is assigned to one of five categories: DoS, PROBE, U2R, R2L, and NORMAL. NORMAL denotes benign traffic that is not part of any attack. A Denial-of-Service (DoS) attack attempts to shut down a computer system or network so that its intended users are unable to access it. DoS attacks achieve this by flooding the victim with an excessive amount of traffic or information that causes a failure. Probing attacks scan a network or host to gather information about it, such as open ports and running services, typically in preparation for a later attack. In a User-to-Root (U2R) attack, an intruder who already has user-level access to a computer or network exploits a vulnerability to gain root access. In a Remote-to-Local (R2L) attack, an attacker without an account on the target machine sends packets to it over the network in order to gain local access.

The Support Vector Machine technique is used for classification. Here, the Particle Swarm Optimization (PSO) approach is applied to optimize the Support Vector Machine algorithm, and the resulting PSO-SVM is given the features extracted by the CNN for classification. SVM models can classify incoming samples after being trained on labeled data for each category. They offer good speed and performance with a limited number of samples (in the thousands), which is also why the approach is popular for text classification problems. An SVM identifies a hyperplane in N-dimensional space (where N is the number of attributes) that separates the data points of different classes. Hyperplanes serve as decision boundaries: data points are assigned to different classes depending on which side of the hyperplane they fall.
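For a linear SVM, the hyperplane is the set of points x with w · x + b = 0, and the predicted class is the sign of w · x + b. The short sketch below illustrates this decision rule for the two-class case; the weight vector and bias are made-up values rather than trained parameters, and a multi-class problem such as the five-class task here would typically combine several such binary classifiers.

```python
import math


def decision_value(weights, bias, point):
    """Signed score w . x + b; its sign tells which side of the hyperplane x is on."""
    return sum(w * x for w, x in zip(weights, point)) + bias


def classify(weights, bias, point):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(weights, bias, point) >= 0 else -1


def margin_width(weights):
    """Distance between the two margin hyperplanes w . x + b = +1 and -1."""
    return 2.0 / math.sqrt(sum(w * w for w in weights))


weights, bias = [1.0, -1.0], 0.0  # illustrative hyperplane x1 = x2
label_a = classify(weights, bias, [2.0, 1.0])   # point below the line x1 = x2
label_b = classify(weights, bias, [1.0, 3.0])   # point above the line x1 = x2
width = margin_width(weights)
```

Training an SVM amounts to choosing the weights and bias so that this margin is as wide as possible while the training points stay on the correct side.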

Particle swarm optimization (PSO) is a bio-inspired technique that searches for the optimum solution in the problem space in a straightforward way. Here, PSO is employed to optimize the SVM. Its name reflects three components: swarm (a group of particles), particle (the smallest element), and optimization (the search for the best solution). It tunes the classifier and produces better outcomes. Classification is accomplished by evaluating new input with the trained model: the new input is first passed through the CNN to extract features, the trained model then receives these features, and the prediction assigns the input to one of the 5 classes.
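A minimal PSO loop is sketched below to show how a swarm could search for SVM settings. The paper does not say which SVM parameters were optimized or which PSO settings were used, so everything specific here is an assumption: the search is over two hypothetical parameters (log10 C and log10 gamma), the inertia and acceleration coefficients are common textbook values, and `validation_error` is a made-up smooth function standing in for "train an SVM with these parameters and measure its validation error".

```python
import random


def pso(objective, bounds, n_particles=20, n_iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` inside box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal best positions
    best_val = [objective(p) for p in pos]         # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]     # global best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull towards personal best + pull towards global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            value = objective(pos[i])
            if value < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], value
                if value < g_val:
                    g_pos, g_val = pos[i][:], value
    return g_pos, g_val


def validation_error(params):
    """Hypothetical stand-in for the SVM validation error at (log10 C, log10 gamma)."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2  # made-up minimum at (1, -2)


best_params, best_error = pso(validation_error, bounds=[(-3.0, 3.0), (-5.0, 1.0)])
```

In a real PSO-SVM pipeline the objective would retrain and score the classifier for every particle, which is why the number of particles and iterations is usually kept small.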

4. EXPERIMENTAL RESULTS

The experiment uses 16 GB of memory and an AMD R5 4600H processor running at 3 GHz to verify the detection performance of CNN and PSO-SVM. VS Code was used to train the model. The NSL-KDD dataset provides the experimental data for this paper. A total of 125974 samples, comprising 100779 training samples and 25194 test samples, were chosen from the NSL-KDD dataset.

The proposed network intrusion detection task is a 5-class classification problem, with the classes DoS, PROBE, U2R, R2L, and NORMAL. Each network record is assigned to one of the five categories using SVM. The system's performance is compared with the XGBoost algorithm: the proposed system achieves an accuracy of 97%, while XGBoost obtains 95%. The classification report, confusion matrix, and ROC curve of the proposed system and the XGBoost algorithm for intrusion detection are shown in Figures 3 to 8. A confusion matrix, sometimes referred to as an error matrix, is a condensed table used to evaluate how well a classification model performs. Count values describe the number of correct and incorrect predictions for each class.
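How a confusion matrix and the accuracy derived from it are computed can be sketched as follows. The labels below are a small made-up example and do not reproduce the results reported in this paper; the function names are hypothetical. Rows correspond to the true class and columns to the predicted class.

```python
CLASSES = ["DoS", "PROBE", "U2R", "R2L", "NORMAL"]


def confusion_matrix(y_true, y_pred, classes=CLASSES):
    """Count matrix with true classes as rows and predicted classes as columns."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def accuracy(matrix):
    """Share of all samples that lie on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def recall(matrix, i):
    """Share of class-i samples that were predicted as class i."""
    row_total = sum(matrix[i])
    return matrix[i][i] / row_total if row_total else 0.0


# Toy labels for illustration only.
y_true = ["DoS", "DoS", "PROBE", "NORMAL", "NORMAL", "R2L", "U2R", "NORMAL"]
y_pred = ["DoS", "DoS", "PROBE", "NORMAL", "DoS", "NORMAL", "U2R", "NORMAL"]
cm = confusion_matrix(y_true, y_pred)
overall = accuracy(cm)
```

On an imbalanced dataset, per-class values such as recall are worth reading alongside the overall accuracy, because a rare class like U2R or R2L contributes very little to the overall figure.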

Figure 3: XGBoost algorithm classification report
Figure 4: Classification report of the proposed system
Figure 5: Confusion matrix (XGBoost algorithm)
Figure 6: Confusion matrix (proposed algorithm)
Figure 7: Multiclass ROC curve (XGBoost algorithm)
Figure 8: Multiclass ROC curve (proposed algorithm)

5. CONCLUSION

Challenges in secure digital data protection and communication arise from the internet's phenomenal growth and usage. Attackers use a variety of attacks in today's environment to obtain crucial data, and traditional methods cannot handle advanced cyberthreats well. This paper presented a novel intrusion detection system based on a CNN and an SVM classifier. Here, the CNN has been used for feature extraction and the SVM classifier for categorizing traffic into one of five classes: DoS, PROBE, U2R, R2L, and NORMAL. The performance of the proposed system has been compared with the XGBoost algorithm. The proposed deep learning and SVM-based system obtains a higher accuracy (97%) than the XGBoost-based system (95%).

REFERENCES

[1] Tawalbeh L, Muheidat F, Tawalbeh M, Quwaider M. IoT Privacy and Security: Challenges and Solutions. Applied Sciences. 2020; 10(12):4102. https://doi.org/10.3390/app10124102

[2] Chen B, Qiao S, Zhao J, Liu D, Shi X, Lyu M, Chen H, Lu H, Zhai Y. A Security Awareness and Protection System for 5G Smart Healthcare Based on Zero-Trust Architecture. IEEE Internet Things J. 2020 Nov 30;8(13):10248-10263. doi: 10.1109/JIOT.2020.3041042. PMID: 35783535; PMCID: PMC8768994

[3] Abu Al-Haija Q, Zein-Sabatto S. An Efficient Deep-Learning-Based Detection and Classification System for Cyber-Attacks in IoT Communication Networks. Electronics. 2020; 9(12):2152. https://doi.org/10.3390/electronics9122152

[4] Panda, M., & Patra, M. R. (2007). Network intrusion detection using naive bayes. International journal of computer science and network security, 7(12), 258-263.

[5] Shone, N., Ngoc, T. N., Phai, V. D., & Shi, Q. (2018). A deep learning approach to network intrusion detection. IEEE transactions on emerging topics in computational intelligence, 2(1), 41-50.

[6] Lyu R, He M, Zhang Y, Jin L, Wang X. Network Intrusion Detection Based on an Efficient Neural Architecture Search. Symmetry. 2021; 13(8):1453. https://doi.org/10.3390/sym13081453

[7] Meliboev A, Alikhanov J, Kim W. Performance Evaluation of Deep Learning Based Network Intrusion Detection System across Multiple Balanced and Imbalanced Datasets. Electronics. 2022; 11(4):515. https://doi.org/10.3390/electronics11040515

[8] Tharewal, S., Ashfaque, M. W., Banu, S. S., Uma, P., Hassen, S. M., & Shabaz, M. (2022). Intrusion detection system for industrial Internet of Things based on deep reinforcement learning. Wireless Communications and Mobile Computing, 2022, 1-8.

[9] Latif, Shahid, Zeba Idrees, Zhuo Zou, and Jawad Ahmad. (2020) "DRaNN: A Deep Random Neural Network Model for Intrusion Detection in Industrial IoT." 2020 International Conference on UK-China Emerging Technologies (UCET), 1-4. IEEE.

[10] Kasongo, Sydney Mambwe, and Yanxia Sun. (2020) "A deep learning method with wrapper based feature extraction for wireless intrusion detection system." Computers & Security 92: 101752.

[11] Choudhary, Sarika, and Nishtha Kesswani. (2020) "Analysis of KDD-Cup'99, NSL-KDD and UNSW-NB15 Datasets using Deep Learning in IoT." Procedia Computer Science, 167: 1561-1573.

[12] Vinayakumar, R., Mamoun Alazab, K. P. Soman, Prabaharan Poornachandran, Ameer Al-Nemrat, and Sitalakshmi Venkatraman. (2019) "Deep Learning Approach for Intelligent Intrusion Detection System." IEEE Access 7: 41525-41550.

[13] Khraisat, A., Gondal, I., Vamplew, P. et al. Survey of intrusion detection systems: techniques, datasets and challenges. Cybersecur 2, 20 (2019). https://doi.org/10.1186/s42400-019-0038-7

[14] S. Kumar, S. Gupta and S. Arora, "Research Trends in Network-Based Intrusion Detection Systems: A Review," in IEEE Access, vol. 9, pp. 157761-157779, 2021, doi: 10.1109/ACCESS.2021.3129775.

[15] R. Heady, G. Luger, A. Maccabe and M. Servilla, "The architecture of a network level intrusion detection system", 1990.
