A study on techniques to detect and classify acute lymphoblastic leukemia using deep learning.


1Associate Professor, Dept. of Computer Science and Engineering, VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India

2,3,4,5 Students, Dept. of Computer Science and Engineering, VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India

Abstract - Cancer starts when cells in the body begin to grow out of control. Leukemia is a cancer that affects the body's blood-forming tissues, including the bone marrow and the lymphatic system. It typically involves white blood cells (WBCs): when an individual has leukemia, their bone marrow produces an excessive number of dysfunctional WBCs. Acute Lymphoblastic Leukemia (ALL) predominantly affects children, but it can also develop in adults. Although ALL is a widely occurring cancer, its accurate diagnosis requires costly, invasive, and time-intensive tests. Peripheral Blood Smear (PBS) images play a crucial role in the initial screening of cancer cases versus non-cancer cases. Our project aims to automate the detection of ALL using PBS images and to provide a channel between patients and doctors for consultation regarding the diagnosis process.

Key Words: Acute Lymphoblastic Leukemia, Peripheral Blood Smear, Lymphatic System, CNN (Convolutional Neural Network), CBC (Complete Blood Count), DLC (Differential Leukocyte Count)

1. INTRODUCTION

Over 900,000 people are diagnosed each year with leukemia, sometimes known as blood cancer; however, many people are unaware of the risks associated with such frequently fatal illnesses. Most blood cancers are uncommon, life-threatening diseases that affect only small patient populations. Because leukemia is rare, patients may feel abandoned and find it challenging to locate the support and information they require.

If treatment for acute leukemia is not started promptly, the patient may die from the condition within a few months. Any cancer must be detected early so that treatment can begin promptly and survival rates improve. Patients cannot afford delays because they require quick attention. We need systems that apply the most recent technological advancements to analyze blood samples quickly and accurately. Early detection of Acute Lymphoblastic Leukemia (ALL) symptoms can considerably improve an individual's chances of survival.

2. RELATED WORK

In [1], Naina Sharma, Ankit Mukopadhyay, Aman Shrivastava, and Aman Garg proposed a system to help identify leukemia from images of blood cells. The proposed system is intended to be more accurate than present physicians. It is motivated by the fact that manual scrutiny of blood or bone marrow images is unreliable for detection and very time-consuming. The authors propose segmenting stained peripheral blood smears using color-based clustering to divide a cell into nucleus and cytoplasm. An SVM is then used to differentiate the types of WBC, and a CNN is used whose last layer is fed to a Random Forest algorithm to categorize the WBCs.
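The color-based clustering step described for [1] can be illustrated with plain k-means on RGB pixel values. This is only a minimal sketch under our own assumptions: the function names, the deterministic initialization, and the sample pixel colors are hypothetical and are not taken from [1].

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_colors(pixels, k, iterations=20):
    """Cluster RGB pixels into k groups with plain k-means (Lloyd's algorithm)."""
    distinct = sorted(set(pixels), key=lambda p: (sum(p), p))
    if len(distinct) < k:
        raise ValueError("need at least k distinct colors")
    # Assumption: start from centers spread evenly over the distinct colors,
    # ordered from dark to bright, so that the result is deterministic.
    step = (len(distinct) - 1) / max(k - 1, 1)
    centers = [distinct[int(i * step)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest center.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in pixels
        ]
        # Update step: each center moves to the mean color of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return labels, centers


# Hypothetical pixels: dark nucleus, paler cytoplasm, bright background.
pixels = [(60, 20, 90), (65, 25, 95), (180, 150, 200),
          (175, 155, 205), (240, 235, 230), (245, 240, 235)]
labels, centers = kmeans_colors(pixels, k=3)
print(labels)
```

With three clusters, the darkest cluster would play the role of the nucleus and the intermediate one the cytoplasm; a real smear would of course need color-space and stain handling that this sketch omits.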

In [2], C. T. Tchapga, M. Thomas Attia, A. T. Kouanou, T. F. Fonzin, P. K. Fogang, M. Brice Anicet, and T. Daniel showed how ML algorithms can be used with big data on the Apache Spark framework and how to classify biomedical images using machine learning algorithms. Apache Spark overcomes limitations of the Hadoop framework by being faster (by 100 times in memory), keeping data available for iterations, queries, and loading, and supporting SQL queries over MapReduce. The authors also suggest a two-step process: the first step builds a model from labelled images, and the second step classifies unlabeled images.

In [3], A. Genovese, M. S. Hosseini, V. Piuri, K. N. Plataniotis, and F. Scotti propose a Computer Aided Diagnosis (CAD) system to detect ALL based on adaptive un-sharpening of peripheral blood smear images. Adaptive un-sharpening is a process that increases the focus of an image up to a certain threshold. The system first performs adaptive un-sharpening and then performs the classification. After experimenting with 260 images of WBCs, the authors reported a detection accuracy of 96.84%. The only drawback is that the model performs detection only, not classification.
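To make the idea of un-sharpening concrete, the sketch below applies classic unsharp masking to a small grayscale grid: the image is blurred, and the difference between the original and the blurred copy is added back. It is an assumption-laden illustration, not the adaptive method of [3]; the 3x3 box blur, the fixed `amount` parameter, and the function names are our own choices, and the adaptive selection of the sharpening strength is not reproduced.

```python
def box_blur(img):
    """3x3 mean blur of a 2D list of gray values, replicating edge pixels."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def unsharp_mask(img, amount=1.0):
    """Sharpen by adding amount * (original - blurred), clipped to [0, 255]."""
    blurred = box_blur(img)
    return [
        [min(255.0, max(0.0, p + amount * (p - b))) for p, b in zip(row, brow)]
        for row, brow in zip(img, blurred)
    ]


# A single-row image with one soft edge between dark and bright pixels.
edge = [[50, 50, 200, 200]]
print(unsharp_mask(edge))
```

Flat regions are left unchanged, while the contrast across the edge is exaggerated, which is the effect a sharpening preprocessing step relies on.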

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 05 | May 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 901

In [4], N. Mahmood, S. Shahid, T. Bakhshi, S. Riaz, H. Ghufran, and M. Yaqoob examined the significance of clinical data and phenotypic data, i.e., environmental conditions, for detecting Acute Lymphoblastic Leukemia. They used different models such as classification and regression trees (CART), Random Forest, gradient boosting, and the C5.0 decision tree algorithm. They applied ten-fold cross-validation to compare all algorithms and found that CART achieves high accuracy, i.e., 80.6%. The CART model shows the significance of each feature for identifying ALL. Some of the features the authors considered are age, gender, WBC count, platelet count, hemoglobin level, financial status, and drinking water. The authors reported the importance of each variable as a percentage, i.e., platelets 43%, hemoglobin 24%, and white blood cells 45%. The main advantage is that there is no need to process images to determine whether a person has ALL, and the data are easy to collect. However, using this data, the authors cannot classify ALL stages.
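The ten-fold cross-validation used in [4] to compare algorithms can be sketched as follows. This is a generic illustration and not the authors' code: `fit` and `predict` are hypothetical callables standing in for any of the compared models, and the samples are split in their given order, so real data would normally be shuffled first.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_val_accuracy(fit, predict, X, y, k=10):
    """Mean test accuracy over k folds (assumes len(X) >= k)."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Toy stand-in model: a fixed threshold rule that ignores the training data.
X = list(range(20))
y = [x > 5 for x in X]
print(cross_val_accuracy(lambda X, y: None, lambda model, x: x > 5, X, y))
```

Running the same routine once per candidate model and comparing the returned means is the kind of comparison the paragraph above describes.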

In [5], Sahana K Adyanthaya discussed different techniques used in text recognition and the stages involved, such as pre-processing, text segmentation, feature extraction, and post-processing. One finding is that Tesseract OCR performs well in extracting text. Moreover, when images are divided into segments, the classifier recognizes the text in each segment, which improves text recognition by about 20%. The author also discussed various tasks in text recognition, i.e., noise removal using a Gaussian filter or mean filter; segmentation using Linear Discriminant Analysis (LDA), Independent Component Analysis (ICA), or Chain Code (CC); and classification using distinct classifiers based on techniques such as ANN or SVM.

In [6], Vasundhara Acharya and Preetham Kumar developed a system to segment blood smear images accurately. The paper focuses primarily on image segmentation, as most extracted features depend directly on segmentation. Cytoplasm can be extracted from the image using K-medoids or K-means; K-medoids shows better performance, so it is used to extract the cytoplasm. The nucleus is extracted, producing a binary image of the cytoplasm and binary images of the white blood cells. Separating the surroundings and the RBCs from the WBCs gives precise results. However, the main disadvantages are that border cells located at the extreme corners are not handled accurately and that sub-classification is not performed.
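The difference between K-medoids and K-means mentioned for [6] is that a medoid is always an actual data point, which makes it less sensitive to outliers than a mean. The sketch below is a simple alternating K-medoids on one-dimensional intensities; the initialization, the function name, and the sample values are hypothetical and are not taken from [6].

```python
def k_medoids(points, k, iterations=20):
    """Alternating k-medoids on 1D values; returns the list of medoids."""
    distinct = sorted(set(points))
    if len(distinct) < k:
        raise ValueError("need at least k distinct values")
    # Assumption: start from medoids spread evenly over the sorted values.
    step = (len(distinct) - 1) / max(k - 1, 1)
    medoids = [distinct[int(i * step)] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest medoid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: abs(p - medoids[c]))
            clusters[nearest].append(p)
        # Update step: the new medoid is the member with the smallest
        # total distance to the other members of its cluster.
        new_medoids = []
        for c, members in enumerate(clusters):
            if not members:
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(
                min(members, key=lambda m: sum(abs(m - q) for q in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids


# Hypothetical intensities: a dark group, a bright group, and one outlier (100).
print(k_medoids([10, 11, 12, 13, 100, 200, 201, 202], k=2))
```

In this toy input the outlier joins the dark group, whose mean would be pulled to 29.2, whereas the medoid stays at a value that actually occurs in the group.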

In [7], Dr. Leena Patil and Miss. Anagha M. Pawar examined the best methods to detect ALL. As part of the experiment, various CNN-based models were evaluated, including the region-based convolutional neural networks R-CNN and Fast R-CNN, and YOLO. The reported outcomes were each model's mean Average Precision and frames per second. R-CNN achieved 62.4% and 0.5, Fast R-CNN recorded 7-% and 0.5, and YOLO showed 63.4% and 45, respectively. YOLO offers a rapid processing paradigm for real-time streaming analysis and image classification. Fast R-CNN, however, has a very high rate of accuracy.

In [8], A. Rehman, N. Abbas, T. Saba, Syed Ijaz ur Rahman, Z. Mehmood, and H. Kolivand suggested a deep learning method for classifying acute lymphoblastic leukemia. To obtain accurate classification results, the model is trained on bone marrow images using CNNs and robust segmentation techniques. The experimental results were then compared with those of Naive Bayesian, K-NN, and Support Vector Machine classifiers. The suggested method achieved an accuracy of 97.79%. The findings allow pathologists to use the suggested technique to identify acute lymphoblastic leukemia and its subgroups.

In [9], Sarmad Shafique and Samabia Tehsin proposed using pretrained deep CNNs to detect and categorize leukemia. They constructed a deep CNN for the automatic detection of ALL and the categorization of its variants into four categories: L1, L2, L3, and Normal. To mitigate overfitting, a data augmentation method was employed. To assess efficacy across different images, they also analyzed datasets with distinct color models. They demonstrated 100% sensitivity, 98.11% specificity, and 99.50% accuracy in the diagnosis of ALL. The categorization of ALL subtypes had sensitivity, specificity, and accuracy scores of 96.74%, 99.03%, and 96.06%, respectively.
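For readers less familiar with the three measures reported in [9], the sketch below computes sensitivity, specificity, and accuracy from the four confusion-matrix counts of a binary classifier. The counts in the usage line are hypothetical and are not the results of [9].

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)   # share of diseased cases that are detected
    specificity = tn / (tn + fp)   # share of healthy cases that are cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 100 ALL images and 100 normal images.
print(diagnostic_metrics(tp=90, fp=20, tn=80, fn=10))
```

A sensitivity of 100%, as reported for ALL diagnosis, corresponds to `fn == 0`, i.e., no leukemic sample being missed.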

In [10], L. Pan, G. Liu, F. Lin, S. Zhong, H. Xia, X. Sun, and H. Liang suggested a strategy to predict relapse in acute lymphoblastic leukemia. On a training set of 336 children newly diagnosed with ALL, clinical variables were ranked using Monte Carlo cross-validation nested within 10-fold cross-validation. A forward feature selection method was used to determine the minimal list of distinguishing variables. To assess the model's performance on new patients, an additional set of eighty-four patients, who were not part of the initial training or testing, was included for evaluation. The 14-feature Random Forest model performs well across all risk-level categories, with the standard-risk group showing the best accuracy at 0.829.
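Forward feature selection, as mentioned for [10], grows the feature set greedily: at each step it adds the single feature that improves a score the most and stops when no candidate helps. The sketch below is a generic version under our own assumptions; in [10] the score would be a cross-validated model performance, whereas here `score` is a hypothetical stand-in and the feature names and gains are invented for illustration.

```python
def forward_selection(features, score, max_features=None):
    """Greedy forward selection; returns (selected_features, best_score)."""
    selected = []
    remaining = list(features)
    best_score = float("-inf")
    while remaining and (max_features is None or len(selected) < max_features):
        # Score every candidate extension of the current subset.
        scored = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Hypothetical scoring rule: each feature has a fixed gain, and every
# feature added costs 5 points, which mimics a penalty for complexity.
gains = {"wbc_count": 30, "age": 20, "noise": 0}

def toy_score(subset):
    return sum(gains[f] for f in subset) - 5 * len(subset)

print(forward_selection(list(gains), toy_score))
```

The uninformative feature is never selected because adding it lowers the score, which is how the method arrives at a short list of distinguishing variables.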

In [11], J. Wang, N. Y. Min Shen, X. Zhang, Y. Wang, Y. Liu, Z. Geng, and C. Yuan presented a mobile app to assist carers of children with ALL. Mobile apps, which are typically practical and user-friendly, are essential m-health tools. ALL is the most common cause of cancer-related death in children and accounts for 26.84% of all pediatric malignancies. Among pediatric cancer types, it most frequently affects patients under 15. The frequency of ALL in young children is highest between the ages of 2 and 4. The curability of ALL has shown significant progress in the last few decades, with a rise of over 75% due to advancements in diagnosis and treatment. A child's cancer diagnosis still has an impact on the parents and their family. The app's eight components, namely Personal Information, Treatment Tracking, Family Care, Economic and Social Assistance, Information Center, Self-assessment Questionnaires, Interactive Platforms, and Reminders, were created to assist caretakers. Medical professionals and parents have run pilot tests.

In [12], Dr. J. Sasi Kiran, N. V. Kumar, N. S. Prabha, and M. Kavya suggested an architecture for a general character recognition system (CRS). The system begins with preprocessing steps such as noise removal, thresholding, and skeletonization, which are primarily used to separate the foreground (ink) from the background (paper). After preprocessing, line, word, and character segmentation follow, using techniques such as projection analysis, whitespace-and-pitch, and connected component labeling (CCL). After segmentation, the architecture applies normalization, which converts the arbitrarily sized image to a standard size. The final portion of the architecture comprises feature extraction, classification (classifiers can include the Bayesian classifier, KNN classifier, RB-function, and SVM), and postprocessing (which involves grouping of symbols).
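Projection analysis, one of the segmentation techniques listed for [12], can be sketched for text lines as follows: the ink pixels of a binarized page are summed per row, and consecutive rows that contain ink form one line. This is a minimal sketch under our own assumptions (ink encoded as 1, background as 0, and a hypothetical function name), not the implementation of [12].

```python
def segment_lines(binary_image):
    """Return (first_row, last_row) for each run of rows that contain ink."""
    # Horizontal projection profile: number of ink pixels in each row.
    profile = [sum(row) for row in binary_image]
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = y                      # a text line begins
        elif ink == 0 and start is not None:
            lines.append((start, y - 1))   # a blank row closes the line
            start = None
    if start is not None:
        lines.append((start, len(profile) - 1))
    return lines


# Tiny binarized page: two text lines separated by one blank row.
page = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
]
print(segment_lines(page))
```

Applying the same idea to column sums inside each detected line would separate words and characters by their whitespace gaps.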

In [13], Chaitali Raje and Jyoti Rangole devised a way to identify leukemia in microscopic images using image processing. The technique entails cell identification, image acquisition, preprocessing, image segmentation, and feature extraction. The first technique, nucleus segmentation with LabVIEW, entails converting the color image to a grayscale image, enhancing the grayscale image with histogram equalization, calculating statistical parameters, and then classifying the cell as a blast or an ordinary cell. The other technique, nucleus segmentation with MATLAB, entails converting the color image to grayscale, conducting ‘L’ and ‘H’, applying Otsu's thresholding, and finally classifying the cell.
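Two of the steps named for [13], grayscale conversion and Otsu's thresholding, can be sketched in a few lines. Otsu's method picks the gray level that maximizes the between-class variance of the two resulting pixel groups. The code below is an illustrative sketch, not the LabVIEW or MATLAB implementation of [13]; the luma weights in `to_gray` are the common ITU-R BT.601 values, which is our assumption, and the sample intensities are hypothetical.

```python
def to_gray(rgb):
    """Convert an (R, G, B) triple to one gray level (BT.601 luma weights)."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def otsu_threshold(gray_values):
    """Return the threshold t; pixels <= t form one class, the rest the other."""
    hist = [0] * 256
    for v in gray_values:
        hist[v] += 1
    total = len(gray_values)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]            # number of pixels with value <= t
        sum0 += t * hist[t]      # sum of those pixel values
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


# Hypothetical gray levels: a dark nucleus group and a bright background group.
print(otsu_threshold([20, 22, 25, 200, 205, 210]))
```

Pixels at or below the returned threshold would form the dark (nucleus) mask that later steps analyze.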

In [14], Atin Mathur, Ardhendu S. Tripathi, and Manohar Kuse use Leishman-stained blood stain images to classify white blood cells, a classification that is subsequently used to discover, diagnose, and monitor hematological and non-hematological illnesses. Images are first stain-adjusted with respect to a target image that is chosen because it shows a stark visual contrast between the WBCs and the background cells. This contrast is crucial in medical image processing for extracting the white blood cell nucleus and cytoplasm individually from the RGB image, which is followed by WBC segmentation. The step that has the biggest impact on how well a classifier performs is feature extraction. Size, compactness, NCR, and properties including ANR, NOL, MCP, and roughness are all taken into consideration when classifying cells. A supervised algorithm is used for classification.

In [15], Cartic Ramakrishnan, Abhishek Patnia, Eduard Hovy, and Gully APC Burns developed a method to extract text from a PDF file by breaking it up into word blocks using the GPL version of JPedal and then methodically combining word blocks into "chunk-blocks" by following specified rules. Section headings and subheadings must be identified as segments distinct from the body of their associated sections, and segments should be rectangular to assist sequence-preserving text extraction. In the last step of the technique, LA-PDFText iterates over the classified blocks, stitching together multiple contiguous portions as well as sections, sub-sections, and headings.

3. CONCLUSIONS

From the above literature survey, we can say that most of the papers deal with preprocessing, i.e., ways to extract features from images using different techniques such as un-sharpening algorithms, converting images into binary images, or PCA for segmentation, which indicates that preprocessing images is the first step toward good results. Some papers then worked with CNN models to classify the images, and most of these are pretrained models with modifications that help in detecting and classifying Acute Lymphoblastic Leukemia. Some papers present models that reach over 98% accuracy, whereas others discuss text recognition techniques such as OCR and Layout-Aware PDF Text Extraction systems with reasonable accuracies. Each paper has its limitations, such as detection failing on a noisy image or being unable to handle border cells; however, classification can become more accurate by overcoming them.

REFERENCES

[1] Naina Sharma, Ankit Mukopadhyay, Aman Shrivastava, & Aman Garg (2021). A Brief Survey on Leukemia Detection Systems. International Research Journal of Engineering and Technology (IRJET).

[2] Tchapga, C. T., Thomas Attia, M. A., Kouanou, A. T., Fonzin, T. F., Fogang, P. K., Brice, M. A., & Daniel, T. (2021). Biomedical image classification in a big data architecture using machine learning algorithms. Journal of Healthcare Engineering, 2021.

[3] Angelo Genovese, Mahdi Hosseini, S., Vincenzo Piuri, Konstantinos, N. P., & Fabio, S. (2021, June). Acute Lymphoblastic Leukemia detection based on adaptive un-sharpening and Deep Learning. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (pp. 1205-1209). IEEE.

[4] Nasir, M., Saman, S., Taimur, B., Sehar, R., Hafiz, G., & Muhammad, Y. (2020). Identification of significant risks in pediatric acute lymphoblastic leukemia (ALL) through machine learning (ML) approach. Medical & Biological Engineering & Computing, 58(11), 2631-264

[5] Adyanthaya, S. K. (2020). Text Recognition from Images: A Study.

[6] Vasundhara Acharya, & Preetham Kumar (2019). Detection of acute lymphoblastic leukemia using image segmentation and data mining algorithms. Medical & Biological Engineering & Computing, 57(8), 1783-1811.


[7] Dr. Leena Patil, & Miss. Anagha M. Pawar. A Literature Survey on Classification of Images using State of the Art Machine Learning Techniques.

[8] Ajmad, R., Naveed, A., Tanzila, S., Syed, I. U. R., Zahid, M., & Hoshang, K. (2018). Classification of acute lymphoblastic leukemia using deep learning. Microscopy Research and Technique, 81(11), 1310-1317.

[9] Sarmad Shafique, & Samabia Tehsin (2018). Acute lymphoblastic leukemia detection and classification of its subtypes using pretrained deep convolutional neural networks. Technology in Cancer Research & Treatment, 17, 1533033818802789.

[10] Liyan, P., Guangjian, L., Fangqin, L., Shuling, Z., Huimin, X., Xin, S., & Huiying, L. (2017). Machine learning applications for prediction of relapse in childhood acute lymphoblastic leukemia. Scientific Reports, 7(1), 1-9.

[11] Jinting, W., Nengliang, Y. S. M., Xiaoyan, Z., Yuanyuan, W., Yanyan, L., & Changrong, Y. (2016). Supporting caregivers of children with acute lymphoblastic leukemia via a smartphone app: a pilot study of usability and effectiveness. CIN: Computers, Informatics, Nursing, 34(11), 520-527.

[12] Kiran, J. S., Kumar, N. V., Sashi, N. P., & Kavya, M. (2015). A literature survey on digital image processing techniques in character recognition of Indian languages. International Journal of Computer Science and Information Technologies, 6(3), 2065-2069.

[13] Chaitali Raje, & Jyothi Rangole (2014, April). Detection of Leukemia in microscopic images using image processing. In 2014 International Conference on Communication and Signal Processing (pp. 255-259). IEEE.

[14] Atin Mathur, Ardhendu Tripathi, S., & Manohar Kuse (2013). Scalable system for classification of white blood cells from Leishman-stained blood stain images. Journal of Pathology Informatics, 4(2), 15.

[15] Cartic Ramakrishnan, Abhishek Patnia, Eduard Hovy, & Gully APC Burns (2012). Layout-aware text extraction from full-text PDF of scientific articles. Source Code for Biology and Medicine, 7(1), 1-10.
