Influence Analysis of Image Feature Selection Techniques Over Deep Learning Model


1 Master of Engineering Student, Dept. of Computer Engineering, S.G.S.I.T.S, Indore, India; 2 Assistant Prof., Dept. of Computer Engineering, S.G.S.I.T.S, Indore

ABSTRACT

Digital images store real-world information as a matrix of pixels. Images have become very valuable because of their growing number of applications in medical, engineering, and social domains. Image processing and classification therefore play an essential role. In this paper, we investigate the use of three different features, i.e., shape, color, and texture, for image classification. In addition, a combined feature is used to demonstrate its impact on the classifier. A deep learning based Convolutional Neural Network is used to classify the individual features and their combination. The experiment uses the Diabetic Retinopathy Detection dataset. The performance of the model is evaluated in terms of accuracy, which demonstrates that feature selection techniques are able to improve classification accuracy and also reduce resource utilization.

Keywords: Color Feature, Feature Selection, Image Classification, Shape Feature, Texture Feature.

1. INTRODUCTION

Digital images are a technique for capturing and storing real-world data in the form of a sparse vector. Images play an essential role in various real-world applications such as engineering, medicine, media, advertising, and many more. Additionally, growth in internet and communication technology, e.g., 5G, increases the application of images. In this context, the analysis and investigation of image classification techniques is an essential domain of study. Using images and machine learning (ML), a number of applications can be developed, such as plant leaf image based disease detection, cancer cell analysis, manufacturing defect analysis, and many more.

Machine learning based image classification and analysis requires three key steps, i.e., data preprocessing, feature extraction, and classification. Preprocessing techniques are used for filtering and balancing the noise in image data; a normalization step may also be involved. The second step is to obtain the image features using various kinds of feature selection techniques. The aim of feature extraction may involve identifying the color distribution in an image and the shapes and objects hidden in it. Finally, the features can be used to classify the objects using machine learning algorithms. In this work, the influence of the feature selection technique on the machine learning algorithm's performance is also measured. In addition, we propose to investigate combinations of the features and their impact on the classification algorithm's performance.

Feature selection techniques are useful for enhancing the valuable insights in an image and minimizing the amount of data to be processed. The key insights are used to classify images more accurately; similarly, the reduction of data can improve efficiency in terms of time and memory utilization. Thus, we need to identify which kind of feature selection technique is more beneficial for performing classification with higher accuracy and lower computational resource consumption. A number of feature extraction techniques are available, some classical and some new, and each kind of feature has its own potential and limitations. Additionally, each feature extraction technique processes image data differently. Therefore, image feature selection techniques may demonstrate different behavior with different applications and ML algorithms. The main objectives of this work are the following:

1. Study of different image feature selection techniques

2. Study of performance influence with individual feature selection techniques

3. Study and measurement of the performance influence using combinations of feature selection techniques

2. LITERATURE SURVEY

This section reviews recent work in the area of image processing and its relevant applications.

International Research Journal of Engineering and Technology (IRJET), e-ISSN: 2395-0056, p-ISSN: 2395-0072
Volume: 09, Issue: 06 | June 2022 | www.irjet.net
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1858

2.1. Feature selection based image classification

Y. Bi et al. [1] propose an ensemble learning framework using genetic programming (EGP). The approach integrates feature learning, classifier selection, classifier learning, and combination into a single program. To achieve this, a program structure, a new function set, and a terminal set are developed. The performance of EGP is examined on nine image datasets, and the results show that EGP achieves better performance. The analysis reveals that EGP evolves good ensembles that balance diversity and accuracy. Z. F. Lai et al. [2] propose a deep learning model that integrates a Coding Network with a Multilayer Perceptron (CNMP), combining high-level features from a convolutional neural network (CNN) with classical features. The model involves (1) training a CNN as a coding network whose output encodes pixels into feature vectors, (2) extracting a set of conventional features, and (3) designing a neural network based model to fuse the different features. The model achieves classification accuracies of 90.1% and 90.2%.

M. K. Alsmadi [3] addresses the extraction and storage of robust and significant features. The features comprise color, shape, and texture descriptors. A similarity evaluation using a metaheuristic algorithm is carried out between the features of the query image (QI) and those of the dataset images, and distance measures are used to find the matching images. The content-based image retrieval (CBIR) methods are described and developed, and these techniques increase retrieval performance. According to Dr. A. Nazir [4], medical nutrition therapy (MNT) is a basic part of diabetes management. A healthy diet is one that provides the nutrients the body needs. Individuals with diabetes are encouraged to choose a variety of fiber-containing foods, such as whole grains, fruits, and vegetables, since they provide vitamins, minerals, fiber, and other substances. The primary dietary fat goal in people with diabetes is to limit saturated fat and cholesterol intake.

An image retrieval technique is proposed by M. J. J. Ghrabat et al. [5]. It aims to retrieve images using the best feature extraction process. A Gaussian filtering technique is used to remove unwanted data. Feature extraction is then applied for texture and color: texture is characterized by a gray-level co-occurrence matrix, and color by intensity-based features. These features are clustered by k-means. A modified genetic algorithm is used to optimize the features, which are classified using an SVM-based CNN. The performance is evaluated in terms of sensitivity, specificity, precision, recall, retrieval, and recognition rate. N. Varish et al. [6] propose a hierarchical image retrieval scheme that uses color, texture, and shape visual content, which reduces the search space. The shape feature is computed by a simple fusion of histograms and invariant moments, and the retrieved images are kept in an intermediate dataset. Next, the texture feature is computed, which gives the multi-resolution images and the local computation; the gray-level co-occurrence matrix based texture is obtained. Most of the irrelevant images are discarded, yet a few undesired images remain. Fused color information is captured on both non-uniformly quantized color components. Finally, the color feature produces the desired results with low computational overhead and better precision.

2.2. Applications of image classification

S. S. Yadav et al. [7] explore how to apply CNNs to a chest X-ray dataset. The methods are an SVM with local rotation- and orientation-free features, transfer learning on two CNN models (VGG16 and InceptionV3), and a capsule network. The results show that augmentation is an effective approach. Additionally, transfer learning is helpful on a small dataset; in transfer learning, retraining on the target dataset is essential to improve performance. Q. Liu et al. [8] present a novel relation-driven semi-supervised framework. It is a consistency-based method that exploits the unlabeled data to produce high-quality consistency targets. They introduce a sample relation consistency (SRC) paradigm by modeling the relationship among samples. The framework enforces the consistency of the semantic relation among different samples in order to explore additional semantics. Experiments on datasets such as ISIC 2018 and ChestX-ray14 show its superiority.

X. Yang et al. [9] exploit deep learning techniques to address hyperspectral image classification. The approach can exploit both spatial and spectral correlations to enhance classification. They advocate four models, namely 2D-CNN, 3D-CNN, recurrent 2D-CNN (R-2D-CNN), and R-3D-CNN, and conducted experiments on six datasets. The results confirm the superiority of the deep learning models. R. J. S. Raj et al. [10] present an improved classifier, Optimal Deep Learning (DL), for the classification of lung cancer, brain images, and Alzheimer's disease. The classification incorporates preprocessing, feature selection, and classification. The goal is to derive an optimal feature selection model that enhances performance using the Opposition-based Crow Search (OCS) algorithm. The OCS picks the optimal features; here multi-texture and gray-level features were selected. The results were compared with existing models and classifiers. The model achieved the best performance in terms of accuracy, sensitivity, and specificity, at 95.22%, 86.45%, and 100%, respectively.

2.3. Review summary

The recent literature on image classification has been explored, and we found that images are classically used in information retrieval and medical image analysis. In this context, various methods for image classification and image retrieval have been studied. During this study we found that various kinds of image feature extraction techniques are used, which help in classifying data or retrieving accurate information.

3. PROPOSED MODEL

This section has two parts: the first describes the algorithms used in this study, and the second presents the proposed experimental model.

3.1. Algorithms used

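A. Local Binary Pattern

For a pixel in the image, the LBP code [11] is computed by comparing it with its neighbors:

$$LBP_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p, \qquad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

where $g_c$ is the value of the center pixel, $g_p$ is the value of the $p$-th neighbor pixel, $P$ is the number of neighbors, and $R$ is the radius of the neighborhood. If the center pixel is at coordinate (0, 0), then the coordinates of $g_p$ are given by

$$\left(R\cos(2\pi p/P),\; R\sin(2\pi p/P)\right)$$

The values of neighbors that do not fall on the pixel grid can be estimated by interpolation. Suppose the image is of size $I \times J$. After the LBP pattern of every pixel is identified, a histogram is built to represent the image:

$$H(k) = \sum_{i=1}^{I} \sum_{j=1}^{J} f\left(LBP_{P,R}(i,j),\, k\right), \quad k \in [0, K], \qquad f(x, y) = \begin{cases} 1, & x = y \\ 0, & \text{otherwise} \end{cases}$$

where $K$ is the maximal LBP pattern value. $U$ is defined as the number of spatial transitions (bitwise 0/1 changes) in the pattern:

$$U(LBP_{P,R}) = \left| s(g_{P-1} - g_c) - s(g_0 - g_c) \right| + \sum_{p=1}^{P-1} \left| s(g_p - g_c) - s(g_{p-1} - g_c) \right|$$

The uniform LBP patterns are those that have limited transitions ($U \le 2$) in the circular binary representation. The mapping from $LBP_{P,R}$ to $LBP^{u2}_{P,R}$ has $P(P-1)+3$ output values and is implemented with a lookup table of $2^P$ elements. A rotation-invariant pattern can be described as:

$$LBP^{riu2}_{P,R} = \begin{cases} \sum_{p=0}^{P-1} s(g_p - g_c), & U(LBP_{P,R}) \le 2 \\ P + 1, & \text{otherwise} \end{cases}$$

The mapping from $LBP_{P,R}$ to $LBP^{riu2}_{P,R}$ has $P+2$ output values.

B. Grid Color Moment

Color features are among the most widely used types of feature. They show good stability and are relatively insensitive to changes in the picture. Color not only adds to the appearance of an image but also provides additional information. In color indexing, the goal is to retrieve all the images whose color compositions are similar to that of the query image. The feature vector is called the "Grid-based Color Moment" [18] and is computed as follows:

Transform the image from RGB to HSV

Next, divide the whole image into 3x3 blocks

For each block:

Compute the mean color

$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$$

where $N$ is the number of pixels in the block and $x_i$ is the pixel intensity.

Calculate its variance

$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$$

Compute its skewness

$$\gamma = \frac{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^3}{\sigma^3}$$

Each block has a total of 9 features (three moments for each of the three HSV channels), so the entire image has a total of 81 features. In order to use these features we need to normalize them. To do the normalization, for every feature, calculate the mean over the dataset:

$$\mu_f = \frac{1}{M} \sum_{i=1}^{M} f_i$$

where $M$ is the number of pictures and $f_i$ is the feature of the $i$-th sample. The "whitening" transform is then applied to all the data to get the normalized feature:

$$\tilde{f}_i = \frac{f_i - \mu_f}{\sigma_f}$$

where $\sigma_f$ is the standard deviation of the feature over the dataset.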
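The two descriptors above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the implementation used in the experiments: it assumes $P = 8$, $R = 1$ with neighbors taken on the integer pixel grid (no interpolation), an image that has already been converted to HSV, and hypothetical function names (`lbp_riu2_histogram`, `grid_color_moments`, `whiten`).

```python
import numpy as np

def lbp_riu2_histogram(gray):
    """Histogram of rotation-invariant uniform LBP codes (P=8, R=1).

    Assumption: neighbours are read from the integer pixel grid, so the
    diagonal neighbours are not interpolated onto the circle of radius R.
    """
    g = np.asarray(gray, dtype=np.float64)
    h, w = g.shape
    center = g[1:-1, 1:-1]
    # Neighbour offsets (dy, dx) in circular order around the centre pixel.
    offsets = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
               (0, -1), (1, -1), (1, 0), (1, 1)]
    # bits[p] holds s(g_p - g_c) for every interior pixel.
    bits = np.stack([
        (g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx] >= center).astype(np.int64)
        for dy, dx in offsets
    ])
    # U: number of 0/1 transitions along the circular bit string.
    u = np.abs(bits - np.roll(bits, 1, axis=0)).sum(axis=0)
    p_count = len(offsets)
    codes = np.where(u <= 2, bits.sum(axis=0), p_count + 1)
    return np.bincount(codes.ravel(), minlength=p_count + 2)

def grid_color_moments(hsv, grid=3):
    """Mean, variance and skewness of each channel in each grid block."""
    img = np.asarray(hsv, dtype=np.float64)
    feats = []
    for rows in np.array_split(img, grid, axis=0):
        for block in np.array_split(rows, grid, axis=1):
            x = block.reshape(-1, block.shape[-1])  # N pixels x channels
            mu = x.mean(axis=0)
            var = ((x - mu) ** 2).mean(axis=0)
            third = ((x - mu) ** 3).mean(axis=0)
            # Skewness is undefined for a constant block; report 0 there.
            safe_var = np.where(var > 0, var, 1.0)
            skew = np.where(var > 0, third / safe_var ** 1.5, 0.0)
            feats.extend([mu, var, skew])
    return np.concatenate(feats)

def whiten(features):
    """Normalise each feature column over a dataset of M samples."""
    f = np.asarray(features, dtype=np.float64)  # shape (M, D)
    mu = f.mean(axis=0)
    sigma = f.std(axis=0)
    sigma[sigma == 0] = 1.0  # leave constant features at zero
    return (f - mu) / sigma
```

For a three-channel image and a 3x3 grid, `grid_color_moments` returns 9 blocks x 3 moments x 3 channels = 81 values, matching the count given in the text, and `lbp_riu2_histogram` returns $P + 2 = 10$ bins.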

C. Sobel Operator

The Sobel operator performs a 2-D spatial gradient computation on an image and emphasizes regions of high spatial frequency that correspond to edges. For an input image, this operator finds the absolute gradient magnitude at each point. The operator consists of a pair of 3×3 convolution kernels, shown in Figure 1. The aim of these kernels is to identify edges running vertically and horizontally. The kernels can be applied separately to produce separate measurements of the gradient components, $G_x$ and $G_y$.

Figure 1: Sobel kernels

Both kernels can be combined to find the absolute gradient magnitude using:

$$|G| = \sqrt{G_x^2 + G_y^2}$$

Typically, it is computed using the approximation:

$$|G| = |G_x| + |G_y|$$

The angle of orientation of the edge giving rise to the gradient is calculated by:

$$\theta = \arctan\left(\frac{G_y}{G_x}\right)$$

D. Convolutional Neural Networks (CNN)

Researchers have long struggled to build a system that can understand visual data. This field is known as computer vision. Computer vision advanced with an AI model that outperformed the best image recognition algorithms of its time, known as AlexNet, with an accuracy of 85%. At the core of AlexNet were convolutional neural networks, a special kind of neural network that roughly imitates human vision.

Figure 2: Basic CNN example model

CNNs were for the most part used in post offices, where they served to identify areas, pin codes, and similar information. The major issue with deep learning is that it requires a large amount of data to train, and correspondingly large computational resources; that is a disadvantage of CNNs. The CNN is a variant of deep learning architecture that is mainly used for processing visual objects. It relies on a special technique called convolution. The aim of convolution is to reduce the image size so that it is easy and efficient to process, without losing the essential features of the image that are required for recognition.

Figure 3: Image color channels

An RGB picture is a matrix of pixels with three planes. Figure 4 shows what a convolution is. We take a filter/kernel (a 3×3 matrix) and apply it to the input image to get a convolved feature, which is passed to the next layer. A CNN is made up of several layers. The layers are built from neurons, computational functions that calculate the weighted sum of their inputs and apply an activation to produce the output of the layer. Each layer in the network generates the input for the next layer. The initial layers of the network compute features of the images such as edges or shapes.

Figure 4: Kernel convolution process

As the output of one layer is passed to the next, the network uncovers more complex features. Based on the activation map, the final layer produces a score that is most relevant to the final class label of the object. Two types of pooling layer are used, i.e., max pooling and average pooling. The aim of these layers is to improve the efficiency of data processing by reducing the size of the input samples.
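The kernel convolution and pooling steps described above can be illustrated with a small NumPy sketch. It assumes a single-channel input, stride 1, and no padding; the function names `conv2d_valid` and `pool2d` are hypothetical and chosen only for this example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding.

    As in most CNN frameworks, the kernel is not flipped, so this is
    strictly a cross-correlation.
    """
    image = np.asarray(image, dtype=np.float64)
    kernel = np.asarray(kernel, dtype=np.float64)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the window under the kernel.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping max or average pooling over size x size windows."""
    fm = np.asarray(feature_map, dtype=np.float64)
    h = fm.shape[0] // size * size  # drop rows/cols that do not fill a window
    w = fm.shape[1] // size * size
    blocks = fm[:h, :w].reshape(h // size, size, w // size, size)
    reduce = np.max if mode == "max" else np.mean
    return reduce(blocks, axis=(1, 3))

image = np.arange(36).reshape(6, 6)
feature = conv2d_valid(image, np.ones((3, 3)) / 9.0)  # 6x6 -> 4x4
pooled_max = pool2d(feature, 2, "max")                # 4x4 -> 2x2
pooled_avg = pool2d(feature, 2, "avg")                # 4x4 -> 2x2
```

A 3×3 kernel shrinks each spatial dimension by 2, and a 2×2 pooling window halves it, which is how the stack of layers reduces the amount of data passed forward.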

Figure 5: Pooling layer working

3.2. Proposed System Architecture

The proposed system architecture for studying the impact of feature selection on a deep learning algorithm is shown in Figure 6.

Figure 6: System architecture of proposed work

The first component of our model is the image dataset. We obtained our dataset from the Kaggle repository. Among the many image datasets available there, we selected the Diabetic Retinopathy Detection dataset [12]. The dataset contains a set of retina images captured under different imaging conditions. Each image is marked with a subject id. A clinician has rated the presence of diabetic retinopathy (DR) in each image on a scale of 0 to 4, according to the following scale:

• Class 0: DR not present

• Class 1: mild DR

• Class 2: moderate DR

• Class 3: severe DR

• Class 4: proliferative DR

The aim is to prepare a data model that is able to provide a score according to the classes given above. Therefore, the proposed work involves classifying images into five different classes. In order to use the image dataset, we need to preprocess the images using the following formula:

Next, we employ feature selection techniques to obtain features from the images. Feature selection is a very important process in image processing and is used for two major reasons:


1. Enhancing the useful insights from the input image data

2. Reducing the size of the data to improve the efficiency of the learning algorithm

Three different features, namely edge, color, and texture, are extracted from the input images. In this context, the following process is used to generate the set of features used to train the learning algorithm.

Algorithm 1: Extracting image features

Input: preprocessed images P

Output: image features: color C, texture, edge, and combined features Com

Process: 1. 2. a. b. c. d. 3. 4.
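As an illustration of the edge and combined outputs of Algorithm 1, the sketch below computes a Sobel gradient-magnitude map and stacks three feature maps as channels. It is only one plausible reading: it assumes each feature is kept as a 2-D map of the same size, that the combined feature is a channel-wise stack (consistent with the 60*60*1 and 60*60*3 CNN input sizes stated in this section), and that borders are handled by edge replication. The names `sobel_edge_map`, `to_unit_range`, and `combine_features` are hypothetical.

```python
import numpy as np

# Standard 3x3 Sobel kernels for the horizontal and vertical gradients.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=np.float64)

def sobel_edge_map(gray):
    """Gradient magnitude |G| = sqrt(Gx^2 + Gy^2), same size as the input.

    Assumption: borders are handled by replicating the edge pixels.
    """
    g = np.pad(np.asarray(gray, dtype=np.float64), 1, mode="edge")
    h, w = g.shape[0] - 2, g.shape[1] - 2
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            window = g[dy:dy + h, dx:dx + w]
            gx += SOBEL_X[dy, dx] * window
            gy += SOBEL_Y[dy, dx] * window
    return np.hypot(gx, gy)

def to_unit_range(feature_map):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    fm = np.asarray(feature_map, dtype=np.float64)
    span = fm.max() - fm.min()
    return (fm - fm.min()) / span if span > 0 else np.zeros_like(fm)

def combine_features(color_map, texture_map, edge_map):
    """Stack three equally sized maps as channels: H x W x 3 (assumed layout)."""
    maps = (color_map, texture_map, edge_map)
    return np.stack([to_unit_range(m) for m in maps], axis=-1)

# Example: a 60x60 image with a vertical step edge between columns 29 and 30.
step = np.zeros((60, 60))
step[:, 30:] = 1.0
edge = sobel_edge_map(step)
```

Rescaling each map before stacking keeps the three channels on a comparable scale; whether the original experiments did this is not stated.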

In this experiment we implemented a 2D convolutional neural network (2D CNN). The implemented CNN has the following configuration.

1. Input layer: a 2D convolutional layer with an input size of 60*60*1 for individual feature training and 60*60*3 for combined feature training. The input layer is configured with the "ReLU" activation function.

2. Max pooling layer: a 2D max pooling layer is applied with kernel size 2*2.

3. 2D convolutional layer: this layer is configured with the "ReLU" activation function and a kernel size of 3*3.

4. Max pooling layer: a 2D max pooling layer is applied with kernel size 2*2.

5. Flatten layer: this layer converts the 2D features extracted by the 2D CNN into a flat vector.

6. Dense layer: a fully connected layer, trained by backpropagation, that learns from the flattened features. This layer is configured with the "ReLU" activation function and 64 neurons.

7. Output layer: also a dense layer, used to produce the output of the network. It consists of 5 neurons, matching the number of output classes, and is configured with the "SoftMax" activation function.
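To make the configuration above concrete, the following sketch traces the output shape and trainable parameter count of each layer. Several values are assumptions that the text does not state: a 3*3 kernel in the first convolutional layer, stride 1 with no padding, and 32 and 64 filters in the two convolutional layers. The function names are hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Spatial size after a convolution without padding."""
    return (size - kernel) // stride + 1

def trace_cnn(channels, filters1=32, filters2=64, side=60):
    """Return (layer name, output shape, parameter count) for each layer.

    Assumed, not taken from the paper: 3x3 kernel in the first
    convolution, no padding, stride 1, and the two filter counts.
    """
    layers = []
    s = conv_out(side, 3)
    layers.append(("conv2d_1", (s, s, filters1),
                   (3 * 3 * channels + 1) * filters1))
    s //= 2  # 2x2 max pooling halves each spatial dimension
    layers.append(("max_pool_1", (s, s, filters1), 0))
    s = conv_out(s, 3)
    layers.append(("conv2d_2", (s, s, filters2),
                   (3 * 3 * filters1 + 1) * filters2))
    s //= 2
    layers.append(("max_pool_2", (s, s, filters2), 0))
    flat = s * s * filters2
    layers.append(("flatten", (flat,), 0))
    layers.append(("dense", (64,), (flat + 1) * 64))
    layers.append(("output", (5,), (64 + 1) * 5))
    return layers

for name, shape, params in trace_cnn(channels=3):
    print(f"{name:<12} {str(shape):<16} {params}")
```

Under these assumptions the spatial size goes 60 → 58 → 29 → 27 → 13, and switching from a single feature (`channels=1`) to the combined feature (`channels=3`) changes only the parameter count of the first convolutional layer.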

The features extracted in the feature extraction phase are used with the 2D CNN configuration described above. The extracted features are first split into two sets, a training set and a validation set, in the ratio of 80% to 20%. The model is trained and validated using the following process.

Algorithm 2: Model validation

Input: color feature C, texture feature, edge feature, combined features

Output: classification performance

Process: 1. 2. 3. 4. 5. Return
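The 80%/20% split and the performance measure used for validation can be sketched as follows, taking accuracy as the fraction of correctly classified samples. The shuffling, the fixed seed, and the function names are assumptions made for this example; the text does not say whether the split was random or stratified.

```python
import random

def train_validation_split(samples, labels, train_fraction=0.8, seed=0):
    """Shuffle indices and split into training and validation sets."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    cut = int(round(train_fraction * len(indices)))

    def pick(idx):
        return [samples[i] for i in idx], [labels[i] for i in idx]

    return pick(indices[:cut]), pick(indices[cut:])

def accuracy(true_labels, predicted_labels):
    """Correctly classified samples divided by total samples."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Example with 100 dummy samples spread over the five DR classes.
samples = list(range(100))
labels = [i % 5 for i in samples]
(train_x, train_y), (val_x, val_y) = train_validation_split(samples, labels)
```

The same split would be reused for each feature type (color, texture, edge, combined) so that their validation accuracies are comparable.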

4. RESULTS ANALYSIS

This section describes the performance of the deep learning model with different feature selection techniques and their combination in terms of accuracy. Accuracy measures the correctness of the model and is defined as the ratio of correctly classified samples to the total number of samples to classify. It can be computed using:

$$\text{Accuracy} = \frac{\text{total correctly classified samples}}{\text{total samples}}$$

The training performance of the different feature selection approaches and their combination with the deep learning model is shown in Figure 7. The X axis of the diagram shows the number of epochs and the Y axis shows the model performance as the number of epochs increases. According to the results, the combined features converge more rapidly than the individual features. Additionally, the combined features show higher performance during the training of the model.

Figure 7: Training accuracy of models

The training performance shows more accurate results with the combined features, but when the models are validated against the test dataset, the models behave differently.

Figure 8: Validation accuracy of models

According to the validation results, the Sobel operator and the color feature show higher performance than the combined features. The validation performance of the combined feature fluctuates strongly and is lower than that of the two other individual feature selection models.

5. CONCLUSION

The aim of this work is to study different feature selection approaches for image data. The work also measures the influence of feature selection approaches and their combination on the performance of the classification algorithm. In this context, an experiment was designed based on the different feature learning models, their combination, and a deep learning classifier. The experimental analysis was carried out with a publicly available dataset for diabetic retinopathy. The experimental analysis uncovers the following facts.

1. Feature selection techniques are useful for improving classification accuracy as well as for reducing time and memory utilization.

2. A combination of features is not always advantageous, but in many applications it provides better results.

3. Not all combinations of features are beneficial for every application use case; variations need to be evaluated before the models are used.

This work will be extended to various new application use cases; some essential ones are highlighted below:

1. Identify more methods for combining the features to improve classification performance in global application utilization.

2. Prepare more variants of feature combinations that improve accuracy.

3. Apply the feature extraction and combinations to more than one application for identifying the objects in image data.

6. REFERENCES


[1] Y. Bi, B. Xue, M. Zhang, “An Automated Ensemble Learning Framework Using Genetic Programming for Image Classification”, GECCO ’19, July 13-17, 2019, Prague, Czech Republic, © 2019 Association for Computing Machinery, ACM ISBN 978-1-4503-6111-8/19/07

[2] Z. F. Lai, H. F. Deng, “Medical Image Classification Based on Deep Features Extracted by Deep Model and Statistic Feature Fusion with Multilayer Perceptron”, Hindawi Computational Intelligence and Neuroscience, Volume 2018, Article ID 2061516, 13 pages, https://doi.org/10.1155/2018/2061516


[3] M. K. Alsmadi, “Content-Based Image Retrieval Using Color, Shape and Texture Descriptors and Features”, Arabian Journal for Science and Engineering, https://doi.org/10.1007/s13369-020-04384-y

[4] Dr. A. Nazir, “Nutrition and Diabetes Mellitus”, Indep Rev Jan-Jun 2018;20(1-6)

[5] M. J. J. Ghrabat, G. Ma, I. Y. Maolood, S. S. Alresheedi, Z. A. Abduljabbar, “An effective image retrieval based on optimized genetic algorithm utilized a novel SVM-based Convolutional neural network classifier”, Hum. Cent. Comput. Inf. Sci. (2019) 9:31, https://doi.org/10.1186/s13673-019-0191-8

[6] N. Varish, A. K. Pal, R. Hassan, M. K. Hasan, A. Khan, N. Parveen, D. Banerjee, V. Pellakri, A. U. Haqis, I. Memon, “Image Retrieval Scheme Using Quantized Bins of Color Image Components and Adaptive Tetrolet Transform”, IEEE Access, Volume 8, 2020

[7] S. S. Yadav, S. M. Jadhav, “Deep convolutional neural network based medical image classification for disease diagnosis”, J Big Data (2019) 6:113.

[8] Q. Liu, L. Yu, L. Luo, Q. Dou, P. A. Heng, “Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model”, IEEE Transactions on Medical Imaging.

[9] X. Yang, Y. Ye, X. Li, R. Y. K. Lau, X. Zhang, X. Huang, “Hyperspectral Image Classification with Deep Learning Models”, IEEE Transactions on Geoscience and Remote Sensing

[10] R. J. S. Raj, S. J. Shobana, I. V. Pustokhina, D. A. Pustokhin, D. Gupta, K. Shankar, “Optimal Feature Selection Based Medical Image Classification Using Deep Learning Model in Internet of Medical Things”, Special Section on Deep Learning Algorithms for Internet of Medical Things, Volume 8, 2020

[11] Z. Guo, L. Zhang, D. Zhang, “A Completed Modeling of Local Binary Pattern Operator for Texture Classification”, IEEE Transactions on Image Processing, 2010

[12] https://www.kaggle.com/c/aptos2019-blindness-detection/data?select=train_images
