Survey On Broken and Joint Devanagari Handwritten Characters Recognition Using Deep Learning


Abstract - The recognition of handwritten Devanagari characters presents a significant challenge due to the script's complexity and variability. The difficulty is compounded by broken and joint characters, which different individuals write differently. In recent years, deep learning models have emerged as a powerful solution for character recognition, achieving remarkable performance in various applications. This survey paper presents an in-depth analysis of the deep learning-based approaches used for recognizing handwritten Devanagari broken and joint characters. We review the architectures and techniques applied in deep learning models, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and hybrid models, to identify these characters. We also discuss the datasets used for training and testing these models and the metrics used to evaluate their performance. Additionally, we conduct a comparative analysis of the different approaches, highlighting their respective strengths and limitations and proposing possible directions for future research. Our survey is intended to serve as a valuable resource for researchers and practitioners engaged in handwritten Devanagari character recognition using deep learning.

Key Words: Feature extraction, Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), TensorFlow, ImageDataGenerator, Text recognition, Wavelets.

1. INTRODUCTION

Handwritten character recognition is vital in computer vision and pattern recognition, with practical applications such as optical character recognition, automatic form processing, and intelligent handwriting recognition systems. Devanagari is a prominent script used in several languages such as Hindi, Marathi, and Nepali, and recognizing handwritten Devanagari characters is a challenging task due to the complexity and variability of the script. The recognition of broken and joint characters in Devanagari further adds to the complexity, as these characters are often written differently by different individuals, making it difficult to develop a robust recognition system.

This survey paper aims to provide a comprehensive overview of the existing approaches for broken and joint handwritten Devanagari character recognition, with a particular focus on the involvement of wavelet transform and recent deep learning-based technologies. The paper will also discuss the advantages and limitations of each approach and highlight the current techniques. Additionally, publicly available datasets such as the Devanagari Handwritten Character Dataset (DHCD) and Indian Language Handwritten Character Dataset (ILHCD) will be reviewed, which have been widely used for training and evaluation of various approaches. This survey paper will provide a useful resource for researchers and practitioners working in the field of handwritten Devanagari character recognition, with the aim of improving the accuracy and efficiency of character recognition systems.

2. Main Terminologies:

2.1 Devanagari script: A script used for writing several languages, including Hindi, Marathi, and Nepali.

2.2 Broken characters: Devanagari characters that are separated or disjointed and therefore require additional techniques for recognition.

2.3 Joint characters: Devanagari characters that are connected to other characters, often because they are written in a cursive manner, and therefore require additional techniques for recognition.

2.4 Handwritten character recognition: The process of identifying and transcribing handwritten characters from an image or document.

2.5 Devanagari Handwritten Character Dataset (DHCD): A publicly available dataset containing handwritten Devanagari characters.
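
The difference between broken characters (2.2) and joint characters (2.3) can be illustrated by counting connected ink regions in a binarized character image. The following is a minimal sketch and is not taken from any of the surveyed papers: the function name `count_components`, the 0/1 grid representation, and the choice of 8-connectivity are assumptions made for illustration only.

```python
from collections import deque


def count_components(grid):
    """Count 8-connected regions of ink (1) pixels in a binary image.

    `grid` is a list of equal-length rows of 0/1 values; this
    representation is a hypothetical choice made for the sketch.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Found an unvisited ink pixel: flood-fill its whole region.
            components += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return components


# A vertical stroke with a one-pixel gap (a "broken" stroke).
broken_stroke = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 1, 0],
]
# The same stroke without the gap.
intact_stroke = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
print(count_components(broken_stroke))  # 2
print(count_components(intact_stroke))  # 1
```

A component count is only a rough cue: some characters and diacritics naturally consist of more than one region, so in practice such a count would be one input among several rather than a decision rule on its own.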

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 04 | Apr 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page1767
Prachi Pachang1, Jiya Shaikh2, Ms. Vina M. Lomate3, Tanishka Sinha4, Manjeet Kour5
1,2,4,5 UG Student, Computer Engineering, RMD Sinhgad School of Engineering, Warje, Pune, India
3 HOD, Dept. of Computer Engineering, RMD Sinhgad School of Engineering, Warje, Pune, India

3. Literature Survey:

Sr. no: 1
Publication details: Performance Evaluation of Learning-based Frameworks for Devanagari Character Recognition. Saptarshi Kattyayan, P. Kanungo
Tech used: Preprocessing: CNN model, denoising, size and contrast retuning.
Dataset: The Devanagari Character Dataset includes characters from three distinct classes: vowels, consonants, and numerals. The numeral dataset consists of 10 classes, one for each digit from 0 to 9, with 2000 samples per digit. The vowel dataset consists of 12 classes with 2000 samples per class. The consonant dataset contains the most data; each consonant has 2000 samples.
Accuracy: 98.01%
Research gap identified: There is a need for an efficient character recognition and classification system.

Sr. no: 2
Publication details: Character Recognition System for Devanagari Script Using Machine Learning Approach. Shilpa Mangesh Pande, Bineet Kumar Jha
Tech used: Pre-processing: normalization, thinning, and noise removal. Classification: decision tree classifier, nearest centroid classifier, K-neighbors classifier, extra trees classifier.
Dataset: A database of scanned Devanagari script alphabets consisting of 43 thousand images of 32x32 pixels.
Accuracy: 78%
Research gap identified: For complex models, deep learning can be used.

Sr. no: 3
Publication details: Deformed character recognition using convolutional neural networks
Tech used: Pre-processing: data augmentation. Classification: tree classifier, SVM classifier.
Dataset: The printed data samples used for training are extracted from ancient Kannada documents, whereas the handwritten data samples are collected in varied environments.
Accuracy: 98.05%
Research gap identified: Only 52 classes are present, which does not fully represent the complexity of the recognition problem.

Sr. no: 4
Publication details: Handwritten Devanagari Character Recognition using Wavelet-Based Feature Extraction and Classification Scheme. Adwait Dixit, Ashwini Navghane, Yogesh Dandawate. IEEE India Conference (INDICON)
Tech used: Preprocessing: binarization. Feature extraction: wavelet transform. Classification: ANN, OCR.
Dataset: A dataset containing almost 2000 different characters, taken from different people, for each of 20 characters.
Accuracy: 70%
Research gap identified: Only Devanagari characters were considered. Further research can address other Indian regional languages.

Sr. no: 5
Publication details: Transfer Learning using CNN for Handwritten Devanagari Character Recognition
Tech used: Feature extraction: AlexNet, DenseNet, VGG, Inception, ConvNet.
Dataset: The Devanagari dataset has 46 classes with 2000 images each, for a total of 92000 images: 78200 for training and 13800 for testing.
Accuracy: 98%
Research gap identified: Possibility of overfitting and a limited set of pre-trained models.

Sr. no: 6
Publication details: Recognition of Handwritten Characters Based on Wavelet Transform and SVM Classifier. The International Arab Journal of Information Technology, Vol. 15, No. 6. Malika Ait Aider, Kamal Hammouche, and Djamel Gaceb
Tech used: Feature extraction: wavelet transform. Pre-processing: normalization. Classification: SVM.
Dataset: The MNIST dataset was used.
Accuracy: 98%
Research gap identified: The paper suggests future work on integrating a normalization operation as a preprocessing procedure, but this was not explored in the current study.

Sr. no: 7
Publication details: Handwritten Devanagari Character Recognition Using Layer-Wise Training of Deep Convolutional Neural Networks and Adaptive Gradient Methods
Tech used: Pre-processing involves using deep convolutional neural networks. Feature extraction is done through the box approach, dividing the character into 24 cells. A normalized vector distance is computed for each box, except the empty cells.
Dataset: ISIDCHAR: a database with 36,172 grayscale images of 47 Devanagari characters. V2DMDC: a database with 20,305 samples of handwritten Devanagari characters.
Accuracy: 98%
Research gap identified: Only Devanagari characters were used.

Sr. no: 8
Publication details: Marathi Handwritten Character Recognition Using SVM and KNN Classifier. Diptee Chikmurge; R. Shriram. Springer HIS Advances in Intelligent Systems and Computing, vol 1179; Published in 2020
Tech used: K-Nearest Neighbours and SVM.
Dataset: The dataset of Marathi handwritten characters available on Kaggle consists of 58,000 images covering 58 different types of Marathi characters.
Accuracy: 90% for KNN and 95% for SVM.
Research gap identified: The drawback of HOG is its slow computation speed, which can be addressed by employing an alternative technique.

Sr. no: 9
Publication details: Handwritten Marathi Compound Character Recognition. Amol A. Kadam, Dr. Milind V. Bhalerao, Mohit N. Tanurkar, IJERT
Tech used: The classifiers used were SVM and KNN.
Dataset: 3500 images of handwritten compound Marathi characters.
Accuracy: 96.49% for the SVM classifier and 95.67% for the KNN classifier.
Research gap identified: The number of features is low.

Sr. no: 10
Publication details: Handwritten Marathi character (vowel) recognition; Ajmire P.E. and Warkhede S.E.
Tech used: Gaussian distribution function.
Dataset: 120 images that show Marathi vowels in different styles.
Accuracy: 60%
Research gap identified: The accuracy is not satisfactory and requires further optimization.
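
Entries 4 and 6 of the literature survey rely on a wavelet transform for feature extraction. The sketch below shows what a single decomposition level can look like, assuming the Haar wavelet in its simple averaging and differencing form. The survey does not state which wavelet family or how many levels the cited papers used, so the choice of Haar, the function names (`haar_step`, `haar_2d`), and the use of the low-frequency sub-band as the feature vector are all assumptions for illustration.

```python
def haar_step(values):
    """One Haar step: pairwise averages and pairwise half-differences."""
    half = len(values) // 2
    averages = [(values[2 * i] + values[2 * i + 1]) / 2 for i in range(half)]
    details = [(values[2 * i] - values[2 * i + 1]) / 2 for i in range(half)]
    return averages, details


def transform_rows(matrix):
    """Apply one Haar step to every row; return (low, high) matrices."""
    low, high = [], []
    for row in matrix:
        averages, details = haar_step(row)
        low.append(averages)
        high.append(details)
    return low, high


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def haar_2d(image):
    """One-level 2D Haar decomposition of an image with even height and width.

    Returns four sub-bands named by the filtering applied along rows,
    then along columns: (low_low, low_high, high_low, high_high).
    """
    row_low, row_high = transform_rows(image)
    # Filtering along columns is filtering the rows of the transpose.
    low_low, low_high = (
        transpose(m) for m in transform_rows(transpose(row_low)))
    high_low, high_high = (
        transpose(m) for m in transform_rows(transpose(row_high)))
    return low_low, low_high, high_low, high_high


# Hypothetical 4x4 binary glyph made of two 2x2 ink blocks.
glyph = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
low_low, low_high, high_low, high_high = haar_2d(glyph)
# Flatten the coarse approximation into a feature vector for a classifier.
features = [value for row in low_low for value in row]
print(features)  # [1.0, 0.0, 0.0, 1.0]
```

The low-frequency sub-band is a half-resolution summary of the glyph, which is why it is a common starting point for a compact feature vector; the detail sub-bands capture edges and could be appended if finer stroke information is needed.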


4. Algorithmic Survey:

Sr no.: 1
Publication details: Performance Evaluation of Learning-based Frameworks for Devanagari Character Recognition. Saptarshi Kattyayan, P. Kanungo
Algorithm used: Gaussian Naive Bayes, Decision Tree, K Nearest Neighbor (KNN) and CNN
Accuracy: 98.01%
Language: Devanagari

Sr no.: 2
Publication details: Character Recognition System for Devanagari Script Using Machine Learning Approach. Shilpa Mangesh Pande, Bineet Kumar Jha
Algorithm used: Decision Tree classifier, Nearest Centroid classifier, K Nearest Neighbors classifier, Extra Trees classifiers and Random Forest classifier
Accuracy: 78%
Language: Various scripts like handwritten Devanagari, Arabic, English and Chinese

Sr no.: 3
Publication details: Deformed character recognition using convolutional neural networks
Language: Kannada dataset

Sr no.: 4
Publication details: Handwritten Devanagari Character Recognition using Wavelet-Based Feature Extraction and Classification Scheme. Adwait Dixit, Ashwini Navghane, Yogesh Dandawate. IEEE India Conference (INDICON)
Algorithm used: ANN with wavelet feature
Accuracy: 70%
Language: Handwritten Devanagari script

Sr no.: 5
Publication details: Transfer Learning using CNN for Handwritten Devanagari Character Recognition. Nagender Aneja and Sandhya Aneja, Universiti Brunei Darussalam, Brunei Darussalam
Algorithm used: DCNN
Accuracy: 98%
Language: Handwritten Devanagari

Sr no.: 6
Publication details: Recognition of Handwritten Characters Based on Wavelet Transform and SVM Classifier. The International Arab Journal of Information Technology, Vol. 15, No. 6. Malika Ait Aider, Kamal Hammouche, and Djamel Gaceb
Algorithm used: CWT (Continuous Wavelet Transform), SVM, K nearest neighbor
Accuracy: 98%
Language: Off-line handwritten characters

Sr no.: 7
Publication details: Handwritten Devanagari Character Recognition Using Layer-Wise Training of Deep Convolutional Neural Networks and Adaptive Gradient Methods. Mahesh Jangid, Sumit Srivastava
Algorithm used: DCNN and adaptive gradient methods
Accuracy: 98%
Language: Devanagari script

Sr no.: 8
Publication details: Marathi Handwritten Character Recognition Using SVM and KNN Classifier. Diptee Chikmurge; R. Shriram. Springer HIS Advances in Intelligent Systems and Computing, vol 1179; Published in 2020
Algorithm used: SVM (Support Vector Machine) and KNN (K Nearest Neighbours)
Accuracy: 90% for KNN and 95% for SVM
Language: Marathi

Sr no.: 9
Publication details: Handwritten Marathi Compound Character Recognition. Amol A. Kadam, Dr. Milind V. Bhalerao, Mohit N. Tanurkar, IJERT
Algorithm used: SVM and KNN classifiers
Accuracy: 96.49% for the SVM classifier and 95.67% for the KNN classifier
Language: Marathi

Sr no.: 10
Publication details: Handwritten Marathi character recognition using R-HOG Feature; Parshuram M. Kamble, Ravindra S. Hegadi
Algorithm used: SVM and FFANN
Accuracy: The TAR was calculated to be 97.15% for FFANN and 95.64% for SVM
Language: Marathi
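
K-nearest-neighbour classification appears in several entries of the algorithmic survey, usually alongside an accuracy figure. The sketch below shows the basic idea on toy data, assuming Euclidean distance, majority voting, and the usual definition of accuracy as the fraction of correctly classified test samples. The two-dimensional feature vectors, the class names `ka` and `kha`, and the function names are hypothetical; none of them come from the surveyed papers, which operate on image-derived features.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy(predicted, expected):
    """Fraction of predictions that match the expected labels."""
    correct = sum(1 for p, e in zip(predicted, expected) if p == e)
    return correct / len(expected)


# Toy feature vectors standing in for features extracted from character images.
train_features = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1),   # hypothetical class "ka"
    (1.0, 1.0), (0.9, 1.1), (1.1, 0.9),   # hypothetical class "kha"
]
train_labels = ["ka", "ka", "ka", "kha", "kha", "kha"]

test_features = [(0.15, 0.15), (0.95, 1.0)]
test_labels = ["ka", "kha"]

predictions = [
    knn_predict(train_features, train_labels, query, k=3)
    for query in test_features
]
print(predictions)
print(accuracy(predictions, test_labels))
```

An odd value of k avoids tied votes in a two-class setting; with many character classes, as in the datasets listed above, ties remain possible and would need an explicit tie-breaking rule.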

5. Live Survey:

The 2022 Fourth International Conference on Emerging Research in Electronics, Computer Science and Technology (ICERECT) by IEEE includes experimentation on broken and joint handwritten character recognition using deep learning. The experiment involves testing various CNN architectures with different depths and structures. The results are then compared with state-of-the-art methods such as VGG16, VGG19, InceptionV3, MobileNet, ResNet50, and Xception using transfer learning of pre-trained weights.

6. Conclusion:

In conclusion, there has been significant research in the field of broken and joint handwritten character recognition using deep learning. Several studies have utilized various deep learning architectures, such as convolutional neural networks (CNNs), to improve the accuracy of recognition models. Techniques such as feature extraction, pre-processing, and transfer learning have also been employed to improve model performance. The results of these studies indicate that deep learning-based approaches can achieve high accuracy rates in recognizing broken and joint characters in handwritten scripts. However, more research is still needed to optimize the performance of these models and make them more efficient and reliable for real-world applications.

7. References:

1. S. Kattyayan, T. Kar and P. Kanungo, "Performance Evaluation of Learning Based Frameworks for Devanagari Character Recognition," 2020 IEEE 7th Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON), 2020.

2. P. Gupta, S. Deshmukh, S. Pandey, K. Tonge, V. Urkunde and S. Kide, "Convolutional Neural Network based Handwritten Devanagari Character Recognition," 2020 International Conference on Smart Technologies in Computing, Electrical and Electronics (ICSTCEE), 2020, pp. 322-326.

3. N. Aneja and S. Aneja, "Transfer Learning using CNN for Handwritten Devanagari Character Recognition," 2019 1st International Conference on Advances in Information Technology (ICAIT), 2019.

4. R. Karnik, "Recognition of Handwritten Devanagari Characters," Fifth International Conference on Document Analysis and Recognition (ICDAR '99).

5. N. Shobha Rani, Nagabasavanna Chandan, Sajan Jain and Hena Kiran, "Deformed character recognition using convolutional neural networks," International Journal of Engineering & Technology, 7, 1599, 2018. doi: 10.14419/ijet.v7i3.14053.

6. S. R. Chowdhury and S. M. A. Bhuiyan, "Recognition of broken and overlapping handwritten Bangla digits using convolutional neural network," International Journal of Advanced Computer Science and Applications (IJACSA), 2019.

7. M. Khan and A. Aziz, "Overlapping handwritten character recognition using CNN with geometric normalization," International Journal of Advanced Computer Science and Applications (IJACSA), 2018.

8. M. R. Rahman, M. A. Rahman and M. Z. Rahman, "Broken Bangla handwritten character recognition using CNN with data augmentation," International Conference on Intelligent Systems Design and Applications (ISDA), 2020.

9. B. H. N. R. Ramasamy, M. S. T. Khan and S. S. Ali, "Recognition of overlapping and broken handwritten digits using deep learning," International Conference on Signal Processing and Intelligent Systems (ICSPIS), 2019.

10. A. B. A. Azhar, M. N. M. Nasir and N. A. M. Isa, "A review on overlapping and touching character recognition using deep learning," Journal of Telecommunication, Electronic and Computer Engineering (JTEC), 2020.

11. A. Sharma and M. Vatsa, "A comprehensive survey on overlapping and broken character recognition," Journal of Pattern Recognition Research, 2021.

12. L. Mai, X. Chen and D. D. Feng, "Recognizing Overlapping Handwritten Digits Using Convolutional Neural Networks," IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), 2018.
