Survey On Broken and Joint Devanagari Handwritten Characters Recognition Using Deep Learning
Abstract - The recognition of handwritten Devanagari characters presents a significant challenge due to the script's complexity and variability. The complexity is further compounded by the variability of broken and joint characters that are written differently by different individuals. In recent years, deep learning models have emerged as a powerful solution for character recognition, achieving remarkable performance in various applications. This survey paper presents an in-depth analysis of the deep learning-based approaches used for recognizing handwritten Devanagari broken and joint characters. We extensively review the architectures and techniques applied in deep learning models, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and hybrid models, to identify these characters. We also discuss the datasets utilized for training and testing these models and the performance metrics used for evaluating their performance. Additionally, we conduct a comparative analysis of the different approaches, highlighting their respective strengths and limitations and proposing possible directions for future research. Our survey is intended to serve as a valuable resource for researchers and practitioners engaged in the area of handwritten Devanagari character recognition using deep learning.
Key Words: Feature extraction, Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), TensorFlow, ImageDataGenerator, Text recognition, Wavelets.
1. INTRODUCTION
Handwritten character recognition is vital in computer vision and pattern recognition, with practical applications such as optical character recognition, automatic form processing, and intelligent handwriting recognition systems. Devanagari is a prominent script used in several languages such as Hindi, Marathi, and Nepali, and recognizing handwritten Devanagari characters is a challenging task due to the complexity and variability of the script. The recognition of broken and joint characters in Devanagari further adds to the complexity, as these characters are often written differently by different individuals, making it difficult to develop a robust recognition system.
This survey paper aims to provide a comprehensive overview of the existing approaches for broken and joint handwritten Devanagari character recognition, with a particular focus on the involvement of wavelet transform and recent deep learning-based technologies. The paper will also discuss the advantages and limitations of each approach and highlight the current techniques. Additionally, publicly available datasets such as the Devanagari Handwritten Character Dataset (DHCD) and Indian Language Handwritten Character Dataset (ILHCD) will be reviewed, which have been widely used for training and evaluation of various approaches. This survey paper will provide a useful resource for researchers and practitioners working in the field of handwritten Devanagari character recognition, with the aim of improving the accuracy and efficiency of character recognition systems.
2. Main Terminologies:
2.1 Devanagari script: A script used for writing several languages, including Hindi, Marathi, and Nepali.
2.2 Broken characters: Devanagari characters that are separated or disjointed, which require additional techniques for recognition.
2.3 Joint characters: Devanagari characters that are connected to other characters, often written in a cursive manner, which require additional techniques for recognition.
2.4 Handwritten character recognition: The process of identifying and transcribing handwritten characters from an image or document.
2.5 Devanagari Handwritten Character Dataset (DHCD): A publicly available dataset containing handwritten Devanagari characters.
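The distinction between broken and joint characters can be made concrete by counting connected groups of ink pixels in a binarized glyph. The sketch below is only an illustration under stated assumptions, not a method taken from the surveyed papers: the function name `count_components`, the use of 8-connectivity, and the toy glyphs are all hypothetical. Note also that some Devanagari characters legitimately consist of more than one component, so the count is a cue rather than a decision rule.

```python
from collections import deque


def count_components(image):
    """Count 8-connected groups of foreground (1) pixels in a binary image."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # New unvisited ink pixel: flood-fill everything connected to it.
            components += 1
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return components


# Toy 5x7 glyphs (hypothetical data): one intact stroke, one with gaps.
intact = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
broken = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

print(count_components(intact), count_components(broken))
```

A glyph that splits into more components than expected is a candidate broken character, while a single component spanning what should be two characters is a candidate joint character.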
1 Performance Evaluation of Learning-based Frameworks for Devanagari Character Recognition
Saptarshi Kattyayan, P. Kanungo
2 Character Recognition System for Devanagari Script Using Machine Learning Approach
Shilpa Mangesh Pande, Bineet Kumar Jha
Preprocessing: CNN Model, Denoising, Size, and Contrast Retuning
The Devanagari Character Dataset includes characters from three distinct classes: vowels, consonants, and numerals. The numeral dataset consists of 10 classes, ranging from 0 to 9, with 2000 samples for each digit. The vowel dataset consists of 12 classes, each containing 2000 samples. The consonant dataset contains the most data; each consonant has 2000 samples.
98.01% There is a need for an efficient character recognition and classification system.
3 Deformed character recognition using convolutional neural networks
Pre-Processing: Normalization, Thinning, and noise removal.
Classification: Decision tree classifier, Nearest Centroid classifier, K Neighbors Classifier, Extra tree Classifier
Pre-Processing: Data Augmentation
Classification: Tree Classifier, SVM Classifier
There is a scanned database of Devanagari script alphabets consisting of 43 thousand images of 32x32 pixels.
78% For complex models, deep learning can be used.
For training, the printed data samples are extracted from ancient Kannada documents, whereas the handwritten data samples are collected in varied environments.
98.05% Only 52 classes are present, which does not fully represent the complexity of the recognition problem.
4 Handwritten Devanagari Character Recognition using Wavelet-Based Feature Extraction and Classification Scheme, Adwait Dixit, Ashwini Navghane, Yogesh Dandawate, IEEE India Conference (INDICON)
5 Transfer Learning using CNN for Handwritten Devanagari Character Recognition
Preprocessing: Binarization
Feature Extraction: Wavelet Transform
Classification: ANN, OCR
Feature Extraction: AlexNet, DenseNet, VGG, Inception, ConvNet
There is a dataset that contains almost 2000 different characters, taken from different people, for each of 20 characters.
The Devanagari dataset has 46 classes. Each class has 2000 images. The dataset consists of 92000 images: 78200 images for training and 13800 for testing.
70% Only Devanagari characters were considered. Further research can address other Indian regional languages.
6 Recognition of Handwritten Characters Based on Wavelet Transform and SVM Classifier, The International Arab Journal of Information Technology, Vol. 15, No. 6, Malika Ait Aider, Kamal Hammouche, and Djamel Gaceb
Feature extraction: Wavelet transform
Pre-Processing: Normalization
Classification: SVM
The MNIST dataset was used.
98% Possibility of overfitting and a limited set of pre-trained models.
7 Handwritten Devanagari Character Recognition Using Layer-Wise Training of Deep Convolutional Neural Networks and Adaptive Gradient Methods
Pre-processing involves using deep convolutional neural networks. Feature extraction is done through the box approach, dividing the character into 24 cells. A normalized vector distance is computed for each box, except the empty cells.
ISIDCHAR: a database with 36,172 grayscale images of 47 Devanagari characters. V2DMDC: a database with 20,305 samples of handwritten Devanagari characters.
98%
The paper suggests future work on integrating a normalization operation as a preprocessing procedure, but this was not explored in the current study.
98% Only Devanagari characters were used.
8 Marathi Handwritten Character Recognition Using SVM and KNN Classifier. Diptee Chikmurge; R. Shriram. Springer HIS Advances in Intelligent Systems and Computing, vol 1179; Published
9 Handwritten Marathi Compound Character Recognition. Amol A. Kadam, Dr. Milind V. Bhalerao, Mohit N. Tanurkar, IJERT
10 Handwritten Marathi character (vowel) recognition; Ajmire P.E. and Warkhede S.E.
K-Nearest Neighbours and SVM
The dataset of Marathi handwritten characters available on Kaggle consists of 58,000 images of characters, covering a total of 58 different types of Marathi characters.
The accuracy achieved was 90% for KNN and 95% for SVM.
The drawback of HOG is its slow computation speed, which can be addressed by employing an alternative technology.
Classifiers used were SVM and KNN.
There are 3500 images of compound Marathi characters written by hand.
The SVM classifier achieved an accuracy of 96.49%, while the KNN classifier achieved an accuracy of 95.67%.
The number of features is low.
Gaussian Distribution Function
There are 120 images that show Marathi vowels in different styles.
60% The accuracy is not satisfactory and requires further optimization.
10 Handwritten Marathi character recognition using R-HOG Feature; Parshuram M Kamble, Ravindra S. Hegadi
SVM and FFANN. The TAR was calculated to be 97.15% for FFANN and 95.64% for SVM, respectively.
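Several of the surveyed works use a wavelet transform for feature extraction. As a rough illustration of what such a step produces, the sketch below applies one level of a 2D Haar transform (the simplest wavelet) and keeps the low-frequency quadrant as a feature vector. This is an assumption-laden sketch: the surveyed papers do not state here which wavelet family or decomposition level they use, the function names are hypothetical, and the averaging normalization is chosen for readability. The input is assumed to have even height and width.

```python
def haar_1d(values):
    """One Haar level: pairwise averages, then pairwise half-differences."""
    averages = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    details = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return averages + details


def haar_2d(image):
    """One-level 2D Haar transform: transform every row, then every column."""
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    # Transpose back so the result has the same orientation as the input.
    return [list(row) for row in zip(*cols)]


def approximation_features(image):
    """Flatten the top-left (low-low) quadrant into a feature vector."""
    transformed = haar_2d(image)
    half_rows, half_cols = len(image) // 2, len(image[0]) // 2
    return [transformed[r][c] for r in range(half_rows) for c in range(half_cols)]


# Toy 4x4 binary glyph (hypothetical data).
glyph = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
features = approximation_features(glyph)
print(features)
```

The resulting vector is a half-resolution summary of the glyph that could be passed to a classifier such as an SVM or ANN; the detail quadrants, discarded here, carry edge information and could be appended if needed.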
Work presented at the 2022 Fourth International Conference on Emerging Research in Electronics, Computer Science and Technology (ICERECT) by IEEE includes experimentation on broken and joint handwritten character recognition using deep learning. The experiment involves testing various CNN architectures with different depths and structures. The results are then compared with state-of-the-art methods like VGG16, VGG19, InceptionV3, MobileNet, ResNet50, and Xception using transfer learning of pre-trained weights.
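One practical constraint when varying CNN depth on small character images is how quickly the feature maps shrink. The sketch below traces the spatial size through repeated blocks of one convolution followed by one pooling layer. It is only an illustration: the 32x32 input matches the image size reported for one of the surveyed datasets, but the 3x3 unpadded convolution, the 2x2 pooling, and the function name are assumptions, not the configuration used in the cited experiments.

```python
def feature_map_sizes(input_size, num_blocks, kernel=3, pool=2):
    """Trace spatial size through blocks of (unpadded conv, then pooling).

    Stops early once a block can no longer be applied.
    """
    sizes = [input_size]
    size = input_size
    for _ in range(num_blocks):
        size = size - kernel + 1   # unpadded convolution with stride 1
        if size < pool:
            break                  # map too small to pool again
        size = size // pool        # non-overlapping pooling
        sizes.append(size)
    return sizes


print(feature_map_sizes(32, 2))
print(feature_map_sizes(32, 5))
```

Under these assumptions a 32x32 input supports only three such blocks, which is one reason very deep pre-trained networks are usually paired with upscaled inputs or padded convolutions when applied to small glyph images.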
6. Conclusion:
In conclusion, there has been significant research in the field of broken and joint handwritten character recognition using deep learning. Several studies have utilized various deep learning architectures, such as convolutional neural networks (CNNs), to improve the accuracy of recognition models. Techniques such as feature extraction, pre-processing, and transfer learning have also been employed to improve model performance. The results of these studies indicate that deep learning-based approaches can achieve high accuracy rates in recognizing broken and joint characters in handwritten scripts. However, more research is still needed to optimize the performance of these models and make them more efficient and reliable for real-world applications.
7. References:
1. S. Kattyayan, T. Kar and P. Kanungo, "Performance Evaluation of Learning Based Frameworks for Devanagari Character Recognition," 2020 IEEE 7th Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON), 2020.
2. P. Gupta, S. Deshmukh, S. Pandey, K. Tonge, V. Urkunde and S. Kide, "Convolutional Neural Network based Handwritten Devanagari Character Recognition," 2020 International Conference on Smart Technologies in Computing, Electrical and Electronics (ICSTCEE), 2020, pp. 322-326.
3. N. Aneja and S. Aneja, "Transfer Learning using CNN for Handwritten Devanagari Character Recognition," 2019 1st International Conference on Advances in Information Technology (ICAIT), 2019.
4. R. Karnik, "Recognition of Handwritten Devanagari Characters," Fifth International Conference on Document Analysis and Recognition (ICDAR '99).
5. N. S. Rani, N. Chandan, S. Jain and H. Kiran, "Deformed character recognition using convolutional neural networks," International Journal of Engineering & Technology, vol. 7, p. 1599, 2018, doi: 10.14419/ijet.v7i3.14053.
6. S. R. Chowdhury and S. M. A. Bhuiyan, "Recognition of broken and overlapping handwritten Bangla digits using convolutional neural network," International Journal of Advanced Computer Science and Applications (IJACSA), 2019.
7. M. Khan and A. Aziz, "Overlapping handwritten character recognition using CNN with geometric normalization," International Journal of Advanced Computer Science and Applications (IJACSA), 2018.
8. M. R. Rahman, M. A. Rahman and M. Z. Rahman, "Broken Bangla handwritten character recognition using CNN with data augmentation," International Conference on Intelligent Systems Design and Applications (ISDA), 2020.
9. B. H. N. R. Ramasamy, M. S. T. Khan and S. S. Ali, "Recognition of overlapping and broken handwritten digits using deep learning," International Conference on Signal Processing and Intelligent Systems (ICSPIS), 2019.
10. A. B. A. Azhar, M. N. M. Nasir and N. A. M. Isa, "A review on overlapping and touching character recognition using deep learning," Journal of Telecommunication, Electronic and Computer Engineering (JTEC), 2020.
11. A. Sharma and M. Vatsa, "A comprehensive survey on overlapping and broken character recognition," Journal of Pattern Recognition Research, 2021.
12. L. Mai, X. Chen and D. D. Feng, "Recognizing Overlapping Handwritten Digits Using Convolutional Neural Networks," IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), 2018.