A Survey on Ensemble Learning Techniques for Medical Image Classification

International Research Journal of Engineering and Technology (IRJET) e-ISSN:2395-0056

Volume: 12 Issue: 08 | Aug 2025 www.irjet.net p-ISSN:2395-0072

1M.Tech, Computer and Communication, ECE, JNTU Kakinada, Andhra Pradesh, India

2Assistant Professor, Dept. of ECE, JNTU Kakinada, Andhra Pradesh, India

Abstract: In recent years, ensemble learning has emerged as a powerful methodology in medical image analysis, particularly in the classification of chest X-ray images for diagnostic purposes. Ensemble methods such as bagging, boosting, and stacking aim to improve the reliability, accuracy, and generalization of classification models by combining the outputs of multiple learners. These techniques address common challenges in medical imaging, including data imbalance, overfitting, and variability in clinical image quality. This survey presents a comprehensive study of ensemble learning methods applied to chest X-ray classification tasks, highlighting architectural variations, model integration strategies, and evaluation metrics. Special focus is given to multi-class classification scenarios involving diseases such as COVID-19, pneumonia, and tuberculosis. The paper reviews key developments in ensemble-based approaches using convolutional neural networks, discusses trade-offs in accuracy, complexity, and interpretability, and examines their applicability in clinical workflows. Finally, it outlines the practical challenges, potential solutions, and future research directions for deploying ensemble models in real-world medical diagnostic systems.

Keywords: Boosting, Bagging, Stacking, Medical Image Analysis, Ensemble Learning.

1. Introduction: Chest radiography (commonly referred to as chest X-rays or CXRs) continues to serve as one of the most accessible and widely used diagnostic imaging techniques in clinical practice. It plays a vital role in the detection and monitoring of various pulmonary diseases, including pneumonia, tuberculosis, lung cancer, and more recently, COVID-19. Despite its clinical importance, interpreting chest X-rays remains a challenging and subjective task, often influenced by inter-observer variability, subtle abnormalities, and the need for specialized expertise.

With the advancement of artificial intelligence (AI), particularly machine learning (ML) and deep learning (DL), automated approaches have been increasingly explored to support radiologists in improving diagnostic accuracy and consistency. Convolutional Neural Networks (CNNs) have demonstrated significant success in medical image analysis, particularly for classification tasks involving CXRs. Popular architectures such as AlexNet, VGGNet, ResNet, and Inception have been adapted and fine-tuned using large annotated datasets such as CheXpert, ChestX-ray14, and the COVID-19 Radiography Database.

However, despite their effectiveness, individual CNNs often face limitations related to generalization, overfitting, and sensitivity to class imbalance, noise, or imaging artifacts. These issues can reduce their reliability in real-world clinical scenarios. In response to these challenges, ensemble learning has emerged as a promising strategy to enhance the robustness and accuracy of medical image classifiers. By aggregating the predictions of multiple models, ensemble methods help reduce variance and bias, improve generalization, and mitigate the risks associated with single-model predictions.

This survey provides a detailed analysis of ensemble learning techniques as applied to chest X-ray image classification. It focuses on three key ensemble methods (bagging, boosting, and stacking) and reviews their theoretical foundations, practical implementations, and comparative performance in medical imaging tasks. While each technique offers distinct advantages, their shared goal is to improve classification outcomes in critical healthcare settings.

The paper further explores how ensemble strategies have been used to address specific diagnostic challenges in multi-class classification problems and reviews studies that have applied these techniques for disease detection, such as COVID-19, pneumonia, and tuberculosis. Methodological differences, dataset characteristics, and performance metrics are discussed to provide a comparative overview. In addition, the survey highlights practical considerations including computational efficiency, model interpretability, integration into clinical workflows, and real-world deployment challenges.

By synthesizing existing research and evaluating the trade-offs associated with various ensemble strategies, this survey aims to provide a comprehensive understanding of ensemble learning in the context of chest X-ray analysis. It offers insights for researchers, developers, and clinicians seeking to design more accurate, reliable, and clinically applicable AI-driven diagnostic tools.

2. Literature Survey: Jhansi et al. discuss multi-focus image fusion, which combines images captured at different focal depths into one clear, informative image and is especially useful in medical imaging. Shen et al. proposed a fusion framework using DT-CWT, the Curvelet Transform, and NSCT for effective feature extraction. Each transform enhances specific aspects such as directionality, edge detail, and redundancy. The approach improves visual clarity and is validated through comparative statistical evaluation [1].

Saha et al. (2020) proposed a multi-model ensemble using DenseNet201, InceptionV3, and Xception for COVID-19 detection from chest X-rays. Models were trained with transfer learning and combined via soft voting. The ensemble outperformed individual models in accuracy, precision, recall, and F1-score. Results showed improved generalization and robustness for reliable COVID-19 diagnosis [2].

Tanjeja et al. (2025) introduced a bagging-based ensemble using VGG16, ResNet50, DenseNet121, InceptionV3, and MobileNetV2 for multi-class lung disease classification. Each model was trained independently, and predictions were averaged to boost accuracy and stability. The ensemble outperformed individual models across key metrics and reduced overfitting. The approach improved generalization and reliability in medical image analysis [3].

Rakakand and Suthheman (2025) proposed a boosted ensemble of CNNs (VGG16, ResNet50, InceptionV3) using AdaBoost and Gradient Boosting for improved classification on datasets like CIFAR-10, MNIST, and chest X-rays. Boosting emphasized hard-to-classify samples, with final predictions made via weighted voting. The ensemble outperformed single models in accuracy, precision, recall, and F1-score. Results showed strong generalization and suitability for real-world medical imaging tasks [4].

Jhansi et al. proposed a two-stage multimodal medical image fusion framework combining MRI and CT data using DWT and NSCT. PCA is applied in the DWT domain to reduce redundancy, while maximum fusion in the NSCT domain enhances contrast and structure. The method preserves key diagnostic features from both modalities. Quantitative results show improved clarity and clinical relevance [5].

Deb and Jha (2020) proposed an ensemble of VGG16, ResNet50, and InceptionV3 for COVID-19 detection from chest X-rays using transfer learning. Predictions were combined via majority voting to boost accuracy and reduce false negatives. The ensemble outperformed individual models in sensitivity, specificity, and overall performance. Results highlight its reliability for COVID-19 diagnosis [6].

Zhou et al. (2020) proposed an ensemble of DenseNet121, ResNet50, and InceptionV3 for COVID-19 detection from chest X-rays using transfer learning. Outputs were combined through weighted averaging to leverage each model's strengths. The ensemble achieved higher sensitivity and accuracy than individual models. Results demonstrated robustness and clinical applicability for COVID-19 screening [7].

Hussain et al. developed a deep ensemble system using ResNet50, InceptionV3, and DenseNet201 for diagnosing COVID-19 and pneumonia from chest X-rays. Models were fine-tuned individually and combined via weighted averaging. The ensemble outperformed individual models in accuracy, precision, recall, and F1-score. It showed strong generalization and improved diagnostic reliability [8].
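The studies above combine base-model outputs by soft voting [2], prediction averaging [3], majority voting [6], or weighted averaging [7], [8]. As a minimal sketch of how these rules differ, the Python below applies three of them to made-up softmax outputs for one image. The class order, the probability values, and the weights are illustrative assumptions and are not taken from any of the cited studies.

```python
from collections import Counter

CLASSES = ["COVID-19", "pneumonia", "tuberculosis"]  # hypothetical label order


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def soft_vote(prob_lists):
    """Soft voting: unweighted mean of the models' class-probability vectors."""
    n_models = len(prob_lists)
    return [sum(col) / n_models for col in zip(*prob_lists)]


def weighted_average(prob_lists, weights):
    """Weighted averaging: weights are normalized so they sum to 1."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, col)) / total
        for col in zip(*prob_lists)
    ]


def majority_vote(prob_lists):
    """Majority (hard) voting: each model votes for its top class.

    Ties are broken in favor of the lowest class index.
    """
    votes = Counter(argmax(p) for p in prob_lists)
    top = max(votes.values())
    return min(c for c, n in votes.items() if n == top)


# Made-up softmax outputs of three hypothetical CNNs for a single image.
model_probs = [
    [0.50, 0.45, 0.05],
    [0.40, 0.55, 0.05],
    [0.10, 0.30, 0.60],
]

soft = soft_vote(model_probs)
weighted = weighted_average(model_probs, weights=[0.5, 0.3, 0.2])
hard = majority_vote(model_probs)

print("soft voting      :", CLASSES[argmax(soft)])
print("weighted average :", CLASSES[argmax(weighted)])
print("majority voting  :", CLASSES[hard])
```

For these made-up numbers, soft voting and weighted averaging both select pneumonia, while majority voting sees a three-way tie and falls back to the first class. This illustrates how hard voting discards the confidence information that the probability-based rules retain.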

3. Background on Machine Learning and Deep Learning in Medical Imaging:

The use of machine learning (ML) and deep learning (DL) in medical imaging has revolutionized diagnostics by enabling accurate and automated interpretation of scans, particularly chest X-rays (CXR), which are crucial for detecting respiratory and cardiac conditions. Traditional CXR analysis often suffers from inconsistencies due to human error. Early ML methods using handcrafted features and basic classifiers lacked generalizability across datasets. Deep learning, especially CNNs like AlexNet, ResNet, and Inception, now dominates due to its ability to learn features directly from image data. These models have achieved high performance in detecting diseases such as pneumonia, TB, cardiomegaly, and COVID-19. CheXNet, built on DenseNet121, reported radiologist-level accuracy, while DeTraC demonstrated strong COVID-19 classification on imbalanced datasets.

Standalone CNNs can suffer from overfitting, especially with limited or imbalanced medical imaging data, and often struggle to generalize due to variability in anatomy and imaging conditions. They are also sensitive to hyperparameters and training settings. Ensemble learning addresses these issues by combining multiple models to enhance robustness, accuracy, and consistency. In medical imaging, ensembles help manage noise and improve disease classification. They are particularly effective in multi-class tasks like distinguishing COVID-19, pneumonia, and TB. Integrating CNNs with ensemble methods significantly boosts diagnostic performance.
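The robustness gain from combining models can be made concrete with a standard textbook identity. It is stated here under a simplifying assumption that the surveyed works do not make: the outputs f_m(x) of the M base models each have variance sigma^2 and share a common pairwise correlation rho. The symbols M, f_m, sigma, and rho are introduced only for this illustration.

```latex
\operatorname{Var}\!\left(\frac{1}{M}\sum_{m=1}^{M} f_m(x)\right)
  = \rho\,\sigma^{2} + \frac{1-\rho}{M}\,\sigma^{2}
```

As M grows, the second term shrinks toward zero while the first does not, so averaging helps most when the base models are weakly correlated. Highly similar models (rho close to 1) gain little from being averaged.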

4. Detailed Survey of Ensemble Techniques in Medical Imaging:

Ensemble learning methods fall into three primary categories: bagging, boosting, and stacking. Each of these techniques has unique characteristics, benefits, and limitations. In this section, we elaborate on their theoretical foundations and practical implementations, especially in the context of medical image classification.

4.1 Bagging:

Bagging is an ensemble technique that reduces variance by training multiple models on different bootstrapped subsets of data. Predictions are aggregated using majority voting or averaging. In medical imaging, it stabilizes models like decision trees and lightweight CNNs. Random Forests have shown moderate success in chest X-ray-based tuberculosis detection. CNN bagging improves robustness by training models on varied data partitions. It helps prevent overfitting and increases generalization. However, its effectiveness is limited if base models are too similar or data lacks diversity. Bagging also does not prioritize hard-to-classify cases.
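The bootstrap-and-aggregate procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a nearest-centroid classifier stands in for the decision trees or lightweight CNNs mentioned in the text, the two-dimensional feature vectors are made up, and all function names (`bootstrap_sample`, `fit_centroids`, `bagging_fit`, `bagging_predict`) are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement (one bootstrap replicate)."""
    return [rng.choice(data) for _ in data]


def fit_centroids(sample):
    """Toy base learner: the per-class mean of the feature vectors."""
    sums, counts = {}, Counter()
    for features, label in sample:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] += 1
    return {lbl: [s / counts[lbl] for s in acc] for lbl, acc in sums.items()}


def predict_centroid(centroids, features):
    """Predict the class whose centroid is nearest (squared Euclidean)."""
    def dist(lbl):
        return sum((a - b) ** 2 for a, b in zip(centroids[lbl], features))
    return min(sorted(centroids), key=dist)


def bagging_fit(data, n_models=15, seed=0):
    """Train each base model on its own bootstrap replicate of the data."""
    rng = random.Random(seed)
    return [fit_centroids(bootstrap_sample(data, rng)) for _ in range(n_models)]


def bagging_predict(models, features):
    """Aggregate the base models' predictions by majority vote."""
    votes = Counter(predict_centroid(m, features) for m in models)
    return votes.most_common(1)[0][0]


# Made-up 2-D feature vectors, e.g. two features extracted from an image.
train = [
    ([0.1, 0.2], "normal"), ([0.4, -0.1], "normal"),
    ([-0.3, 0.3], "normal"), ([0.0, -0.4], "normal"),
    ([2.9, 3.1], "pneumonia"), ([3.2, 2.7], "pneumonia"),
    ([2.6, 3.3], "pneumonia"), ([3.4, 3.0], "pneumonia"),
]

models = bagging_fit(train)
print(bagging_predict(models, [0.2, 0.1]))
print(bagging_predict(models, [3.0, 2.8]))
```

Because every replicate is drawn independently, the base models can be trained in parallel, which is the property that makes bagging comparatively fast to train.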

4.2 Boosting:

Boosting is a sequential ensemble method where each model corrects the errors of its predecessor, focusing more on misclassified samples. Popular algorithms include AdaBoost, GBM, and XGBoost. In medical imaging, boosting is often combined with CNN-extracted features to enhance classification accuracy. It handles imbalanced data well and captures complex patterns. However, boosting can overfit noisy data and requires careful tuning. Its sequential process increases training time and limits scalability. In time-sensitive medical applications, it may be less practical than parallel methods like bagging or stacking.
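To illustrate the sequential reweighting described above, the following is a minimal sketch of AdaBoost with decision stumps as weak learners. It assumes a binary task with labels +1 and -1 and a single made-up scalar feature, which could stand for one CNN-extracted feature. The data and the function names (`best_stump`, `adaboost_fit`, `adaboost_predict`) are hypothetical, and the sketch is not the method of any cited study.

```python
import math


def stump_predict(x, threshold, polarity):
    """Decision stump: output `polarity` when x >= threshold, else -polarity."""
    return polarity if x >= threshold else -polarity


def best_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best stump."""
    best = None
    for thr in sorted(set(xs)):
        for pol in (1, -1):
            err = sum(
                w for x, y, w in zip(xs, ys, weights)
                if stump_predict(x, thr, pol) != y
            )
            if best is None or err < best[0]:
                best = (err, thr, pol)
    return best


def adaboost_fit(xs, ys, n_rounds=3):
    n = len(xs)
    weights = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(n_rounds):
        err, thr, pol = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # this stump's vote strength
        ensemble.append((alpha, thr, pol))
        # Raise the weights of misclassified samples and lower the rest.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, thr, pol))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(ensemble, x):
    """Weighted vote of all stumps."""
    score = sum(a * stump_predict(x, thr, pol) for a, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# Made-up data that no single threshold can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost_fit(xs, ys)
print([adaboost_predict(ensemble, x) for x in xs])
```

On this toy data the best single stump misclassifies three of the ten points, whereas the three-round ensemble fits all ten. Each round depends on the weights left by the previous one, which is why training cannot be parallelized across rounds.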

4.3 Stacking:

Stacking is an advanced ensemble method where multiple base learners' predictions are combined using a meta-learner trained to optimize final outputs. Unlike bagging or boosting, stacking allows diverse models to contribute differently based on their strengths. In chest X-ray classification, stacking CNNs improves accuracy by leveraging complementary features. It is particularly effective for multi-class tasks with subtle differences, such as differentiating COVID-19 from pneumonia or tuberculosis. Stacking is also flexible, supporting heterogeneous base learners and adaptable meta-learners for improved generalization.

Nonetheless, stacking introduces complexity. It requires careful cross-validation to prevent overfitting at both base and meta levels. Also, training and inference time are typically longer due to multiple model layers. However, in clinical contexts where diagnostic accuracy is paramount, the trade-off is often justified.
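The cross-validation requirement can be sketched as follows: the meta-learner is fitted only on out-of-fold predictions, so it never sees a base model's output for a sample that model was trained on. Everything in this sketch is an assumption made for illustration. Two toy single-feature base learners stand in for CNNs, the data are made up, the meta-learner is deliberately simple (one blend weight chosen by grid search on out-of-fold log-loss, rather than a trained classifier), and all names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_fitter(feature, steepness=8.0):
    """Build a fit function for a toy base learner that uses one feature."""
    def fit(rows, labels):
        pos = [r[feature] for r, y in zip(rows, labels) if y == 1]
        neg = [r[feature] for r, y in zip(rows, labels) if y == 0]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        mid = (mean_pos + mean_neg) / 2
        sign = 1.0 if mean_pos >= mean_neg else -1.0
        # The fitted model returns a pseudo-probability of class 1.
        return lambda row: sigmoid(steepness * sign * (row[feature] - mid))
    return fit


def out_of_fold(rows, labels, fitters, k=4):
    """Score every row with base models that were trained without it."""
    n = len(rows)
    meta = [None] * n
    for fold in range(k):
        kept = [i for i in range(n) if i % k != fold]
        models = [
            f([rows[i] for i in kept], [labels[i] for i in kept])
            for f in fitters
        ]
        for i in range(fold, n, k):  # the held-out rows of this fold
            meta[i] = [m(rows[i]) for m in models]
    return meta


def log_loss(probs, labels):
    eps = 1e-12
    return -sum(
        math.log(max(p if y == 1 else 1 - p, eps))
        for p, y in zip(probs, labels)
    ) / len(labels)


def fit_blend(meta, labels):
    """Meta-learner: weight on model 0 that minimizes out-of-fold log-loss."""
    def loss(alpha):
        return log_loss([alpha * a + (1 - alpha) * b for a, b in meta], labels)
    return min((i / 10 for i in range(11)), key=loss)


# Made-up data: two features, each only partly informative on its own.
rows = [[0.2, 0.3], [0.8, 0.6], [0.1, 0.55], [0.9, 0.9],
        [0.45, 0.1], [0.55, 0.8], [0.3, 0.2], [0.7, 0.45]]
labels = [0, 1, 0, 1, 0, 1, 0, 1]

fitters = [make_fitter(0), make_fitter(1)]
meta = out_of_fold(rows, labels, fitters)          # level-1 training data
alpha = fit_blend(meta, labels)                    # fit the meta-learner
final_models = [f(rows, labels) for f in fitters]  # refit bases on all data


def stacked_predict(row):
    a, b = (m(row) for m in final_models)
    return 1 if alpha * a + (1 - alpha) * b >= 0.5 else 0


print("blend weight on model 0:", alpha)
print(stacked_predict([0.9, 0.9]), stacked_predict([0.1, 0.1]))
```

The extra cost noted above is visible here: each base learner is fitted once per fold and once more on the full data before the meta-learner can be used for prediction.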

5. Comparison of Ensemble Methods in Chest X-Ray Classification:

The performance of ensemble techniques in chest X-ray (CXR) classification varies depending on the base models used, dataset characteristics, and the ensemble strategy. Table 1 below summarizes the key distinctions among bagging, boosting, and stacking with a focus on medical image classification tasks.

Table 1: Comparative Analysis of Ensemble Methods for Chest X-ray Classification.

| Criterion | Bagging | Boosting | Stacking |
| --- | --- | --- | --- |
| Base Model Diversity | Low–Moderate (same architecture) | Low (sequential tuning) | High (heterogeneous models allowed) |
| Overfitting Resistance | Good | Moderate | High (via validation-based blending) |
| Interpretability | Moderate | Low | Low |
| Training Time | Parallelizable (fast) | Sequential (slow) | Slowest, multi-level |
| Multi-Class Support | Moderate | Moderate | Excellent |
| Robustness to Noisy Labels | Moderate | Low | High (learns combination) |
| Best Use Case | Large datasets, fast predictions | Small datasets, focused learning | Heterogeneous, complex tasks like multi-disease CXR classification |

6. Summary Table:

Table 2: Summary Table of Ensemble and Fusion Techniques in Medical Imaging.

| S.No | Authors (Year) | Technique | Advantages | Limitations |
| --- | --- | --- | --- | --- |
| 1 | Jhansi et al. (2020) | Multi-focus image fusion | Enhances image clarity and retains both low- and high-frequency info | High computational complexity and memory use |
| 2 | Saha et al. (2020) | Multi-model ensemble (CNN+SVM+RF) | Improved COVID-19 detection accuracy; robust ensemble framework | Complex integration, sensitive to noisy labels |
| 3 | Tanjeja et al. (2025) | Bagging with 5 deep CNN architectures | Handles class imbalance well; reduces variance | Limited generalizability across datasets |
| 4 | Rakakand & Suthheman (2025) | Boosted CNN ensemble | Strong on hard-to-classify cases; performance boosting | Risk of overfitting; training is sequential and slower |
| 5 | Jhansi et al. (2016) | Medical image fusion | High spatial and spectral resolution | Complex transform computation; lacks automation |
| 6 | Hussain et al. (2023) | Deep ensemble strategy (CNN-based) | Good classification accuracy for pneumonia and COVID-19 | Less interpretable; resource-heavy training |
| 7 | Deb & Jha (2020) | CNN ensemble (majority voting) | Boosts accuracy through model averaging | Weak on interpretability and minority classes |
| 8 | Zhou et al. (2020) | Ensemble deep learning model (CNN-based) | Better generalization and robustness | Long training time, sensitive to architecture choice |

7. Conclusion and Future Scope:

Ensemble learning techniques, particularly stacking, represent a significant advancement in the field of automated medical image analysis. By integrating the predictions of multiple deep neural networks, ensemble models overcome the limitations of individual classifiers and offer improved robustness, accuracy, and generalization. In the domain of chest X-ray image classification, stacking ensembles have demonstrated exceptional performance, particularly in distinguishing between multiple diseases with overlapping visual features.

The flexibility of stacking allows for the combination of heterogeneous architectures, enhancing the model's ability to capture diverse diagnostic patterns. Empirical results reported across various studies suggest that stacking can outperform traditional bagging and boosting approaches. Moreover, stacking provides an extensible framework that can evolve with new model architectures and imaging modalities, making it a suitable choice for long-term clinical integration.

However, challenges remain, particularly in terms of computational cost, data imbalance, and lack of interpretability. As ensemble models grow more complex, there is a need for improved training frameworks, efficient deployment pipelines, and explainable AI tools to make these models clinically viable.

Looking forward, future research should explore semi-supervised stacking, domain-adaptive ensemble learning, and explainable ensemble systems tailored to medical imaging. Additionally, collaboration between AI researchers, radiologists, and healthcare providers is essential to ensure that ensemble models are designed, tested, and deployed in ways that align with clinical needs.

References:

[1] Multi-focus image fusion using DT-CWT, curvelet transform and NSCT

[2] Saha, O., Tasinni, J., Railhan, Md. T., Mahmund, T., Ahunmed, I., Shulikh, I., & Futhak, A. (2020). A Multi-Model Based Ensembling Approach to Detect COVID-19 from Chest X-Ray Images. 2020 IEEE Region 10 Conference (TENCON), Osaka, Japan.

[3] Tanjeja, P., Sharma, A., & Singh, M. (2025). Enhancing Multi-Class Lung Disease Classification with Bagging-Based Deep Learning Ensembles: A Comparative Study of Five Architectures. Procedia Computer Science, 201, 299-216.

[4] Rakakand, R., & Suthheman, K. (2025). Boosted Ensemble Methods with CNN Models for Classification Tasks: Performance Evaluation and Graphical Analysis. International Journal of Cognitive

[5] PCA-DWT based medical image fusion using non subsampled contourlet transform

[6] Deb, S. D., & Jha, R. K. (2020). COVID-19 detection from chest X-Ray images using ensemble of CNN models. 2020 International Conference on Power, Instrumentation, Control and Computing (PICC).

[7] T. Zhou et al., “The ensemble deep learning model for novel COVID-19 on chest X-ray images,” Comput. Methods Programs Biomed., vol. 196, p. 105604, 2020.

[8] Hussain et al. An Automated Chest X-Ray Image Analysis for Covid-19 and Pneumonia Diagnosis Using Deep Ensemble Strategy
