
International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 13 Issue: 02 | Feb 2026

p-ISSN: 2395-0072

www.irjet.net

A REVIEW OF DECENTRALIZED COLLABORATIVE MODEL TRAINING FOR MEDICAL DATA USING FEDERATED LEARNING WITH ENHANCED PRIVACY CONTROLS

Divyanshi Singh1, Mr. Manish Kumar Soni2

1Master of Technology, Computer Science and Engineering, Bansal Institute of Engineering & Technology, Lucknow, India

2Assistant Professor, Department of Computer Science and Engineering, Bansal Institute of Engineering & Technology, Lucknow, India

---------------------------------------------------------------------***---------------------------------------------------------------------

Abstract - The rapid digitization of healthcare systems has generated vast volumes of sensitive medical data, creating significant opportunities for advanced machine learning applications while simultaneously raising critical privacy concerns. Traditional centralized training approaches require data aggregation, which conflicts with regulatory frameworks and institutional data governance policies. Federated Learning (FL) has emerged as a promising paradigm that enables collaborative model training across distributed medical institutions without transferring raw patient data. This review systematically examines decentralized collaborative model training frameworks for healthcare applications, with a particular focus on enhanced privacy-preserving mechanisms integrated into FL systems. It analyzes architectural models, secure aggregation protocols, differential privacy techniques, homomorphic encryption schemes, and blockchain-assisted decentralized infrastructures. The paper further evaluates performance trade-offs between model accuracy, communication efficiency, and privacy guarantees, highlighting challenges such as non-IID data distribution, adversarial attacks, scalability limitations, and regulatory compliance. By synthesizing current research trends and identifying persistent technical and ethical gaps, this review outlines future research directions aimed at achieving secure, scalable, and trustworthy decentralized medical AI systems. The study provides a structured foundation for researchers and practitioners developing privacy-aware federated learning frameworks in healthcare environments.

Key Words: Federated Learning; Decentralized Machine Learning; Medical Data Privacy; Differential Privacy; Secure Aggregation; Healthcare AI

1. INTRODUCTION

The integration of artificial intelligence (AI) into healthcare has transformed diagnostic systems, clinical decision support, medical imaging analytics, and predictive modeling. However, the effectiveness of machine learning (ML) models largely depends on access to large-scale, diverse, and high-quality datasets. In the medical domain, such data are inherently sensitive and governed by strict regulatory frameworks, making centralized data aggregation increasingly impractical. Federated Learning (FL) has emerged as a distributed learning paradigm that enables collaborative model training without sharing raw data, thereby addressing privacy and compliance concerns while maintaining analytical utility (McMahan et al., 2017). This section contextualizes the motivation, challenges, and scope of decentralized collaborative training in medical environments.

1.1 Background and Motivation

Healthcare institutions generate heterogeneous data streams including electronic health records (EHRs), radiological images, genomic sequences, and biosignals. The application of deep learning techniques to such datasets has demonstrated superior performance in disease detection and outcome prediction compared to conventional statistical approaches (Esteva et al., 2017). Nevertheless, training robust models requires multi-institutional data collaboration to overcome local data bias and limited sample sizes. Traditional centralized machine learning architectures require pooling data into a single repository, creating risks related to privacy breaches, data misuse, and regulatory non-compliance. Legislative frameworks such as GDPR and HIPAA impose strict limitations on cross-border and inter-institutional data sharing. Consequently, decentralized collaborative training has gained attention as a viable alternative that enables knowledge sharing without direct data exchange.

1.2 Challenges in Medical Data Sharing and Model Training

Medical data sharing is constrained by legal, ethical, and technical barriers. Privacy concerns arise due to the highly sensitive nature of patient information, where unauthorized access may lead to identity disclosure or discrimination. Even anonymized datasets remain vulnerable to re-identification attacks when combined with auxiliary information (Narayanan and Shmatikov, 2008).

From a technical standpoint, healthcare data are typically not independent and identically distributed (non-IID), imbalanced, and institution-specific. Variations in imaging protocols, demographic distributions, and disease

Impact Factor value: 8.226

ISO 9001:2008 Certified Journal

Page 499