International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 12 Issue: 08 | Aug 2025
p-ISSN: 2395-0072
www.irjet.net
A REVIEW ON DEEP LEARNING MODELS FOR IMAGE ENHANCEMENT THROUGH VISIBLE AND INFRARED IMAGE FUSION
Richa Sukhdev Ambapkar1, Dr. A. S. Yadav2
1Department of Computer Science and Engineering, D.Y. Patil College of Engineering & Technology, Kolhapur, Maharashtra, 416006, India
2Associate Professor, Department of Computer Science and Engineering, D.Y. Patil College of Engineering & Technology, Kolhapur, Maharashtra, 416006, India
Abstract - Fusing visible and infrared (IR) images has emerged as an integral technique in image enhancement, improving perception, identification, and decision-making across a range of industrial and real-world settings. By combining spatial detail from visible light with thermal information from infrared imaging, fused outputs offer improved clarity and utility, particularly in low-light or obscured environments. This review presents a comprehensive study of recent deep learning models developed for visible and IR image fusion. It covers a variety of approaches, including convolutional neural networks (CNNs), encoder-decoder architectures, and attention mechanisms. The paper also discusses key application areas such as surveillance, autonomous vehicles, medical diagnostics, and environmental monitoring. In addition, it examines major challenges such as heterogeneous data alignment, high computational cost, and the scarcity of labeled datasets. Finally, it outlines future research directions intended to support the development of more resilient and intelligent image fusion systems.
Key Words: Image Fusion, Deep Learning, Convolutional Neural Networks (CNN), Image Enhancement, Feature Extraction, Visible Images, Infrared Images
1. INTRODUCTION
Image fusion, specifically the combination of visible and infrared (IR) images, is an important field of study in computer vision and image processing. Visible images capture high-resolution spatial detail, color, and texture under good lighting conditions, whereas infrared images record thermal radiation, allowing visibility even in darkness, smoke, or fog. Combining these complementary modalities through fusion techniques yields enhanced images with greater detail, contrast, and semantic richness.
The demand for robust image fusion techniques is growing across several critical domains. In surveillance and security, fused images can improve the detection of intruders and suspicious activities in low-light or nighttime conditions. In autonomous driving, fusing visible and IR inputs enables better recognition of pedestrians and obstacles under varying environmental conditions. Medical imaging benefits from the fusion of thermal and visible data, which can help detect conditions such as inflammation or tumors with higher accuracy. Additionally, environmental monitoring applications, such as wildfire detection or crop health assessment, can be improved through the integration of spatial and thermal information.
Traditionally, image fusion methods relied on manual feature extraction or statistical models, which often struggled with robustness, adaptability, and computational efficiency. The advent of deep learning has introduced a paradigm shift, offering data-driven models that learn rich feature representations automatically and can adapt across multiple fusion scenarios. Convolutional neural networks (CNNs), encoder-decoder models, residual learning, and attention mechanisms have been increasingly applied to improve the quality and effectiveness of fused images. This article examines recent developments in deep learning models for visible and infrared image fusion. It explores a range of techniques, compares their strengths and limitations, and identifies challenges and opportunities for future research. The objective is to guide researchers and practitioners in selecting and designing effective image fusion solutions for real-world applications.
2. Methodology
The proposed methodology begins with collecting dual-modality image inputs (visible and IR images) from relevant datasets. These inputs undergo preprocessing and normalization steps, including data cleaning, formatting, and scaling, to ensure consistency across modalities. Core feature extraction is performed by a dual-branch convolutional neural network (CNN), in which each branch independently processes one image type. The extracted features are then fused by an attention-based fusion layer to preserve and emphasize significant image details. The fused features are passed through a classification layer and tested using standard model
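As a rough illustration of the normalization, dual-branch feature extraction, and attention-based fusion steps described in Section 2, the following minimal sketch uses plain Python. It is not the authors' implementation. The function names (`normalize`, `conv2d_relu`, `attention_fuse`, `fuse_pair`), the fixed 3x3 kernels standing in for learned CNN filters, and the per-pixel softmax used as the attention rule are all assumptions made for illustration. The classification layer is omitted.

```python
import math

# Hypothetical fixed 3x3 kernels standing in for learned CNN filters
# (assumption: a trained network would learn these from data).
VIS_KERNEL = [[0, -1, 0], [-1, 4, -1], [0, -1, 0]]  # edge/detail response
IR_KERNEL = [[1 / 9] * 3 for _ in range(3)]          # smoothed thermal response


def normalize(img):
    """Min-max scale a 2-D image (list of rows) to the range [0, 1]."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    span = (hi - lo) or 1.0  # avoid division by zero on flat images
    return [[(v - lo) / span for v in row] for row in img]


def conv2d_relu(img, kernel):
    """Same-size 2-D cross-correlation with zero padding, followed by ReLU.

    Assumes a square kernel with odd side length.
    """
    h, w = len(img), len(img[0])
    k = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-k, k + 1):
                for dx in range(-k, k + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += img[yy][xx] * kernel[dy + k][dx + k]
            out[y][x] = max(acc, 0.0)
    return out


def attention_fuse(feat_vis, feat_ir):
    """Fuse two feature maps with per-pixel softmax attention weights.

    Returns the fused map and the weight assigned to the visible branch.
    """
    fused, w_vis_map = [], []
    for row_v, row_i in zip(feat_vis, feat_ir):
        fused_row, w_row = [], []
        for a, b in zip(row_v, row_i):
            m = max(a, b)  # subtract the max for numerical stability
            ea, eb = math.exp(a - m), math.exp(b - m)
            w_vis = ea / (ea + eb)
            fused_row.append(w_vis * a + (1.0 - w_vis) * b)
            w_row.append(w_vis)
        fused.append(fused_row)
        w_vis_map.append(w_row)
    return fused, w_vis_map


def fuse_pair(visible, infrared):
    """Normalize both inputs, run one branch per modality, then fuse."""
    feat_vis = conv2d_relu(normalize(visible), VIS_KERNEL)
    feat_ir = conv2d_relu(normalize(infrared), IR_KERNEL)
    return attention_fuse(feat_vis, feat_ir)


if __name__ == "__main__":
    visible = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
    infrared = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    fused, w_vis = fuse_pair(visible, infrared)
    print(fused[1][1], w_vis[1][1])
```

In this toy rule, the branch with the stronger activation at a pixel receives the larger weight, and where both branches respond equally the weights are 0.5 each. A practical system would more likely use learned convolutional layers and a learned attention module in a deep learning framework.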
© 2025, IRJET | Impact Factor value: 8.315 | ISO 9001:2008 Certified Journal | Page 348