

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 12 Issue: 06 | Jun 2025 www.irjet.net p-ISSN: 2395-0072

ROBUST ONLINE LEARNING MODELS IN HIGH-NOISE SCENARIOS: MACHINE LEARNING APPROACHES TO NOISE REDUCTION

1Master of Technology, Computer Science and Engineering, Lucknow Institute of Technology, Lucknow, India

2Assistant Professor, Department of Computer Science and Engineering, Lucknow Institute of Technology, Lucknow, India

Abstract - The need to cope with noisy and evolving data streams has become a major concern in light of the increasing spread of real-time data in areas such as IoT monitoring, healthcare diagnostics, and cybersecurity. Traditional online learners fail in high-noise scenarios marked by corrupted features, mislabeled instances, and adversarial interference. This work proposes a framework called the Robust Anomaly Detector (RAD), which incorporates noise-aware components into the online learning pipeline to improve robustness to many types of noise with little loss of predictive accuracy or model stability. The RAD architecture has two parts: a label quality prediction model, which dynamically discards instances carrying unreliable labels, and an online classifier, which learns incrementally from the filtered stream. To boost flexibility further, three extended variants, RAD Voting, RAD Active Learning, and RAD Slim, are presented to support ensemble disagreement, human-in-the-loop feedback, and resource-efficient computation respectively. Testing on synthetic and real-world datasets (including data on IoT intrusion, cloud task failure, and facial recognition) shows that the RAD framework improves robustness over standard online learners by up to 30 percent, with no reduction in computational efficiency. The results make RAD an adaptable and scalable tool for implementing credible online learning systems in noisy real-time environments.

Key Words: Robust Online Learning, Label Noise, Streaming Data, Noise Reduction, Ensemble Learning, Anomaly Detection, Real-Time Learning, Machine Learning Robustness.

1. INTRODUCTION

1.1 Background

The exponential rise in the amount of data produced by real-time systems such as Internet of Things (IoT) deployments, health monitors, and cyber-defence platforms has brought about a paradigm change in machine learning. Data is no longer static and batch-based; it arrives as a high-velocity, continuous stream and requires real-time processing and learning. Online learning has become the most popular way to accommodate these requirements, because a model learns incrementally from every new data sample and makes predictions in real time without retraining on the full dataset. This is especially important in adaptive systems that must run continuously, such as network intrusion detection, wearable health monitoring devices, and automated equity trading systems.

Despite these benefits, noise very often undermines online learning in the real world. Sensor streams, logs, and user interactions can carry corrupted features caused by hardware failures or transmission errors, and mislabeled instances caused by human annotation error or adversarial manipulation. In highly dynamic environments such as IoT or cybersecurity, noise is significant and common, and it has a dire impact on the accuracy and stability of learning systems. There is therefore an urgent need to equip online learning models with strong noise-handling capabilities to guarantee reliability in such unpredictable, high-noise settings.

1.2 Problem Statement

Basic online learning algorithms are effective at processing sequential data, but they are predisposed to the negative influence of corrupted input. These models tend to expect clean and trustworthy data, which is rarely available in real-life deployments. Noise may impair the model as feature noise that causes misleading internal updates (e.g. volatile sensor measurements or corrupted packet data), or as label noise that injects fundamentally flawed supervision (e.g. incorrect labels in a supervised task). The effect is a decrease in model performance, slow convergence, unstable predictions, and an increased risk of catastrophic forgetting, in which prior knowledge is overwritten by corrupt examples.

Figure-1: Machine Learning Approaches to Noise

This problem is more severe in real-time deployments, where models must be not only reliable but also fast and responsive. Conventional offline noise-handling methods, e.g. batch data cleansing or elaborate adversarial training, cannot be applied online because they are computationally expensive. Thus, the central challenge this study addresses is how to design robust online learning models that can handle heavy noise without sacrificing durability, flexibility, speed, or efficient use of the available resources.

1.3 Objectives

The main goal of this research is to create a comprehensive approach to robust online learning that suppresses the effects of noisy data in real-time settings. This is done through the design of the Robust Anomaly Detector (RAD) architecture, a two-layer design comprising a label quality prediction model and an online classification model. The label quality predictor identifies unreliable or corrupted labels in an online manner, so that the classifier is trained only on high-confidence, filtered instances.

The framework is expanded with strategic variants to make it versatile and scalable. RAD Voting improves decision reliability via ensemble consensus; RAD Active Learning adds expert feedback to clear up ambiguous labels; and RAD Slim is a resource-efficient variant suited to edge computing environments. A variety of experimental configurations with synthetic and real-life data streams is considered, and in each of them noise is systematically injected at varying levels of severity. The framework is also validated empirically against several kinds of noise, including symmetric, asymmetric, and adversarial label corruption.

1.4 Contributions

The study makes several contributions to robust online learning. First, it presents a new dual-layer framework (RAD) that jointly addresses real-time detection of label noise and adaptive classification, one of the most intractable tasks in learning from noisy streams. Second, it proposes three extended strategies, RAD Voting, RAD Active Learning, and RAD Slim, which broaden the base model's options and performance across working scenarios ranging from high-accuracy enterprise deployments to limited edge devices.

Third, the framework is compared extensively with traditional and state-of-the-art online learners under different noise conditions. The analysis covers symmetric label-flipping noise, varying levels of asymmetric class-dependent noise, and adversarial corruption that matches the complicated reality of deployment in areas such as IoT, cloud systems, and face recognition. Measures such as predictive accuracy, model stability, recovery time, and resistance to catastrophic forgetting are used to evaluate performance thoroughly. The findings indicate that RAD and its variants generalize better and are more robust than baseline models, without imposing extra computational load. Together, these contributions significantly improve the design of online learning systems deployed in noisy, real-time environments.

2. RELATED WORK

2.1 Online Learning Algorithms

Online learning has become an anchor point of real-time data processing: models progressively accommodate arriving data streams, modifying their parameters in a step-by-step fashion as each fresh data point arrives. Classical algorithms such as the Perceptron, Passive-Aggressive (PA) algorithms, and Online Support Vector Machines (Online SVMs) are simple, fast, and, most importantly, carry low computational overhead. The Perceptron learns from its errors, changing its weights whenever it misclassifies, which gives it rapid learning. Passive-Aggressive algorithms take this further by adding a confidence margin, so that the model not only fixes errors but also keeps its decision boundary robust against subsequent misclassifications. Online SVMs, on the other hand, provide a handy kernel-based method for non-linear classification problems by holding a collection of support vectors and updating them from the incoming data.
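For concreteness, here is a minimal, self-contained sketch of the two mistake-driven update rules described above; the function names and the learning-rate parameter are illustrative, not tied to any particular library.

```python
import numpy as np

def perceptron_update(w, x, y, lr=1.0):
    """Mistake-driven Perceptron step (y in {-1, +1}): weights change only
    when the current instance is misclassified."""
    if y * np.dot(w, x) <= 0:
        w = w + lr * y * x
    return w

def passive_aggressive_update(w, x, y):
    """PA step: stay passive inside the unit margin, otherwise make the
    smallest correction that restores it (Crammer et al.'s closed form)."""
    loss = max(0.0, 1.0 - y * np.dot(w, x))
    if loss > 0.0:
        tau = loss / (np.dot(x, x) + 1e-12)   # guard against a zero vector
        w = w + tau * y * x
    return w

# Typical streaming loop: predict first, then update on the revealed label.
# for x, y in stream: y_hat = np.sign(w @ x); w = passive_aggressive_update(w, x, y)
```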

Despite being very useful, these algorithms have several serious limitations in the presence of noisy data streams. Their updates are sensitive to individual data entries; a comparatively small percentage of distorted labels or features can therefore generate profound divergence in model behavior. Noise may mislead the direction of gradients, contributing to instability, poor convergence, or overfitting to incorrect patterns.


These algorithms can perform disastrously under high noise, and there is a clear need for more robust algorithms that deal with the noise problem explicitly.

2.2 Robust Machine Learning Techniques

To deal with data corruption in conventional learning scenarios, a fertile literature on robust machine learning has suggested several solutions, especially within offline or batch learning schemes. Among the most notable are robust loss functions that tolerate outliers, such as the Huber loss, which combines Mean Squared Error and Mean Absolute Error to reduce the effect of outliers, and the Tukey biweight loss, which down-weights extreme values in the loss surface. Label cleansing, another popular approach, tries to discover and fix mislabeled data using confidence levels, ensemble consensus, or a small validation set. Ensemble learning algorithms such as Bagging, Boosting, and Random Forests have proved to be inherently robust, since the predictions of different models are combined and the effect of noisy samples is thereby reduced.

These offline methods cannot, however, be applied directly in a streaming environment. Their performance normally depends on having the complete dataset, the capacity to resample the data repeatedly, and substantial computing resources for batch processing, none of which is available in the online learning context. Furthermore, data streams are highly dynamic, and an algorithm must combine noise robustness with real-time prediction, lightweight computation, and behavior that adapts over time. Hence, although offline robust techniques are informative, their drawbacks in the online setting call for the design of more specialized methods.

2.3 Noise in Data Streams

Noise in streaming data takes many complicated forms, most of which complicate the learning process. A comprehensive taxonomy of noise consists of three main types, feature noise, label noise, and adversarial noise, plus a hybrid category comprising noise-concept drift interactions. Feature noise is corruption of the input attributes, possibly caused by faulty sensors, communication errors, or signal noise; it takes patterns such as additive Gaussian noise, impulse noise, and missing values. Label noise, in turn, occurs when a wrong target output is attached to a data instance. It may be symmetric (labels flipped uniformly at random across all classes) or asymmetric (mislabeling between particular pairs of classes is more common, usually because of annotation biases or systematic defects in data gathering). Adversarial noise consists of malicious perturbations crafted to deceive learning algorithms, usually by exploiting known model vulnerabilities.

Noise in a data stream is a serious obstacle to learning. It disrupts convergence by injecting spurious gradients into otherwise valid model updates, harms generalization (the model latches onto recent noise rather than the underlying truth), and makes it far easier for the learner to forget old, correct information as it is overwritten by new, noisy data. Noise may also resemble real distributional changes or hide them completely, making adaptive learning under concept drift even more difficult. These effects compound, which makes sound learning in noisy environments a very hard problem requiring innovation at both the theoretical and the practical level.

2.4 Robust Online Learning Advances

To counter the problems noise causes in sequential data applications, modern strides in robust online learning have produced several strategic techniques. One line of work incorporates robust loss functions into online update rules. The Huber loss and the generalized cross-entropy loss have been modified for streaming settings, allowing models to resist noisy examples and badly labeled samples with minimal extra computational expense. Such loss functions dampen excessive gradient updates caused by anomalous inputs, giving more stable and progressive learning in the presence of noise.

Ensemble-based methods have also been adapted to the online context. Methods such as Online Bagging, Online Boosting, and Adaptive Random Forests (ARF) exploit the redundancy and diversity of multiple learners to gain robustness. These ensembles can absorb the effect of individual noisy instances by averaging predictions over many members or by majority voting. Ensemble members can also be dynamically weighted or selected according to their recent performance, so the system responds to changing data conditions and concentrates on its most reliable components.
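A minimal sketch of the Online Bagging idea (Oza and Russell's Poisson(1) approximation of bootstrap resampling) is shown below; the `partial_fit`/`predict` interface is assumed to follow scikit-learn conventions, and the base learners are assumed to be pre-initialized with the full class list.

```python
import numpy as np

class OnlineBagging:
    """Oza-Russell online bagging: each base learner sees every incoming
    instance k ~ Poisson(1) times, approximating bootstrap resampling on
    a stream without storing it."""
    def __init__(self, base_learners, seed=0):
        self.learners = base_learners              # pre-initialized incremental models
        self.rng = np.random.default_rng(seed)

    def learn_one(self, x, y):
        for model in self.learners:
            for _ in range(self.rng.poisson(1.0)):
                model.partial_fit([x], [y])        # scikit-learn-style incremental fit

    def predict_one(self, x):
        votes = [m.predict([x])[0] for m in self.learners]
        return max(set(votes), key=votes.count)    # plain majority vote
```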

Another major area of development is online label-noise detection and filtering. Approaches have been proposed to estimate the reliability of incoming labeled instances and to re-weight or drop them when they appear unreliable. Disagreement-based filtering, which treats samples with high variance in their ensemble-predicted labels as suspicious, and instance-level trust scores based on past prediction accuracy have both proved promising. These methods are incremental in nature and carry low overhead, so they can be used in real time.

Combined, these advances are a big step toward strong online learning systems. Still, they can hardly cope with high-intensity noise or complicated noise-drift interactions. This shortcoming explains why more integrated and adaptive architectures are needed, ones that holistically combine noise detection, selective learning, and ensemble-based learning to attain resilience even in the most severe streaming settings; the RAD framework studied here is such an architecture.

3. PROPOSED METHODOLOGY

3.1 System Architecture

The core design suggested by this research is a robust, dual-model online learning system built specifically for high-noise data streams. The architecture follows a modular, pipeline-driven design named the RAD (Robust Anomaly Detector) framework, composed of two main modules: a Label Quality Predictor and an Online Classifier. The two models complement each other; they play distinct but interconnected roles, reducing the impact of noisy labels and noisy features while allowing continuous updates and real-time prediction.

The Label Quality Predictor acts as a gateway for incoming data. It assesses the trustworthiness of the label attached to every data instance and decides whether the sample should be used to update the classifier. The predictor itself is trained online using historical prediction behavior, classification confidence, inter-model agreement (in ensemble configurations), and patterns in data quality over time. Its job is to ensure that only high-confidence, trusted information reaches the downstream classifier, minimizing the effect of noise on the learning stage.

After this filtering, the Online Classifier performs real-time prediction together with incremental learning. It updates its parameters only when the incoming instance has been judged reliable by the label predictor. Depending on the situation, the classifier can be realized as a Perceptron, a Passive-Aggressive algorithm, an Online SVM, or a Stochastic Gradient Descent learner, all optimized for streaming updates.

To increase fault tolerance and prediction stability, ensemble integration is also provided. The system may instantiate several classifiers concurrently, combine their predictions into a single output via a majority voting scheme, and extract disagreement information to update the label quality estimate. The modularity of this architecture supports variant extensions, so the system can be customized to the constraints of various environments in terms of available resources and accuracy requirements, e.g. environments that demand computational efficiency or human approval.
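The following sketch shows one way the two RAD modules might be wired together. It is our illustration of the described data flow under stated assumptions, not the authors' implementation; the `quality_model` and `classifier` interfaces and the threshold value are hypothetical.

```python
class RADPipeline:
    """Sketch of the two-stage RAD flow: a label-quality gate in front of an
    incremental classifier. `quality_model` and `classifier` are any objects
    exposing the methods used below; the names are illustrative only."""
    def __init__(self, quality_model, classifier, trust_threshold=0.7):
        self.quality_model = quality_model      # scores label reliability in [0, 1]
        self.classifier = classifier            # incremental learner
        self.trust_threshold = trust_threshold

    def process(self, x, y):
        y_pred = self.classifier.predict_one(x)          # always predict in real time
        trust = self.quality_model.score(x, y, y_pred)   # estimate label reliability
        if trust >= self.trust_threshold:
            self.classifier.learn_one(x, y)              # update only on trusted labels
        self.quality_model.update(x, y, y_pred)          # the gate itself keeps learning
        return y_pred
```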

Table-1: Components of the RAD System Architecture.

Component | Function
Label Quality Predictor | Detects and filters noisy or unreliably labeled data
Online Classifier | Performs real-time classification and incremental model updates
Ensemble Layer | Aggregates predictions from multiple learners to enhance robustness
Noise Injection Control | Simulates noisy environments during training and benchmarking
Modular Extensions | Enables variant implementations such as RAD Voting, RAD Active Learning, and RAD Slim

This architecture ensures a robust learning pipeline in which model updates are noise-aware, adaptive, and computationally efficient, meeting the requirements of real-time applications in noisy and dynamic environments.

3.2 RAD Framework Variants

To increase RAD's applicability across a wider range of operational environments, three strategic variants were developed. Each variant addresses a particular practical constraint: disagreement between models, the cost of annotation, or resource limitations.

RAD Voting is an ensemble version that uses multi-model consensus to enhance reliability. Instead of a single classifier, RAD Voting keeps a number of base learners training simultaneously on the data stream. For each new instance, the label quality predictor evaluates reliability by how strongly the incoming label disagrees with the ensemble: if the models largely agree with the label it is treated as clean, otherwise it is marked noisy. Majority voting then determines the final prediction. The method exploits diversity across the models to cancel out the influence of individual noisy observations and improve confidence in predictions, as the sketch below illustrates.
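A compact sketch of the disagreement test just described might look as follows; the `agreement_threshold` value and the `predict_one` interface are illustrative assumptions, not the paper's settings.

```python
from collections import Counter

def vote_and_flag(models, x, y, agreement_threshold=0.8):
    """Illustrative RAD Voting step: the ensemble's majority vote is the
    prediction, and the fraction of members agreeing with the incoming
    label decides whether that label is treated as clean."""
    preds = [m.predict_one(x) for m in models]
    counts = Counter(preds)
    y_hat = counts.most_common(1)[0][0]                  # majority-vote prediction
    agreement = counts[y] / len(models)                  # support for the stream label
    label_is_clean = agreement >= agreement_threshold
    return y_hat, label_is_clean
```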

To handle cases where label quality is ambiguous, RAD Active Learning builds on a human-in-the-loop feedback module. When the label predictor flags a label as uncertain or highly questionable, a query is made to an external oracle (e.g., a human expert or a verified database). This mechanism guarantees that serious mislabelings are corrected on the spot, which greatly enhances learning stability over long-term use. Although the approach adds annotation cost, it provides a significant level of confidence in settings such as healthcare or financial fraud detection, where accuracy is paramount.
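As a sketch, the querying rule could be expressed with two trust cut-offs; the threshold values and the `oracle` callable are hypothetical choices, not values reported in the paper.

```python
def maybe_query_oracle(trust, x, y, oracle, low=0.3, high=0.7):
    """Human-in-the-loop step in the spirit of RAD Active Learning:
    only labels whose estimated reliability is ambiguous are sent to the
    oracle; clearly clean or clearly corrupt ones are handled automatically."""
    if low <= trust <= high:
        return oracle(x), True       # ambiguous: the oracle returns a verified label
    if trust > high:
        return y, False              # trusted: keep the stream label
    return None, False               # distrusted: drop the instance entirely

# Example: label, queried = maybe_query_oracle(0.5, x, y, oracle=expert_label)
```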

RAD Slim is a lightweight version designed for deployment in resource-limited environments such as edge devices, IoT sensors, or mobile devices. It reduces the memory footprint and computational overhead by simplifying the label quality prediction mechanism (i.e. using a single-classifier confidence threshold instead of disagreement between learners) and replacing deep models with shallow ones. RAD Slim forgoes a minor amount of robustness to gain considerable speed and efficiency, making it appropriate for latency-sensitive applications.
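A single-classifier confidence gate in the spirit of RAD Slim could be as simple as the following; the river-style `predict_proba_one`/`learn_one` method names and the threshold are assumptions for illustration.

```python
def slim_filter(classifier, x, y, confidence_threshold=0.6):
    """Illustrative RAD Slim gate: a single model's own confidence in the
    incoming label replaces the ensemble-disagreement machinery."""
    proba = classifier.predict_proba_one(x)      # assumed: {class: probability}
    confidence = proba.get(y, 0.0)
    if not proba or confidence >= confidence_threshold:
        classifier.learn_one(x, y)               # accept: cold start or trusted label
        return True
    return False                                 # reject: likely-noisy label
```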

3.3 Noise Handling Strategies

The noise handling strategies built into the RAD framework work at two levels, the loss function level and the learning process level, to achieve robustness in noisy data stream environments.

One element is the use of robust loss functions, which are less easily disturbed by outliers and poorly labeled samples than traditional choices such as cross-entropy or mean squared error. The Huber loss, in which squared loss blends smoothly into absolute loss for larger errors, is well suited to noisy regression or classification signals. Generalized Cross Entropy (GCE) loss is another compelling trade-off between mean absolute error and cross-entropy, with excellent label-noise robustness. In addition, symmetric loss functions, which penalize every incorrect class prediction in the same way, are notably successful in classification tasks involving symmetric label noise.
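The two named losses are standard and can be written in a few lines; `delta` and `q` are the usual tuning knobs (the paper does not report specific values).

```python
import numpy as np

def huber_loss(residual, delta=1.0):
    """Quadratic near zero, linear in the tails, so isolated corrupted
    targets cannot dominate the gradient."""
    r = np.abs(residual)
    return np.where(r <= delta, 0.5 * r ** 2, delta * (r - 0.5 * delta))

def gce_loss(p_true, q=0.7):
    """Generalized Cross Entropy: (1 - p^q) / q interpolates between
    cross-entropy (q -> 0) and MAE (q -> 1); p_true is the probability
    the model assigns to the observed label."""
    return (1.0 - np.power(p_true, q)) / q
```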

Alongside these are confidence-weighted learning strategies, in which the model's learning rate or update strength is adjusted according to its confidence in the prediction. For example, when the model is unsure about its classification, it applies a smaller weight to the gradient update, which minimizes the effect of possibly-noisy examples. This balance increases model stability and the quality of learning over time.
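In its simplest form, confidence weighting scales the gradient step by the model's predictive confidence, as in this hypothetical helper:

```python
def confidence_weighted_step(w, grad, confidence, base_lr=0.1):
    """Hypothetical confidence-weighted update: the step size shrinks when
    the model is unsure about the instance, limiting the damage a
    likely-noisy example can do. `confidence` lies in [0, 1]."""
    return w - (base_lr * confidence) * grad
```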

Further strength comes from instance-level filtering, in which each data point is assessed against predefined or learned measures such as ensemble disagreement or prediction confidence. Samples that fail to reach a quality threshold are either discarded or down-weighted during training. Combined with adaptive learning-rate schedules, these mechanisms guarantee that noisy data points do not derail the learning process and that the model tolerates transient dips in performance.

Collectively, these defensive measures form a layered response to the varied and unruly forms of noise found in real-world data streams. They have helped the RAD framework not merely survive but prosper in tough environments where conventional models tend to fail.

4. EXPERIMENTAL SETUP

4.1 Datasets

A range of datasets was used to test the effectiveness, stability, and robustness of the proposed RAD framework and its variants in high-noise online learning environments. Synthetic as well as real-world data were included in the experimental design to balance controlled benchmarking against real-world applicability.

The synthetic data was created with well-known stream generators that model evolving environments with controlled disturbance and concept drift. Three datasets were adopted: SEA, Hyperplane, and Random RBF. The SEA dataset is a common benchmark for binary classification in data streams; it exhibits sudden concept drift as the decision boundary changes over time. The Hyperplane dataset mimics slow drift by varying the coefficients of a separating hyperplane in high-dimensional space, so it suits tests of drift-resistant learning. The Random RBF dataset samples points from drifting Gaussian clusters, modeling aging and changing concepts along with shifts in the distribution. These synthetic datasets made it possible to manipulate drift patterns and noise levels with precision, providing an ideal testbed for controlled experiments.
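For reference, the SEA concept is simple enough to generate from scratch. The sketch below uses the commonly cited decision rule x1 + x2 <= theta, with theta switched mid-stream to create abrupt drift; the exact thresholds used in the paper's experiments are not stated, so these values are illustrative.

```python
import numpy as np

def sea_stream(n, theta=8.0, noise=0.0, seed=42):
    """From-scratch SEA-style generator: three uniform features in [0, 10],
    label = [x1 + x2 <= theta], with x3 irrelevant by design. Optional
    symmetric label noise flips a fraction of labels."""
    rng = np.random.default_rng(seed)
    for _ in range(n):
        x = rng.uniform(0.0, 10.0, size=3)
        y = int(x[0] + x[1] <= theta)
        if rng.random() < noise:
            y = 1 - y
        yield x, y

# Abrupt concept drift: concatenate segments generated with different thresholds.
stream = list(sea_stream(500, theta=8.0)) + list(sea_stream(500, theta=9.5, seed=7))
```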

4.2 Noise Injection Protocols

A noise injection procedure was devised to examine the noise robustness of the RAD framework and its variants. It simulated the two major forms of label corruption, symmetric and asymmetric, and also added feature noise through several forms of corruption.

Symmetric label flipping randomly reassigned the class labels of a set percentage of incoming instances to any of the other classes with equal probability. Such noise resembles unsystematic annotation errors and is commonly used in robustness testing. Asymmetric label flipping, in contrast, covered more realistic circumstances by introducing class-dependent mislabeling: labels were more likely to be flipped to classes that appear semantically or statistically close to each other, e.g. confusion between celebrity face images in the FaceScrub dataset.
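A sketch covering both flipping protocols follows; the `confusion` map for the asymmetric case (e.g. mapping each class to a visually confusable one) is supplied by the experimenter and is not specified in the paper.

```python
import numpy as np

def flip_labels(y, rate, n_classes, confusion=None, seed=0):
    """Label-noise injection sketch. confusion=None gives symmetric flips
    (uniform over the other classes); a {class: confusable_class} map gives
    asymmetric, class-dependent flips."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y).copy()
    for i in np.where(rng.random(len(y)) < rate)[0]:
        if confusion is not None:
            y[i] = confusion.get(int(y[i]), y[i])       # asymmetric flip
        else:
            others = [c for c in range(n_classes) if c != y[i]]
            y[i] = rng.choice(others)                   # symmetric flip
    return y
```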

For feature-level corruption, Gaussian noise, which adds normally distributed random values to numeric features, and salt-and-pepper noise, which randomly sets feature values to their minimum or maximum extremes, were applied. The latter type is commonly found in image and sensor data as a result of bit errors or sensor malfunction.
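Feature corruption can be sketched in the same spirit; the noise magnitudes below are placeholders, not the paper's settings.

```python
import numpy as np

def corrupt_features(X, gaussian_std=0.1, sp_rate=0.05, seed=0):
    """Feature-level corruption sketch: additive Gaussian noise on every
    value, plus salt-and-pepper noise that forces a random subset of entries
    to the column minimum or maximum."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float).copy()
    X += rng.normal(0.0, gaussian_std, size=X.shape)    # Gaussian component
    col_min, col_max = X.min(axis=0), X.max(axis=0)
    mask = rng.random(X.shape) < sp_rate                # entries hit by salt-and-pepper
    salt = rng.random(X.shape) < 0.5                    # half salt (max), half pepper (min)
    X[mask & salt] = np.broadcast_to(col_max, X.shape)[mask & salt]
    X[mask & ~salt] = np.broadcast_to(col_min, X.shape)[mask & ~salt]
    return X
```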

The experiments covered a wide range of noise intensities, including 10%, 50%, and 90%, so as to test model integrity under both light and severe noise. This gradient allowed degradation curves, recovery times, and model stability to be recorded as noise levels rose. Noise injection was performed online, with noise fed to the model gradually in real time, preserving the streaming context.

4.3 Evaluation Metrics

Several evaluation metrics were used for an exhaustive assessment of the proposed RAD architecture and its variants. The metrics were selected to measure not only predictive accuracy but also robustness, the stability of learning behavior, and efficiency, all of which are important properties of robust online learning systems.

Accuracy and F1-score were the two primary performance measures, indicating how well the model classifies on balanced and unbalanced data. Accuracy served as the measure of overall correctness, while the F1-score worked especially well for measuring robustness to label imbalance and noise.

The degradation slope and the recovery time were the two fundamental indicators of robustness: the former indicates how quickly model performance declines as the noise level increases, and the latter is defined as the number of steps the model needs to return to the level it had reached before being disturbed by a burst of noise. A shallower degradation slope and a shorter recovery time indicate good resilience to noise.

Stability was measured as the variance of performance over time windows. This metric was obtained by computing the standard deviation of accuracy or F1-score over sliding windows, summarizing how erratic or steady the model remained over the stream.
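The three robustness measures can be computed from a per-step accuracy trace roughly as follows; all function and parameter names here are ours, for illustration only.

```python
import numpy as np

def degradation_slope(noise_levels, accuracies):
    """Slope of a linear fit of accuracy against noise level; flatter
    (closer to zero) means more robust."""
    return np.polyfit(noise_levels, accuracies, 1)[0]

def stability(acc_trace, window=50):
    """Standard deviation of mean accuracy over sliding windows of the
    per-step accuracy trace; lower means steadier behavior."""
    acc = np.asarray(acc_trace, dtype=float)
    means = [acc[i:i + window].mean() for i in range(len(acc) - window + 1)]
    return float(np.std(means))

def recovery_time(acc_trace, burst_end, baseline, tol=0.02):
    """Steps after a noise burst until accuracy returns to within `tol`
    of its pre-burst baseline; None if it never recovers."""
    after = np.asarray(acc_trace[burst_end:], dtype=float)
    hits = np.where(after >= baseline - tol)[0]
    return int(hits[0]) if len(hits) else None
```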

Because streaming efficiency was a priority, metrics such as average time per update, memory footprint, and inference latency were also monitored. These measurements verified that the improvement in robustness did not entail computationally unsustainable costs, particularly in the resource-constrained environments where RAD Slim is meant to operate.

5. RESULTS AND DISCUSSION

5.1 Comparative Performance

This task evaluated the performance of the proposed RAD framework against several baseline online classifiers: the Online Perceptron, Stochastic Gradient Descent (SGD), and Online Support Vector Machines (Online SVMs). These models were chosen for their broad usage in online learning and the diversity of their learning strategies, spanning mistake-driven updates, margin-based learning, and gradient-based optimization.

Under clean data, all models, including RAD, performed similarly in accuracy and convergence. The shortcomings of the classical models became clear, however, as noise levels rose. Accuracy degraded quickly for the Online Perceptron and SGD, which were highly sensitive to label and feature noise. Online SVMs were somewhat more stable but could not hold up under asymmetric label flipping and high-intensity feature corruption. RAD, on the contrary, outperformed all baselines on all test data. Its dual-model architecture, and especially the label quality predictor, allowed RAD to filter out erroneous instances, leading to large gains in accuracy and F1-score even under extreme noise.


In addition, the ensemble mechanisms built into RAD yielded a significant increase in model stability and robustness. By combining the outputs of different learners and using disagreement information to improve label evaluation, the ensemble-enhanced RAD variants showed lower variance of predictions over time and a shallower degradation slope under noise.

Table-2: Average Accuracy and F1-Score at 30% Noise Level.

Figure-2: Unbalanced FaceScrub with 30% noise.

The table illustrates that RAD not only delivers superior predictive performance but also ensures more consistent behavior under noisy conditions, a critical advantage in real-time, high-risk applications.

5.2 Effectiveness Under Varying Noise Levels

To observe the resilience of the RAD framework across a wider range of noise intensities, experiments were conducted in 10-percent increments from 10% to 90% noise on both labels and features. The accuracy decay trends of the baseline models showed performance dropping drastically and non-linearly as noise increased, especially beyond the 40% mark. RAD's decline, in contrast, was much more gradual: accuracy remained fairly high even at 60% noise and stayed above trivial performance in the extremely noisy 90% case, beyond which the other models all degenerated into near-randomness.

RAD's stability under concurrent concept drift and noise was also examined, using the Hyperplane and Random RBF datasets. Under noisy changes, standard learners tended to misinterpret the concept drift, resetting wrongly or shifting weights too often. RAD, with the help of its label quality filter and adaptive ensemble techniques, could better differentiate between real drift and transient noise. This capability enabled it to retain what it had learned before while settling on the new patterns, reducing catastrophic forgetting and preserving long-term consistency of learning.

5.3 Variant Evaluation

To examine the flexibility and practical adaptability of the RAD architecture, its three variants, RAD Voting, RAD Active Learning, and RAD Slim, were tested under a range of working conditions and application environments.

RAD Voting proved to be the most general and fault-tolerant of all the models tested. Using prediction consensus over an ensemble of classifiers, it obtained the best results on real-world datasets involving IoT intrusion detection and cloud failure detection, among others. The voting mechanism helped contain the effects of noise-induced misclassifications and gave a strong decision surface. The trade-off was higher computational overhead, since many learners are kept active at the same time.

RAD Active Learning introduced a human-in-the-loop feedback loop, permitting the system to query doubtful samples. Selective querying (5-10% of all samples redirected to an expert) resulted in accuracy approaching the clean-label level. This variant has particular value in medical diagnostics, for example in diagnosing multiple system atrophy, where annotation budgets are limited yet precision is critical.

Figure-3: Comparison of RAD Active Learning Limited (RAD-AL-L) and Pre-Select Oracle, showing the power of selection.


RAD Slim was fashioned as a minimal alternative for edge and resource-limited environments. It employed a simplified label quality predictor and a smaller number of learners, giving an enormous reduction in both memory and compute requirements. Although its accuracy is slightly inferior to full RAD, it was superior to all of the conventional baselines, making it a suitable choice for real-time inference in IoT-based systems and mobile apps.

Table-3: Variant Evaluation Summary.

5.4 Ablation Studies

A number of ablation studies were conducted to understand the role of the components that make up the RAD architecture. The first explored the effect of removing the label quality model, reducing RAD to a regular online learner. Without the noise-filtering mechanism, the classifier was sensitive to corrupted labels and its performance dropped drastically, confirming the importance of label assessment in obtaining robustness.

A second experiment compared single-learner and multi-learner ensembles. The ensemble-enhanced variants consistently provided the best stability and accuracy, especially under non-stationary conditions. Ensembles also minimized the impact of model drift and gave more balanced updates during uncertain periods.

Finally, the effect of different loss functions was studied. Models using conventional cross-entropy loss were found to be extremely vulnerable to label noise, whereas those using Huber loss, generalized cross-entropy (GCE), and symmetric loss functions preserved decent performance, further validating that streaming environments require loss functions designed with robustness as the primary goal.

These ablation outcomes confirm that the advantage of the RAD architecture does not lie in any single feature but in the thoughtful combination of several complementary ones: each improves a different facet of the robustness needed for online learning in high-noise regimes.

6. CONCLUSION AND FUTURE WORK

6.1 Conclusion

This study has introduced a detailed framework, the Robust Anomaly Detector (RAD), to address the urgent issues of online learning in high-noise data stream environments. RAD is an effective dual-model architecture that filters noisy labels while performing real-time model updates, pairing a label quality prediction module with an online classification model. The framework was further extended into three strategic variants, RAD Voting, RAD Active Learning, and RAD Slim, to cover operating scenarios ranging from high-assurance prediction systems to resource-limited edge use cases.

Large-scale empirical testing on both synthetic and real-world datasets showed that RAD and its variants uniformly outperformed common online learning models such as the Perceptron, Stochastic Gradient Descent (SGD), and Online Support Vector Machines, particularly when label and feature noise was high. Importantly, RAD exhibited a gentler accuracy degradation curve, rebounded quickly from noise bursts, and stayed stable even under concept drift. Moreover, its ensemble integration and adaptive learning techniques delivered generalization and robustness that let RAD perform even in settings where baseline learners failed badly. The findings show that RAD filters noise efficiently while providing scalable, reliable predictions on continuous and uncertain data sources.

6.2 Limitations

Although the suggested RAD framework proves successful, the current research has several limitations that bound its scope and point to further work. First, the research is limited to classification tasks in supervised or semi-supervised settings. Regression and reinforcement learning (RL) are important learning paradigms that were not addressed, which limits the framework's applicability in broader applications such as time-series prediction or online decision-making in dynamic environments.

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 12 Issue: 06 | Jun 2025 www.irjet.net p-ISSN: 2395-0072

Second, although the experiments included datasets with synthetic and real-world concept drift, the degree of drift was moderate and controlled. Extreme or rapid drift, particularly in combination with noise, was not explored exhaustively. In such unstable systems, even robust models may be unable to tell whether a true change is occurring or whether cleansing cycles are eroding performance, and the existing RAD process might have to evolve.

The framework has also not been exercised on very high-dimensional data streams (e.g. genomics datasets, natural language, fine-grained image sequences), where feature sparsity and noise may interact in more complex ways. Likewise, structured noise, i.e. label flips or errors that are temporally correlated or context-dependent, fell outside the experimental range even though it is frequent in settings such as video surveillance or sensor fusion systems.

6.3 Future Directions

Building on these findings and the limitations above, several research directions are foreseen to extend the scope and depth of the RAD framework. An obvious direction is bringing RAD to unsupervised and reinforcement learning scenarios. Unsupervised online learning must find anomalous patterns without label supervision, so new approaches to instance selection and filtering are required. In reinforcement learning, particularly in partially observable environments or where the reward signal is noisy, label quality estimation methods may be adapted to evaluate policy updates and reward credibility.

Another avenue is creating adaptive architectures that cope with rapid or adversarial concept drift. RAD might learn not just from the current data but also from its own past adaptations, by incorporating ideas from meta-learning or online Bayesian methods, so that it reacts better to new changes in the data distribution. Related techniques such as continual learning and memory-based networks might also be used to retain knowledge through such upheavals.

Finally, the most important route to real-life impact is implementing RAD on edge and embedded systems in real-world areas such as autonomous vehicles, wearable health sensors, and smart manufacturing platforms. This would require further optimization of the computational footprint, energy usage, and latency of RAD's components. By incorporating RAD into real-time data-acquisition pipelines with user-feedback capabilities, it will become more operational and reliable in mission-critical settings.

