International Research Journal of Engineering and Technology (IRJET) Volume: 10 Issue: 03 | Mar 2024
www.irjet.net
e-ISSN: 2395-0056 p-ISSN: 2395-0072
© 2025, IRJET | Impact Factor value: 8.315
Continuous Robustness and Fairness Evaluation for Deployed Vision Transformers
Laraib Ahmad Siddiqui1, Mohd Shahzad2
1Program Control Services Analyst, Accenture, India
2AWS and DevOps Consultant, Deloitte, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - While transformer-based vision models achieve state-of-the-art accuracy on curated benchmarks, their reliability often collapses under real-world distribution shifts, demographic imbalance, and adversarial perturbations. We present a continuous evaluation framework that monitors the robustness and fairness of deployed vision transformers through automated telemetry and adaptive retraining triggers. Our system combines self-supervised pretraining for domain generalization with bias-aware performance metrics integrated into a Kubernetes-based MLOps pipeline. It continuously audits model drift using real-time inference logs, quantifies degradation via composite robustness–fairness indicators, and initiates retraining when thresholds are violated. Experiments across ImageNet-R, CIFAR-C, and FairFace demonstrate a 28% improvement in out-of-distribution accuracy and a 35% reduction in demographic bias drift compared with static baselines. The results suggest a viable path toward trustworthy, self-auditing computer-vision systems suitable for regulated or safety-critical deployments.
Key Words: Continuous Evaluation, Robustness, Fairness, Vision Transformers, Model Drift, Self-Supervised Learning, MLOps, Composite Reliability Index (CRI)
1. INTRODUCTION
1.1 Motivation
Computer-vision models increasingly operate in open-world settings, such as autonomous vehicles, retail analytics, and healthcare imaging, where the data distribution can shift unpredictably. Yet most models are trained and evaluated once, assuming static test conditions. When lighting, texture, demographic mix, or camera device changes, performance can drop sharply; this degradation under distribution shift is sometimes called robustness decay.
Simultaneously, fairness concerns arise: even if a model is accurate on average, its error rate can vary across gender, age, or ethnicity, and these gaps can widen over time (bias drift). To maintain reliability, deployed systems require continuous measurement, diagnosis, and correction, not occasional offline testing.
1.2 Problem Statement
Existing robustness research primarily focuses on improving model architectures and training procedures (e.g., adversarial training, data augmentation), but it rarely addresses how to maintain these guarantees once the model is in production. Fairness metrics are typically computed post hoc, without integration into continuous-delivery pipelines. This separation between research and deployment creates a “trust gap” in which models silently degrade after release. We therefore ask: How can we design a continuous evaluation pipeline that jointly monitors the robustness and fairness of vision models under real-world shifts, and automatically responds when reliability deteriorates?
1.3 Proposed Approach
We propose a Continuous Robustness and Fairness Evaluation (CRFE) framework that turns computer-vision deployment from a static artifact into a living system that continually aligns with the real world. The framework is built on three principles:
1. Compositional Monitoring: embed robustness and fairness probes directly into the inference loop, producing live metrics on each batch of incoming data.
2. Self-Supervised Domain Anchoring: periodically update model representations using unlabelled production data via masked-autoencoder (MAE) and SimCLR objectives to reduce domain drift.
3. Automated Feedback Loop: trigger retraining, recalibration, or alerting when drift thresholds are crossed, implemented as Kubernetes microservices integrated with MLflow and Prometheus.
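To make the monitoring and feedback principles concrete, the following is a minimal Python sketch, not the framework's actual implementation. It assumes that ground-truth (or proxy) labels and a demographic group attribute are available for each audited batch. The names `audit_batch` and `BatchReport`, the threshold values, and the rule mapping violations to "alert" or "retrain" are all hypothetical choices made for illustration. The sketch reports accuracy and the gap between the highest and lowest per-group error rates as two separate indicators; it does not attempt to reproduce the paper's composite index. Metric export to Prometheus and orchestration on Kubernetes are outside its scope.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class BatchReport:
    accuracy: float          # fraction of correct predictions in the batch
    group_error_gap: float   # max minus min per-group error rate
    action: str              # "ok", "alert", or "retrain"


def group_error_rates(preds, labels, groups):
    """Return {group: error rate} for one batch."""
    errors = defaultdict(int)
    counts = defaultdict(int)
    for p, y, g in zip(preds, labels, groups):
        counts[g] += 1
        errors[g] += int(p != y)
    return {g: errors[g] / counts[g] for g in counts}


def audit_batch(preds, labels, groups, baseline_accuracy,
                max_accuracy_drop=0.05, max_group_gap=0.10):
    """Check one labelled batch against hypothetical drift thresholds."""
    if len(preds) == 0:
        raise ValueError("empty batch")

    rates = group_error_rates(preds, labels, groups)
    accuracy = sum(int(p == y) for p, y in zip(preds, labels)) / len(preds)
    gap = max(rates.values()) - min(rates.values())

    # Robustness probe: accuracy fell too far below the offline baseline.
    robustness_violated = (baseline_accuracy - accuracy) > max_accuracy_drop
    # Fairness probe: error rates differ too much between groups.
    fairness_violated = gap > max_group_gap

    # Illustrative policy only: both violated -> retrain, one -> alert.
    if robustness_violated and fairness_violated:
        action = "retrain"
    elif robustness_violated or fairness_violated:
        action = "alert"
    else:
        action = "ok"
    return BatchReport(accuracy, gap, action)


if __name__ == "__main__":
    report = audit_batch(
        preds=[1, 0, 1, 1, 0, 0, 1, 0],
        labels=[1, 0, 1, 1, 1, 1, 1, 0],
        groups=["a", "a", "a", "a", "b", "b", "b", "b"],
        baseline_accuracy=0.90,
    )
    print(report)
```

In this toy batch all errors fall on group "b", so both the accuracy drop and the group gap exceed their thresholds and the sketch returns the "retrain" action. A real deployment would likely smooth these indicators over a window of batches before acting, since single-batch estimates are noisy.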
1.4 Contributions
1. A robustness-fairness co-monitoring pipeline for deployed vision transformers.
ISO 9001:2008 Certified Journal | Page 591