AULoRA: Anomaly Understanding with Low-Rank Adaptation for Zero-Shot Anomaly Detection
Author(s) - Seunghyun Oh, Seongsu Lee, Seunghye Chae, Youngmin Ro
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3614713
Subject(s) - aerospace; bioengineering; communication, networking and broadcast technologies; components, circuits, devices and systems; computing and processing; engineered materials, dielectrics and plasmas; engineering profession; fields, waves and electromagnetics; general topics for engineers; geoscience; nuclear engineering; photonics and electrooptics; power, energy and industry applications; robotics and control systems; signal processing and analysis; transportation
Zero-Shot Anomaly Detection (ZSAD) aims to identify anomalies in unseen categories or scenarios. Recently, Vision-Language Models (VLMs), most notably CLIP, have been utilized to enhance anomaly detection performance. However, CLIP struggles to capture local anomalies, which has led to the development of additional modules that significantly increase model complexity and computational overhead. To address this challenge, we propose AULoRA, a novel approach that enhances anomaly understanding by integrating Low-Rank Adaptation (LoRA) into CLIP’s visual encoder and efficiently injecting visual context into the textual representation. To preserve CLIP’s general visual knowledge, we utilize Singular Value Decomposition (SVD) to selectively fine-tune only the most relevant singular components, enabling precise identification of semantic anomalies. Nevertheless, anomaly detection often requires capturing highly diverse and category-specific characteristics, which simple text prompts alone struggle to represent adequately. To overcome this, we adapt textual representations based on the visual context extracted from input images, allowing the model to achieve category-aware and anomaly-sensitive alignment. AULoRA maintains the original architecture and inference efficiency of CLIP while achieving state-of-the-art performance on both image-level and pixel-level anomaly detection benchmarks across diverse industrial datasets.
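The idea of fine-tuning only selected singular components of a frozen weight can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the "most relevant" components are chosen, so the sketch simply assumes the k largest singular values, and the names `svd_split`, `adapted_weight`, and `delta` are hypothetical.

```python
import numpy as np


def svd_split(weight, k):
    """Factor a frozen weight matrix and pick k singular components to adapt."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # Assumption: "most relevant" is approximated by the k largest singular values.
    return u[:, :k], s[:k], vt[:k, :]


def adapted_weight(weight, u_k, vt_k, delta):
    """Return weight + U_k diag(delta) V_k^T; only `delta` would be trained."""
    # Scaling the columns of U_k by delta is equivalent to U_k @ diag(delta).
    return weight + (u_k * delta) @ vt_k


rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))          # stand-in for one frozen encoder weight
U_k, s_k, Vt_k = svd_split(W, k=2)

# With a zero update the adapted weight equals the frozen one,
# so the pretrained behaviour is the starting point.
W0 = adapted_weight(W, U_k, Vt_k, np.zeros(2))

# A non-zero update changes the weight only inside the chosen rank-2 subspace.
W1 = adapted_weight(W, U_k, Vt_k, np.array([0.5, -0.25]))
```

In this sketch the update has rank at most k and can be merged into the weight after training, which is consistent with the abstract's statement that the original architecture and inference cost are kept.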