Open Access
Multimodal Outlier Optimizer for Textual, Numeric, and Image Data
Author(s) -
Krittika Das,
Nilanjan Dey,
Bitan Misra,
Satyabrata Roy,
R. Simon Sherratt
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3619826
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
Ensuring the quality and reliability of multimodal video data is critical for applications that depend on accurate interpretation, such as medical imaging, surveillance, remote sensing, and intelligent manufacturing. However, outliers across the visual, textual, and numerical data types pose a major challenge. To address this, we propose the Multimodal Outlier Optimizer (MOO), a unified framework that detects and filters outliers from the heterogeneous data modalities within video files. MOO decomposes each video into still images, text, and numeric sequences so that a specialized algorithm can handle each modality: Nonlocal Means (NLM) removes Gaussian noise from image frames, and Local Outlier Factor (LOF) detects contextual outliers in textual and numerical data. The filtered components are then recombined into a cleaned, optimized video. The system is trained and evaluated on synthetically generated datasets, which simulate real-world noise while keeping the experiments scalable and controlled. Performance is assessed with the Jaccard Similarity Score (JSS) and the Structural Similarity Index (SSIM). Results show consistent improvements even under high contamination levels (up to 50%), with SSIM scores above 0.77 across three domains: medical imaging, remote sensing, and zoomed video data. These results highlight MOO’s potential as an effective and adaptable tool for enhancing the integrity of multimodal video data in complex, real-world environments.
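
As a rough illustration of the LOF step and the JSS metric named in the abstract, the sketch below computes Local Outlier Factor scores for a short numeric sequence and compares the flagged indices with a known injected outlier using Jaccard similarity. It is not the authors' implementation. The function names (`lof_scores`, `jaccard`), the neighbour count `k=3`, the 1.5 score threshold, the sample readings, and the use of JSS on index sets are all assumptions made for this example. The NLM and SSIM steps are omitted because they are usually taken from an image-processing library.

```python
import math


def lof_scores(points, k=3):
    """Brute-force Local Outlier Factor for a small list of numeric tuples.

    Simplification (assumption): each point uses exactly its k nearest
    neighbours, so ties at the k-distance are not expanded.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    neighbours = []  # indices of the k nearest neighbours of each point
    k_dist = []      # distance from each point to its k-th nearest neighbour
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        neighbours.append([j for _, j in dists[:k]])
        k_dist.append(dists[k - 1][0])

    def reach_dist(i, j):
        # Reachability distance of point i with respect to neighbour j.
        return max(k_dist[j], math.dist(points[i], points[j]))

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        mean_reach = sum(reach_dist(i, j) for j in neighbours[i]) / k
        lrd.append(1.0 / max(mean_reach, 1e-12))  # guard against duplicates

    # LOF: mean density of the neighbours divided by the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)
    ]


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical numeric sequence extracted from a video; index 6 is injected.
readings = [10.0, 10.2, 9.9, 10.1, 10.3, 9.8, 25.0, 10.4]
scores = lof_scores([(x,) for x in readings], k=3)

# Scores near 1 indicate inliers; the 1.5 cut-off is an illustrative choice.
flagged = [i for i, s in enumerate(scores) if s > 1.5]
cleaned = [x for i, x in enumerate(readings) if i not in flagged]

print("flagged indices:", flagged)
print("JSS vs. injected outliers:", jaccard(flagged, [6]))
```

A score threshold is only one way to turn LOF scores into decisions. A contamination-based cut-off, which flags a fixed fraction of the highest scores, is a common alternative when the expected outlier rate is known.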
