Open Access
Estimation of Performance Bounds for Computational Models of Visual Saliency in Immersive Videos
Author(s) -
Aline F. G. De Sousa,
Clebson Ismael S. Silva,
Aldebaro Klautau,
Ronaldo F. Zampolo
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3592569
Subject(s) - aerospace; bioengineering; communication, networking and broadcast technologies; components, circuits, devices and systems; computing and processing; engineered materials, dielectrics and plasmas; engineering profession; fields, waves and electromagnetics; general topics for engineers; geoscience; nuclear engineering; photonics and electrooptics; power, energy and industry applications; robotics and control systems; signal processing and analysis; transportation
The omnidirectional (360°) video format favors user interaction within an immersive virtual environment, in a manner closer to real-world situations. Several studies have proposed computational models to predict salient regions for this type of media. Currently, strategies inspired by methods originally applied to 2D videos are adapted to assess the performance of attention models for omnidirectional videos, ignoring the fact that immersive media provide the observer with a different experience compared with conventional media. Furthermore, the lack of consolidated proposals for performance limits for 360° videos represents a relevant gap in the literature. This paper evaluates the applicability of 2D approaches (Mean Eye Position and Human Infinite), introduces Equator Bias as a performance bound, and proposes Saliency Sum as an approach for estimating upper limits for visual attention models in immersive media. For validation purposes, we compare these bounds with the outputs of three attention models (DAVE, Static CP360, and Spherical U-Net) under six metrics (AUC-Judd, AUC-Borji, NSS, D_KL, CC, and SIM). The experiments were conducted on a subset of six videos from the PAVS10K dataset, which includes eye-tracking data from approximately 20 observers. In our experiments, the proposed Saliency Sum scored best on three metrics, while Human Infinite achieved the best score on two metrics and was never ranked worst. In contrast, Mean Eye Position showed the lowest performance, obtaining the worst score on five metrics and never ranking best. Equator Bias obtained the worst score on one metric and scores close to the worst on three others. These results suggest that these approaches may serve as upper and lower performance bounds for evaluating saliency models in immersive videos.
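
The abstract gives no implementation details, so the Python sketch below is purely illustrative of the general ideas it names: an equator-bias baseline saliency map on an equirectangular grid, and two of the cited metrics (NSS and CC). The grid size, the Gaussian latitude profile, and sigma_deg are assumptions made here for illustration, not the authors' definitions.

```python
import numpy as np

def equator_bias_map(height, width, sigma_deg=20.0):
    """Equator-bias baseline on an equirectangular grid: saliency mass
    concentrated around 0 degrees latitude. The Gaussian profile and
    sigma_deg are illustrative assumptions, not the paper's definition."""
    latitudes = np.linspace(90.0, -90.0, height)          # top row = +90 degrees
    column = np.exp(-0.5 * (latitudes / sigma_deg) ** 2)  # Gaussian over latitude
    sal = np.tile(column[:, None], (1, width))            # constant along each row
    return sal / sal.sum()                                # normalize to a distribution

def nss(saliency, fixation_mask):
    """Normalized Scanpath Saliency: mean of the z-scored saliency map
    at fixated pixels (fixation_mask is binary)."""
    z = (saliency - saliency.mean()) / (saliency.std() + 1e-12)
    return z[fixation_mask.astype(bool)].mean()

def cc(saliency, human_map):
    """Pearson linear correlation coefficient between two maps."""
    a = saliency.ravel() - saliency.mean()
    b = human_map.ravel() - human_map.mean()
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

# Toy usage: score the baseline against synthetic fixations near the equator
# (a real evaluation would use the PAVS10K eye-tracking data instead).
H, W = 180, 360
baseline = equator_bias_map(H, W)
rng = np.random.default_rng(0)
fixations = np.zeros((H, W))
fixations[rng.integers(60, 120, 50), rng.integers(0, W, 50)] = 1.0
print("NSS:", nss(baseline, fixations))
print("CC :", cc(baseline, fixations))  # crude: CC normally uses a smoothed human map
```

Note that scoring a latitude-only baseline this way is exactly why such maps can act as lower bounds: they encode viewing bias but carry no content-specific information.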
