A Newly Developed Ground Truth Dataset for Visual Saliency in Videos
Author(s) -
Muhammad Zeeshan,
Muhammad Majid,
Imran Fareed Nizami,
Syed Muhammad Anwar,
Ikram Ud Din,
Muhammad Khurram Khan
Publication year - 2018
Publication title -
IEEE Access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/ACCESS.2018.2826562
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
Visual saliency models aim to detect important and eye-catching portions of a scene by exploiting characteristics of the human visual system. The effectiveness of visual saliency models is evaluated by comparing saliency maps with a ground truth dataset. In recent years, several visual saliency computation algorithms and ground truth datasets have been proposed for images; however, there is a lack of ground truth datasets for videos. A new human-labeled ground truth is prepared for video sequences that are commonly used in video coding. The selected videos span different genres, including conversational, sports, outdoor, and indoor, with low, medium, and high motion. A saliency mask is obtained for each video from nine different subjects, who are asked to label the salient region in each frame as a rectangular bounding box. A majority voting criterion is used to construct a final ground truth saliency mask for each frame. Sixteen state-of-the-art visual saliency algorithms are selected for comparison, and their effectiveness is computed quantitatively on the newly developed ground truth. The results show that multiple kernel learning and spectral residual-based saliency algorithms perform best across genres and motion types in terms of F-measure and execution time, respectively.
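The abstract describes two concrete steps: fusing per-subject binary masks by majority vote, and scoring predicted saliency maps against the fused ground truth with an F-measure. A minimal sketch of both steps is shown below; the function names are illustrative, and the weighted F-measure with beta^2 = 0.3 is a common convention in the saliency literature assumed here, not a parameter stated in the abstract.

```python
import numpy as np

def majority_vote_mask(subject_masks):
    """Combine per-subject binary masks (each H x W, values 0/1) into one
    ground-truth mask: a pixel is salient if more than half of the
    subjects labelled it inside their bounding box."""
    stack = np.stack(subject_masks, axis=0)   # shape (S, H, W)
    votes = stack.sum(axis=0)                 # per-pixel vote count
    return (votes > len(subject_masks) / 2).astype(np.uint8)

def f_measure(pred, gt, beta2=0.3, eps=1e-8):
    """Weighted F-measure between a binary predicted mask and the ground
    truth; beta2 = 0.3 weights precision over recall (an assumed,
    literature-standard choice)."""
    tp = np.logical_and(pred == 1, gt == 1).sum()
    precision = tp / (pred.sum() + eps)
    recall = tp / (gt.sum() + eps)
    return (1 + beta2) * precision * recall / (beta2 * precision + recall + eps)
```

In practice each subject's rectangular bounding box would be rasterised into a binary mask per frame before voting, and the F-measure would be averaged over all frames of a sequence.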