Open Access
Real-Time Weakly Supervised Object Detection Using Center-of-Features Localization
Author(s) - Hatem Ibrahem, Ahmed Diefy Ahmed Salem, Hyun-Soo Kang
Publication year - 2021
Publication title - IEEE Access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/access.2021.3064372
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
We propose a high-speed convolutional neural network approach for weakly supervised localization (WSL) and weakly supervised object detection (WSOD). The proposed method, called center-of-features localization (COFL), localizes objects in a visual scene by combining multi-label classification with regression of the number of instances of each object class. A modified Xception network architecture is used as the main feature extractor, and a classification-plus-regression loss function is used to perform the detection task. The method requires no bounding box annotations, only image labels and the count of objects of each class in the image. This combination can produce a clear localization of objects in the scene through a masking technique between class activation maps (CAMs) and regression activation maps (RAMs). The proposed method was trained and tested on the PASCAL VOC2007 and VOC2012 datasets; it attained a mean average precision (mAP) of 47.0% and a correct localization (CorLoc) score of 64.1% on PASCAL VOC2007, and a mAP of 42.3% and a CorLoc of 65.5% on PASCAL VOC2012, while performing object detection at a speed of ~50 fps. These results demonstrate that the network can perform object detection accurately in real time using only image labels and object counts, which are inexpensive to annotate compared with the bounding box annotations typically employed in fully supervised object detection methods. The network far outperforms other weakly supervised methods and some fully supervised methods in terms of processing time while achieving fair accuracy.
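
The abstract names two ingredients, a classification-plus-regression loss and a masking step between CAMs and RAMs, without giving their exact form. The following is a minimal sketch of how such pieces could look, not the authors' implementation. It assumes binary cross-entropy for the class labels, mean squared error for the per-class counts, and a simple relative threshold on the RAM as the mask. The function names, the `weight` and `threshold` parameters, and the choice of a weighted centroid as the "center of features" are all hypothetical.

```python
import math


def classification_plus_regression_loss(class_probs, labels,
                                        pred_counts, true_counts, weight=1.0):
    """Assumed form: mean binary cross-entropy over class labels
    plus `weight` times mean squared error over per-class counts."""
    eps = 1e-7  # clamp probabilities so log() stays finite
    bce = 0.0
    for p, y in zip(class_probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        bce -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    bce /= len(labels)
    mse = sum((c - t) ** 2 for c, t in zip(pred_counts, true_counts))
    mse /= len(true_counts)
    return bce + weight * mse


def masked_center(cam, ram, threshold=0.5):
    """Keep CAM cells where the RAM is at least `threshold` times its peak,
    then return the CAM-weighted centroid (row, col) of the kept cells.
    Returns None when nothing survives the mask."""
    peak = max(max(row) for row in ram)
    if peak <= 0:
        return None
    total = row_sum = col_sum = 0.0
    for r, (cam_row, ram_row) in enumerate(zip(cam, ram)):
        for c, (a, b) in enumerate(zip(cam_row, ram_row)):
            if b >= threshold * peak and a > 0:
                total += a
                row_sum += r * a
                col_sum += c * a
    if total == 0:
        return None
    return (row_sum / total, col_sum / total)


# Toy 3x3 maps for a single class (made-up values, for illustration only).
cam = [[0, 0, 0],
       [0, 1, 1],
       [0, 1, 1]]
ram = [[1, 0, 0],
       [0, 1, 1],
       [0, 1, 1]]
center = masked_center(cam, ram)
loss = classification_plus_regression_loss([0.5], [1], [2.0], [2.0])
```

In this toy case the RAM also fires at the top-left cell, but the CAM is zero there, so only the lower-right 2x2 block contributes to the centroid. A real pipeline would compute the maps per class from the network's feature maps and would need a separate step to turn centers into boxes, which the abstract does not describe.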
