Multiple-Model Fully Convolutional Neural Networks for Single Object Tracking on Thermal Infrared Video
Author(s) -
Mohd Asyraf Zulkifley,
Niki Trigoni
Publication year - 2018
Publication title -
IEEE Access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/access.2018.2859595
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
The availability of affordable thermal infrared (TIR) cameras has spurred their use in various research fields, especially in cases where images must be captured in dark surroundings. One of the low-level tasks required by most TIR-based research is tracking an object throughout a video sequence. The main challenge posed by TIR cameras is the lack of texture to differentiate two nearby objects of the same class. In the VOT-TIR 2016 challenge, the best fully convolutional neural network (FCNN)-based tracker managed only third place. The discriminative ability of the FCNN tracker is not fully utilized because of the homogeneous appearance pattern of the tracked object. This paper aims to improve the ability of FCNN-based trackers to predict object location through a comprehensive sampling approach as well as a better scoring scheme. Hence, a multiple-model FCNN is proposed, in which a small set of fully connected layers is updated on top of pre-trained convolutional neural networks. The possible object locations are generated by a two-stage sampling that combines stochastically distributed samples with clustered foreground contour information. The best sample is selected according to a combined score of appearance similarity, predicted location, and model reliability. The small set of appearance models is updated using positive and negative training samples accumulated over two periods of time: the recent and parent node intervals. To further improve training accuracy, the samples are generated according to a set of adaptive variances that depends on the trustworthiness of the tracker output. The results show an improvement over TCNN, the FCNN-based tracker that won the VOT 2016 challenge, with the expected average overlap increasing from 0.248 to 0.257. The performance enhancement is attributed to better robustness, with a 20% reduction in tracking failure rate compared to TCNN.
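The combined scoring of candidate locations described in the abstract can be sketched as follows. This is a minimal illustration, assuming a simple weighted linear combination of the three cues; the weights, score ranges, and `Candidate` structure are assumptions for illustration, not the paper's exact formulation:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    appearance: float   # appearance similarity to the tracked object, in [0, 1]
    location: float     # agreement with the predicted location, in [0, 1]
    reliability: float  # reliability of the appearance model used, in [0, 1]

def combined_score(c, w_app=0.5, w_loc=0.3, w_rel=0.2):
    # Weighted sum of the three cues; the weights are illustrative only.
    return w_app * c.appearance + w_loc * c.location + w_rel * c.reliability

def select_best(candidates):
    # Pick the sample with the highest combined score.
    return max(candidates, key=combined_score)

candidates = [
    Candidate(appearance=0.9, location=0.2, reliability=0.8),
    Candidate(appearance=0.7, location=0.9, reliability=0.9),
    Candidate(appearance=0.4, location=0.5, reliability=0.6),
]
best = select_best(candidates)
```

Here the second candidate wins despite a lower appearance similarity, because a strong location prediction and a reliable model compensate; this mirrors how combining cues can reject visually similar distractors in texture-poor TIR imagery.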

