
Deeper Siamese network with multi‐level feature fusion for real‐time visual tracking
Author(s) -
Yang Kang,
Song Huihui,
Zhang Kaihua,
Fan Jiaqing
Publication year - 2019
Publication title -
Electronics Letters
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.375
H-Index - 146
ISSN - 1350-911X
DOI - 10.1049/el.2019.1041
Subject(s) - discriminative model , bittorrent tracker , computer science , fuse (electrical) , artificial intelligence , feature (linguistics) , tracking (education) , computer vision , pattern recognition (psychology) , eye tracking , representation (politics) , engineering , pedagogy , linguistics , psychology , philosophy , electrical engineering , politics , law , political science
In recent years, Siamese networks (SiamN) have achieved great success in visual tracking in terms of both accuracy and efficiency. Nevertheless, most SiamN‐based trackers employ a shallow network such as AlexNet and use its top‐layer features as the target representation. These features are less discriminative, which usually degrades tracking performance under large deformation and in the presence of similar distractors. A straightforward remedy is to replace the SiamN backbone with a deeper ResNet. However, this alone does not boost performance much, because the high‐level feature maps have low resolution and lose useful spatial details. To address this issue, the authors propose a lightweight yet effective feature agglomeration module (FAM) that adaptively fuses low‐level and high‐level features for robust tracking. Specifically, they first develop a generalised non‐local attention module to enhance the discriminative capability of high‐level semantic features. Then, they design an inception‐like module to enhance the representative power of low‐level features, which carry more spatial details. Both types of features are then adaptively fused in the FAM so that they complement each other. Extensive evaluations on OTB‐2015 and the VOT2017 challenge demonstrate that the proposed tracker consistently achieves favourable performance against several state‐of‐the‐art trackers while running at 50 fps.
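The general idea of matching a template against a search region at two feature levels and then fusing the results can be illustrated with a toy sketch. This is not the authors' FAM: the abstract does not specify its design, so the example below assumes single‐channel feature maps already brought to the same resolution, plain cross‐correlation as the Siamese matching step, and a softmax over two scalar logits as a stand‐in for adaptive fusion weights (in a trained tracker such weights would be learned). All function names and values are hypothetical.

```python
import math

def cross_correlate(search, template):
    """Slide `template` over `search` (2-D lists) and return the response map."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    response = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            score = sum(search[i + di][j + dj] * template[di][dj]
                        for di in range(th) for dj in range(tw))
            row.append(score)
        response.append(row)
    return response

def softmax(logits):
    """Numerically stable softmax over a list of scalars."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(low_resp, high_resp, logits):
    """Weighted sum of two same-sized response maps.

    ASSUMPTION: two scalar logits stand in for learned fusion weights.
    """
    w_low, w_high = softmax(logits)
    return [[w_low * l + w_high * h for l, h in zip(lr, hr)]
            for lr, hr in zip(low_resp, high_resp)]

def argmax2d(resp):
    """Return (row, col) of the first maximum in a 2-D list."""
    best, best_pos = resp[0][0], (0, 0)
    for i, row in enumerate(resp):
        for j, v in enumerate(row):
            if v > best:
                best, best_pos = v, (i, j)
    return best_pos

# Toy "feature maps": the target occupies rows 1-2, columns 2-3.
search_low = [[0, 0, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 0, 0]]
# Hypothetical high-level map with a stronger activation at the same place.
search_high = [[2 * v for v in row] for row in search_low]
template = [[1, 1],
            [1, 1]]

low_resp = cross_correlate(search_low, template)
high_resp = cross_correlate(search_high, template)
fused = fuse(low_resp, high_resp, logits=[0.0, 0.0])  # equal weights
peak = argmax2d(fused)
print(peak, fused[peak[0]][peak[1]])
```

With equal logits the fused map is the plain average of the two responses; shifting the logits moves the balance towards spatial detail (low level) or semantics (high level), which is the trade‐off the abstract describes.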