
Integration Between Cascade Region-Based Convolutional Neural Network and Bi-Directional Feature Pyramid Network for Live Object Tracking and Detection
Author(s) - Zhong Le-hai, Jiao Li, Feifan Zhou, Xiaoan Bao, Weiyin Xing, Zhengyong Han, Jie Luo
Publication year - 2021
Publication title - Traitement du Signal
Language(s) - English
Resource type - Journals
eISSN - 1958-5608
pISSN - 0765-0019
DOI - 10.18280/ts.380437
Subject(s) - cascade, artificial intelligence, computer science, pyramid (geometry), convolutional neural network, object detection, pattern recognition (psychology), feature (linguistics), computer vision, tracking (education), feature extraction, video tracking, frame (networking), object (grammar), mathematics, engineering, telecommunications, psychology, pedagogy, linguistics, philosophy, geometry, chemical engineering
Current target tracking and detection algorithms often produce false detections and missed detections when the target is occluded or small. To overcome these defects, this paper integrates a bi-directional feature pyramid network (BiFPN) into the cascade region-based convolutional neural network (Cascade R-CNN) for live object tracking and detection. Specifically, the BiFPN structure is used to connect feature maps across scales and to fuse weighted features more efficiently, which enhances the network's feature extraction ability and improves detection of occluded and small targets. The proposed method, Cascade R-CNN fused with BiFPN, was compared with two target detection algorithms, the original Cascade R-CNN and the single shot multibox detector (SSD), on a video frame dataset of wild animals. Our method achieved a mean average precision (mAP) of 91%, higher than that of both SSD and Cascade R-CNN. In addition, our method took only 0.42 s to detect each image, i.e., real-time detection was achieved. The experimental results show that the proposed live object tracking and detection model adapts well to complex detection environments and achieves an excellent detection effect.
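The weighted feature fusion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the fusion rule, so this example assumes the "fast normalized fusion" commonly associated with BiFPN, in which each input gets a learnable scalar weight that is clipped to be non-negative and then normalized by the sum of all weights. The function name `fast_normalized_fusion`, the `eps` value, and the use of flat lists in place of feature maps are illustrative choices, not details taken from the paper.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,  # assumed small constant to avoid division by zero
) -> List[float]:
    """Fuse equally sized feature vectors using non-negative, normalized weights."""
    if not features or len(features) != len(weights):
        raise ValueError("need one weight per feature vector")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature vectors must have the same length")

    # ReLU on the weights keeps every contribution non-negative.
    clipped = [max(0.0, w) for w in weights]
    total = sum(clipped) + eps

    # Weighted sum per position, normalized by the total weight.
    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / total
        for i in range(length)
    ]


# Hypothetical inputs standing in for two resized feature maps at one pyramid level.
top_down = [1.0, 2.0, 3.0]
lateral = [3.0, 2.0, 1.0]
fused_equal = fast_normalized_fusion([top_down, lateral], [1.0, 1.0])
fused_clipped = fast_normalized_fusion([top_down, lateral], [-1.0, 2.0])
```

With equal weights the result is close to the element-wise mean of the two inputs. In the second call the negative weight is clipped to zero, so the output follows `lateral` almost exactly. In a real network the inputs would be multi-channel feature maps resized to a common resolution, and the weights would be learned during training.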