Open Access
Object detection based on RGC Mask R‐CNN
Author(s) -
Wu Minghu,
Yue Hanhui,
Wang Juan,
Huang Yongxi,
Liu Min,
Jiang Yuhan,
Ke Cong,
Zeng Cheng
Publication year - 2020
Publication title -
IET Image Processing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.401
H-Index - 45
eISSN - 1751-9667
pISSN - 1751-9659
DOI - 10.1049/iet-ipr.2019.0057
Subject(s) - pascal (unit) , artificial intelligence , computer science , pattern recognition (psychology) , backbone network , overfitting , convolutional neural network , object detection , pyramid (geometry) , feature (linguistics) , minimum bounding box , computer vision , artificial neural network , image (mathematics) , mathematics , computer network , linguistics , philosophy , geometry , programming language
Object detection is a crucial topic in computer vision. Methods based on the Mask Region‐based Convolutional Neural Network (Mask R‐CNN), in which a large intersection over union (IoU) threshold is chosen to obtain high‐quality samples, have often been employed for object detection. However, the detection performance of such methods deteriorates when the number of samples is reduced. To address this, the authors propose an improved Mask R‐CNN‐based method: the ResNet Group Cascade (RGC) Mask R‐CNN. First, they compared ResNet backbones of different depths, finding that ResNeXt‐101‐64×4d is superior to the other backbone networks. Secondly, during training, the performance of Mask R‐CNN suffered from a small batch size, which resulted in inaccurately calculated mean and variance; thus, group normalisation was added to the backbone, the feature pyramid network neck and the bounding box head of the network. Finally, the higher the IoU threshold used in Mask R‐CNN, the easier it is to obtain high‐quality samples. However, blindly selecting a high threshold leads to sample reduction and overfitting. Thus, a cascade network configuration with three IoU thresholds was used during model training. The model was trained and tested on the COCO and PASCAL VOC07 datasets. The proposed algorithm demonstrated superior performance compared with Mask R‐CNN.
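
The trade-off between a high IoU threshold and the number of usable samples can be illustrated with a minimal sketch. The abstract does not state the three threshold values, so the values 0.5, 0.6 and 0.7 below are an assumption chosen for illustration; the function names, the example boxes and the ground-truth box are likewise hypothetical. The sketch only counts positives per threshold and does not model the box refinement that a cascade performs between stages.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if any).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_positives(proposals, gt_box, threshold):
    """Number of proposals whose IoU with the ground truth meets the threshold."""
    return sum(1 for p in proposals if iou(p, gt_box) >= threshold)


# Assumed values for illustration only; the abstract says "three IoU
# thresholds" without giving them.
ASSUMED_THRESHOLDS = (0.5, 0.6, 0.7)

# Hypothetical ground-truth box and proposals shifted progressively to the right.
gt = (0, 0, 100, 100)
proposals = [
    (0, 0, 100, 100),
    (10, 0, 110, 100),
    (20, 0, 120, 100),
    (30, 0, 130, 100),
    (50, 0, 150, 100),
]

positives_per_stage = [count_positives(proposals, gt, t) for t in ASSUMED_THRESHOLDS]
print(positives_per_stage)
```

With these made-up boxes, the number of positive samples shrinks as the threshold rises, which is the sample reduction the abstract warns about when a single high threshold is chosen blindly.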