Lightweight Spatial Sliced-Concatenate-Multireceptive-Field Enhance and Joint Channel Attention Mechanism for Infrared Object Detection

Zhiheng Pan, Liuchao Xu, Chuandong Liang, Kui Pan, Mi Zhao, Min Lu

Open Access

Publication year - 2022

Publication title - IEEE Access

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.587

H-Index - 127

ISSN - 2169-3536

DOI - 10.1109/access.2022.3172504

Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation

Infrared object detection has high application value in remote sensing because of its anti-interference ability and long detection distance. However, infrared images suffer from poor fine-grained information, low resolution, and low contrast, so conventional object detection methods perform rather poorly on them. This study proposes two novel lightweight attention mechanisms to address the problem. The sliced-concatenate and multi-receptive-field spatial group-wise enhance (SCMR-SGE) module operates on grouped features: it enhances the sub-features by generating an attention factor at each location in each semantic group, and it suppresses irrelevant information. The joint attention module selectively enhances or inhibits channel information through attention factors generated by three different pooling layers. Unlike previous work, each module is used only once, and the two modules are embedded into the feature pyramid network (FPN) instead of the backbone network. Based on YOLOv5m alone, our method reached an mAP50 of 82.7%, which was the best result on the original FLIR dataset, on which the imbalanced-sample problem was left unprocessed. At the same time, the detection speed is still maintained at around 60 FPS on a single GPU. Our experiments demonstrated that our lightweight attention mechanisms perform better than mainstream ones, and that the method of embedding our attention mechanisms into a CNN is effective and universal.
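To make the two attention ideas in the abstract concrete, here is a minimal NumPy sketch, not the authors' implementation. The first function follows the baseline spatial group-wise enhance (SGE) recipe from earlier work, which produces one attention factor per location per semantic group. The sliced-concatenate and multi-receptive-field parts of SCMR-SGE are not described in the abstract and are left out. The second function only illustrates gating channels with three pooled descriptors. The abstract does not say which three pooling operations are used or how they are fused, so average, max, and standard-deviation pooling and the scalar weights `w_avg`, `w_max`, `w_std` are purely assumptions, as are all function and parameter names.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def spatial_groupwise_enhance(x, groups, gamma=None, beta=None, eps=1e-5):
    """Baseline SGE-style attention on a (C, H, W) feature map.

    gamma and beta stand in for learned per-group scale and shift
    parameters; the defaults here are placeholders, not trained values.
    """
    c, h, w = x.shape
    assert c % groups == 0, "channels must divide evenly into groups"
    gamma = np.ones(groups) if gamma is None else np.asarray(gamma)
    beta = np.zeros(groups) if beta is None else np.asarray(beta)

    # Split channels into semantic groups: (G, C/G, H*W).
    xg = x.reshape(groups, c // groups, h * w)
    # Global descriptor of each group: average over all locations.
    g = xg.mean(axis=2, keepdims=True)                      # (G, C/G, 1)
    # Similarity of each location's feature vector to the descriptor.
    coef = (xg * g).sum(axis=1)                             # (G, H*W)
    # Normalise the coefficients over locations within each group.
    mean = coef.mean(axis=1, keepdims=True)
    std = coef.std(axis=1, keepdims=True)
    coef = (coef - mean) / (std + eps)
    # One attention factor in (0, 1) per location per group.
    attn = sigmoid(gamma[:, None] * coef + beta[:, None])   # (G, H*W)
    return (xg * attn[:, None, :]).reshape(c, h, w)


def joint_channel_attention(x, w_avg=1.0, w_max=1.0, w_std=1.0):
    """Channel gate from three pooled descriptors.

    ASSUMPTION: avg/max/std pooling and the weighted-sum fusion are
    illustrative choices only; the abstract does not specify them.
    """
    c = x.shape[0]
    flat = x.reshape(c, -1)
    fused = (w_avg * flat.mean(axis=1)
             + w_max * flat.max(axis=1)
             + w_std * flat.std(axis=1))                    # (C,)
    gate = sigmoid(fused)                                   # (C,)
    # Scale each channel up or down by its gate value.
    return x * gate[:, None, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 4, 4))   # toy (C, H, W) feature map
    out = joint_channel_attention(spatial_groupwise_enhance(feat, groups=4))
    print(out.shape)
```

Both functions return a tensor with the same shape as their input, which is what would let such modules be dropped between existing layers (for example inside an FPN, as the abstract describes) without changing the surrounding architecture.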
