Open Access
Pillars-SCANet: A 3D Object Detection Algorithm Integrating Multi-Head Spatial and Channel Attention with Feature Pyramid
Author(s) -
Hao Jiang,
Ge Peng,
Xin Wang,
He Huang,
Junxing Yang
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3596703
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Point cloud 3D object detection has attracted growing attention because it represents three-dimensional environments with the precision that autonomous driving requires. However, prevalent detection methods adapt poorly to variation in target scale, which leads to inadequate detection across different target types. Furthermore, voxel-based methods, which are commonly adopted to speed up detection, convert point clouds into voxels or pillars, and this transformation often neglects the disparity between the horizontal and vertical receptive fields when pillars are used to generate 2D pseudo-images. To mitigate these limitations, this study introduces Pillars-SCANet, a model equipped with a novel multi-scale feature extraction network and an adaptive attention feature fusion network. The former employs grouped residual attention modules that balance the horizontal and vertical receptive fields within the pillar encoding module, and it proceeds through four stages to construct a multi-level feature pyramid. The latter improves the model's adaptability to various target sizes by guiding features across both channel and spatial dimensions. Extensive experimental results indicate that Pillars-SCANet strikes a favorable balance between inference speed and detection accuracy: the model has only 6.63M parameters and achieves an inference speed of 24 FPS. Evaluated on the KITTI dataset, Pillars-SCANet attains mean average precision (mAP) of 69.85%, 62.27%, and 71.68% on the BEV, 3D detection box, and AOS benchmarks, respectively. These results represent improvements of 3.66%, 3.07%, and 2.82% over the PointPillars network.
On the Waymo dataset, Pillars-SCANet achieves a mean Average Precision (mAP) of 72.3% and a mean Average Precision weighted by Heading (mAPH) of 69.6%, improvements of 9.5% and 11.8%, respectively, over the PointPillars network.
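
The pillar step mentioned in the abstract, in which a point cloud is binned into vertical columns that form a 2D pseudo-image, can be illustrated with a minimal sketch. This is not the Pillars-SCANet implementation: the function names `pillarize` and `pseudo_image`, the grid ranges, and the pillar size are hypothetical, and the learned pillar encoder is replaced by a simple stand-in (the maximum height per pillar) so that the example stays self-contained.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) coordinates
Cell = Tuple[int, int]              # (row, col) index on the x-y grid


def pillarize(
    points: List[Point],
    x_range: Tuple[float, float] = (0.0, 4.0),  # assumed grid extent
    y_range: Tuple[float, float] = (0.0, 4.0),  # assumed grid extent
    pillar_size: float = 1.0,                   # assumed pillar width
) -> Dict[Cell, List[Point]]:
    """Group points into vertical pillars on a regular x-y grid."""
    pillars: Dict[Cell, List[Point]] = defaultdict(list)
    for x, y, z in points:
        # Points outside the grid are discarded.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        col = int((x - x_range[0]) // pillar_size)
        row = int((y - y_range[0]) // pillar_size)
        pillars[(row, col)].append((x, y, z))
    return dict(pillars)


def pseudo_image(
    pillars: Dict[Cell, List[Point]], n_rows: int, n_cols: int
) -> List[List[float]]:
    """Build a one-channel pseudo-image from the pillars.

    Stand-in feature: maximum z per pillar, 0.0 for empty pillars.
    A real detector would use a learned per-pillar encoder instead.
    """
    image = [[0.0] * n_cols for _ in range(n_rows)]
    for (row, col), pts in pillars.items():
        image[row][col] = max(p[2] for p in pts)
    return image
```

Because every pillar spans the full height of the scene but only one grid cell horizontally, a 2D convolution over this image sees the two directions very differently, which may be the kind of receptive-field disparity the abstract refers to.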
