Side-scan Sonar Image Synthesis with CycleGAN Enhanced by Shape-Adaptive Convolution and Multi-Scale Hybrid Attention
Author(s) -
Yanjie Wang,
Yunfei Chen,
Dapeng Yu,
Zhen Zhang
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3614197
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Large amounts of side-scan sonar (SSS) data are essential for training effective underwater object detection models. However, real-world SSS surveys are complex and costly, and the scarcity of open-source data hinders the training and performance of these models. To address this shortage, we propose an enhanced Cycle-Consistent Generative Adversarial Network (CycleGAN) framework that integrates Shape-Adaptive Convolution (SAConv) and Multi-Scale Hybrid Attention (MSHA) modules to generate SSS images. Specifically, SAConv improves the structural consistency and contour integrity of the synthesized images by adapting convolutional kernels to object shapes, while MSHA enhances texture fidelity and spatial detail by capturing both local and global contextual information at multiple scales. Experimental results demonstrate that the generated sonar images outperform the baseline CycleGAN, with gains of 8.2% in PSNR and 30.2% in SSIM, and improvements of 3.8% in LPIPS, 5.1% in FID, and 29.9% in KID (lower values are better for these three metrics). Furthermore, incorporating the generated images into the training set of the YOLOv12 detector improved precision by 7.5%, recall by 11.4%, mAP@0.5 by 1.0%, and mAP@0.5:0.95 by 9.0%. These results confirm the effectiveness of the proposed framework in generating high-quality SSS data to support downstream perception tasks.
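The PSNR figures reported above follow the standard definition, PSNR = 10 log10(MAX^2 / MSE). A minimal sketch of that computation, plus the relative-improvement percentage used in the abstract, is below; the function names are illustrative and not taken from the paper, and images are represented as flat pixel lists for simplicity:

```python
import math

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-size images,
    here given as flat lists of pixel intensities in [0, max_val]."""
    if len(ref) != len(test) or not ref:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return float("inf")  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_val ** 2 / mse)

def relative_gain(baseline, improved):
    """Percentage improvement over a baseline score (higher-is-better metric)."""
    return (improved - baseline) / baseline * 100.0

# Example: a uniform error of 255 against max_val=255 gives MSE = 255^2,
# so PSNR = 10*log10(1) = 0 dB.
print(psnr([0, 0, 0], [255, 255, 255]))  # 0.0
```

For SSIM, LPIPS, FID, and KID the same relative-gain arithmetic applies, except that for the latter three a lower score is better, so the improvement is a percentage reduction.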