Open Access
HiTrans-SAM: Hierarchical Transformer Encoder and SAM-Augmented Inputs for Multi-Scale Remote Sensing Image Segmentation
Author(s) -
Yulian Li,
Jiyang Gao,
Yikang Du,
Yuxuan Xiao,
Zhengjie Gao,
Haitao Huang
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3617388
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Semantic segmentation of remote sensing images is challenging because of complex scenes, large variations in object scale, and ambiguous boundaries. In this study, we propose HiTrans-SAM (Hierarchical Transformer Encoder and SAM-Augmented Inputs for Multi-Scale Remote Sensing Image Segmentation), a framework built on an encoder-decoder architecture. First, before encoding, the input image is augmented with boundary prior maps generated by the Segment Anything Model (SAM), which mitigates boundary ambiguity. Next, a Hierarchical Transformer Encoder is integrated into the encoding network to facilitate information propagation; this module captures high-resolution spatial details while leveraging global contextual relationships. During decoding, multi-scale feature fusion combines features across scales to improve segmentation accuracy. Experiments on the LoveDA, Potsdam, and Vaihingen datasets demonstrate state-of-the-art performance, with mean Intersection over Union (mIoU) values of 53.52% (LoveDA), 79.45% (Potsdam), and 75.12% (Vaihingen), significantly outperforming existing methods. The results validate the effectiveness of boundary refinement, context modeling, and multi-scale feature fusion for improving segmentation accuracy.
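The mIoU figures quoted in the abstract follow the standard definition: for each class, IoU is the number of true-positive pixels divided by the sum of true positives, false positives, and false negatives, and mIoU is the average over classes. The following is a minimal sketch of that computation, not the authors' evaluation code. The function names, the toy label maps, and the choice to skip classes absent from both maps are assumptions made for illustration; the abstract does not describe the exact evaluation protocol.

```python
from typing import Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> list[list[int]]:
    """Count pixels: rows index the ground-truth class, columns the predicted class."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: list[list[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN).

    Assumption: classes that appear in neither map are skipped
    rather than counted as IoU 0.
    """
    ious = []
    for c in range(len(matrix)):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                 # true class c, predicted otherwise
        fp = sum(row[c] for row in matrix) - tp  # predicted c, true class differs
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x3 label maps, flattened row by row (hypothetical data, 3 classes).
ground_truth = [0, 0, 1, 1, 2, 2]
prediction = [0, 1, 1, 1, 2, 0]
print(f"mIoU = {mean_iou(confusion_matrix(ground_truth, prediction, 3)):.4f}")
# prints "mIoU = 0.5000" (per-class IoU: 1/3, 2/3, 1/2)
```

On real imagery the label maps would normally be held as arrays and the confusion matrix accumulated over the whole test set before averaging; the pure-Python version here only keeps the arithmetic visible.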
