Open Access
Swin Transformer with Spatial and Local Context Augmentation for Enhanced Semantic Segmentation of Remote Sensing Images
Author(s) - Rong-Xing Ding, Yi-Han Xu, Gang Yu, Wen Zhou, Ding Zhou
Publication year - 2025
Publication title - IEEE Open Journal of Signal Processing
Language(s) - English
Resource type - Magazines
eISSN - 2644-1322
DOI - 10.1109/ojsp.2025.3573202
Subject(s) - signal processing and analysis
Semantic segmentation of remote sensing images is widely used in crop cover and type analysis and in environmental monitoring. Owing to the specific characteristics of remote sensing images, segmentation requires not only local context; global context information also plays an important role. Inspired by the powerful global modelling capability of the Swin Transformer, we propose LSENet, a network that follows the encoder-decoder architecture of UNet. In the encoding stage, we propose a spatial enhancement module (SEM), which helps the Swin Transformer further enhance feature extraction by encoding spatial information. In the decoding stage, we propose a local enhancement module (LEM), which is embedded in the Swin Transformer and helps the network obtain more local semantic information so that pixels are classified more accurately. In edge regions in particular, adding the LEM yields smoother edges. The experimental results on the Vaihingen and Potsdam datasets demonstrate the effectiveness of our proposed method. Specifically, the mIoU is 78.58% on the Potsdam dataset, 72.59% on the Vaihingen dataset, and 64.49% on the OpenEarthMap dataset.
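
The mIoU figures reported above are class-averaged intersection-over-union scores. The abstract does not give the exact evaluation protocol (for example, which classes are averaged or whether a clutter class is excluded), so the following is only a minimal sketch of the standard definition, not the authors' evaluation code. The function names and the toy labels are hypothetical.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Rows index the ground-truth class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither the labels nor the predictions are
    skipped (an assumption; protocols differ on this point).
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class otherwise
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: flattened label maps with three classes (made-up values).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
miou = mean_iou(cm)  # per-class IoU is 1/3, 2/3, 1/2, so the mean is 0.5
```

In practice the confusion matrix would be accumulated over every pixel of every test tile before the per-class IoU values are averaged.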
