Open Access
A Remote Sensing Semantic Self-Supervised Segmentation Model Integrating Local Sensitivity and Global Invariance
Author(s) -
Buxun Zhang,
Xiaoyan Guo,
Sen Yang
Publication year - 2025
Publication title -
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Language(s) - English
Resource type - Journal
SCImago Journal Rank - 1.246
H-Index - 88
eISSN - 2151-1535
pISSN - 1939-1404
DOI - 10.1109/JSTARS.2025.3572960
Subject(s) - Geoscience; Signal Processing and Analysis; Power, Energy and Industry Applications
Self-supervised semantic segmentation is a crucial approach to addressing the shortage of labeled data. However, traditional self-supervised learning methods designed for natural images are often unsuitable for remote sensing images, as they struggle to capture local and global information simultaneously. To address this challenge, this paper proposes MSCSL, a self-supervised learning framework tailored to the characteristics of remote sensing images. The framework combines the strengths of contrastive learning and masked image modeling, attending both to global feature information that is invariant to augmentations and to the rich internal detail of the images. To fully exploit the self-supervised signals in the original image, a parallel self-supervised signal generation module is introduced that updates the network from three perspectives: masked image reconstruction, semantic slot matching contrast, and global view alignment. This enables the model to learn feature representations that balance local sensitivity and global invariance in semantically complex remote sensing images. To mitigate the loss of small objects during masking, a portion of the original pixels is retained. Furthermore, semantic grouping is incorporated to improve the differentiation of distinct semantic features, making MSCSL better suited to remote sensing images, which contain a large number of small objects and complex semantics. Experimental results demonstrate that MSCSL achieves mIoU values of 80.49% and 75.32% on the GID and DeepGlobe Land Cover datasets, respectively. Compared with BYOL, Barlow Twins, MoCo v2, MAE, SimMIM, CMID, and IndexNet, MSCSL outperforms these methods across all three evaluation metrics (mIoU, aAcc, and mF1) and achieves optimal performance even with limited data.
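
For readers who want a concrete picture of the training objective the abstract describes, below is a minimal PyTorch-style sketch of how the three parallel self-supervised signals (masked image reconstruction, semantic slot matching contrast, and global view alignment) and the pixel-retaining mask could be combined into one loss. It is an illustration based only on this abstract: the function names, the InfoNCE formulation for slot matching, the BYOL-style alignment term, and all hyperparameters (mask_ratio, retain_frac, tau, loss weights) are assumptions, not the paper's published implementation.

    import torch
    import torch.nn.functional as F

    def mask_with_retention(images, patch=16, mask_ratio=0.6, retain_frac=0.25):
        # Zero out a random subset of patches, but keep a random fraction of
        # pixels inside masked patches so small objects leave a visible trace.
        # Patch size, mask ratio, and retention fraction are illustrative.
        B, C, H, W = images.shape
        gh, gw = H // patch, W // patch
        patch_mask = torch.rand(B, 1, gh, gw, device=images.device) < mask_ratio
        pixel_mask = patch_mask.repeat_interleave(patch, 2).repeat_interleave(patch, 3)
        keep = torch.rand(B, 1, H, W, device=images.device) < retain_frac
        pixel_mask = pixel_mask & ~keep          # retained pixels stay visible
        return images * (~pixel_mask), pixel_mask

    def three_signal_loss(recon, target, pixel_mask,
                          slots_a, slots_b, global_a, global_b,
                          w_rec=1.0, w_slot=1.0, w_glob=1.0, tau=0.1):
        # 1) Masked image reconstruction: L2 over masked pixels only.
        m = pixel_mask.float()
        loss_rec = ((recon - target) ** 2 * m).sum() / m.sum().clamp(min=1)

        # 2) Semantic slot matching contrast: InfoNCE between per-slot
        #    features of two augmented views, each of shape (B, K, D).
        a = F.normalize(slots_a.flatten(0, 1), dim=-1)
        b = F.normalize(slots_b.flatten(0, 1), dim=-1)
        logits = a @ b.t() / tau
        labels = torch.arange(a.size(0), device=a.device)
        loss_slot = F.cross_entropy(logits, labels)

        # 3) Global view alignment: BYOL-style negative cosine similarity
        #    between global embeddings of the two views, shape (B, D).
        loss_glob = 2 - 2 * F.cosine_similarity(global_a, global_b, dim=-1).mean()

        return w_rec * loss_rec + w_slot * loss_slot + w_glob * loss_glob

In practice the reconstruction target would come from a decoder over the partially masked input, and the per-slot features from the semantic grouping head; neither component is specified here beyond what the abstract states.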
