Open Access
A Remote Sensing Semantic Self-Supervised Segmentation Model Integrating Local Sensitivity and Global Invariance
Author(s) -
Buxun Zhang,
Xiaoyan Guo,
Sen Yang
Publication year - 2025
Publication title -
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Language(s) - English
Resource type - Journal
SCImago Journal Rank - 1.246
H-Index - 88
eISSN - 2151-1535
pISSN - 1939-1404
DOI - 10.1109/JSTARS.2025.3572960
Subject(s) - Geoscience; Signal Processing and Analysis; Power, Energy and Industry Applications
Self-supervised semantic segmentation is a crucial approach to addressing the shortage of labeled data. However, traditional self-supervised learning methods designed for natural images are often unsuitable for remote sensing images, as they struggle to capture local and global information simultaneously. To address this challenge, this paper proposes MSCSL, a self-supervised learning framework tailored to the characteristics of remote sensing images. The framework combines the strengths of contrastive learning and masked image modeling, attending both to global feature information that is invariant to augmentations and to the rich internal detail of the images. To fully exploit the self-supervised signals in the original image, a parallel self-supervised signal generation module is introduced that updates the network from three perspectives: masked image reconstruction, semantic slot matching contrast, and global view alignment. This enables the model to learn feature representations that balance local sensitivity and global invariance in semantically complex remote sensing images. To mitigate the loss of small objects during masking, a portion of the original pixels is retained. Furthermore, semantic grouping is incorporated to improve the differentiation of distinct semantic features, making MSCSL better suited to remote sensing images, which contain a large number of small objects and complex semantics. Experimental results demonstrate that MSCSL achieves mIoU values of 80.49% and 75.32% on the GID and DeepGlobe Land Cover datasets, respectively. Compared with BYOL, Barlow Twins, MoCo v2, MAE, SimMIM, CMID, and IndexNet, MSCSL outperforms these methods across all three evaluation metrics (mIoU, aAcc, and mF1) and achieves optimal performance even with limited data.
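
For readers who want a concrete picture of the training objective the abstract describes, below is a minimal PyTorch-style sketch of how the three parallel self-supervised signals (masked image reconstruction, semantic slot matching contrast, and global view alignment) and the pixel-retaining mask could be combined into one loss. It is an illustration based only on this abstract: the function names, the InfoNCE formulation for slot matching, the BYOL-style alignment term, and all hyperparameters (mask_ratio, retain_frac, tau, loss weights) are assumptions, not the paper's published implementation.

    import torch
    import torch.nn.functional as F

    def mask_with_retention(images, patch=16, mask_ratio=0.6, retain_frac=0.25):
        # Zero out a random subset of patches, but keep a random fraction of
        # pixels inside masked patches so small objects leave a visible trace.
        # Patch size, mask ratio, and retention fraction are illustrative.
        B, C, H, W = images.shape
        gh, gw = H // patch, W // patch
        patch_mask = torch.rand(B, 1, gh, gw, device=images.device) < mask_ratio
        pixel_mask = patch_mask.repeat_interleave(patch, 2).repeat_interleave(patch, 3)
        keep = torch.rand(B, 1, H, W, device=images.device) < retain_frac
        pixel_mask = pixel_mask & ~keep          # retained pixels stay visible
        return images * (~pixel_mask), pixel_mask

    def three_signal_loss(recon, target, pixel_mask,
                          slots_a, slots_b, global_a, global_b,
                          w_rec=1.0, w_slot=1.0, w_glob=1.0, tau=0.1):
        # 1) Masked image reconstruction: L2 over masked pixels only.
        m = pixel_mask.float()
        loss_rec = ((recon - target) ** 2 * m).sum() / m.sum().clamp(min=1)

        # 2) Semantic slot matching contrast: InfoNCE between per-slot
        #    features of two augmented views, each of shape (B, K, D).
        a = F.normalize(slots_a.flatten(0, 1), dim=-1)
        b = F.normalize(slots_b.flatten(0, 1), dim=-1)
        logits = a @ b.t() / tau
        labels = torch.arange(a.size(0), device=a.device)
        loss_slot = F.cross_entropy(logits, labels)

        # 3) Global view alignment: BYOL-style negative cosine similarity
        #    between global embeddings of the two views, shape (B, D).
        loss_glob = 2 - 2 * F.cosine_similarity(global_a, global_b, dim=-1).mean()

        return w_rec * loss_rec + w_slot * loss_slot + w_glob * loss_glob

In practice the reconstruction target would come from a decoder over the partially masked input, and the per-slot features from the semantic grouping head; neither component is specified here beyond what the abstract states.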
