Open Access
Cross-modality Consistency Network for Remote Sensing Text-image Retrieval
Author(s) - Yuchen Sha, Yujian Feng, Miao He, Yichi Jin, Shuai You, Yimu Ji, Fei Wu, Shangdong Liu, Shaoshuai Che
Publication year - 2025
Publication title - IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 1.246
H-Index - 88
eISSN - 2151-1535
pISSN - 1939-1404
DOI - 10.1109/jstars.2025.3586914
Subject(s) - Geoscience, Signal Processing and Analysis, Power, Energy and Industry Applications
Remote Sensing Cross-modality Text-Image Retrieval (RSCTIR) aims to retrieve a specific object from a large image gallery given a natural language description, and vice versa. Existing methods mainly capture local and global context information within each modality for cross-modality matching. However, these methods are prone to interference from redundant information, such as background noise and irrelevant words, and neglect the co-occurrence semantic relations between modalities (i.e., the probability that a piece of semantic information co-occurs with other information). To filter out intra-modality redundant information and capture inter-modality co-occurrence relations, we propose a Cross-modality Consistency Network (CCNet) consisting of a Text-image Attention-conditioned Module (TAM) and a Co-occurrent Features Module (CFM). First, TAM relates visual and textual feature representations through a cross-modality attention mechanism that focuses on semantically similar fine-grained image features and generates aggregated visual representations. Second, CFM estimates co-occurrence probability by measuring fine-grained feature similarity, thereby reinforcing the relations of target-consistent features across modalities. In addition, we propose the Cross-modality Distinction (CD) loss function to learn semantic consistency between modalities by compacting intra-class samples and separating inter-class samples. Extensive experiments on three benchmarks demonstrate that our approach outperforms state-of-the-art methods.
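The two ideas named in the abstract, attention-conditioned aggregation of region features (in the spirit of TAM) and an objective that pulls matched text-image pairs together while pushing mismatched ones apart (in the spirit of the CD loss), can be illustrated with a minimal sketch. The code below is a hypothetical illustration, not the authors' implementation: the function names, tensor shapes, temperature, and margin are all assumptions.

```python
# Hypothetical sketch of cross-modality attention aggregation and a
# margin-based cross-modal loss; all names and hyperparameters are assumed.
import torch
import torch.nn.functional as F


def attend_text_to_image(text_feats, image_feats, temperature=0.1):
    """Aggregate fine-grained image features for each word.

    text_feats:  (B, T, D) word-level textual features
    image_feats: (B, R, D) region-level visual features
    Returns:     (B, T, D) attention-weighted visual features per word.
    """
    t = F.normalize(text_feats, dim=-1)
    v = F.normalize(image_feats, dim=-1)
    attn = torch.softmax(t @ v.transpose(1, 2) / temperature, dim=-1)  # (B, T, R)
    return attn @ image_feats


def distinction_loss(text_emb, image_emb, margin=0.2):
    """Bidirectional triplet-style loss: matched (diagonal) pairs are pulled
    together; the hardest mismatched pair in the batch is pushed away."""
    t = F.normalize(text_emb, dim=-1)
    v = F.normalize(image_emb, dim=-1)
    sim = t @ v.t()                      # (B, B) cosine similarities
    pos = sim.diag()                     # similarity of matched pairs
    mask = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    neg_t2i = sim.masked_fill(mask, -1).max(dim=1).values  # hardest image per text
    neg_i2t = sim.masked_fill(mask, -1).max(dim=0).values  # hardest text per image
    return (F.relu(margin + neg_t2i - pos) +
            F.relu(margin + neg_i2t - pos)).mean()


if __name__ == "__main__":
    B, T, R, D = 4, 12, 36, 512
    words, regions = torch.randn(B, T, D), torch.randn(B, R, D)
    agg = attend_text_to_image(words, regions)            # (B, T, D)
    loss = distinction_loss(words.mean(1), regions.mean(1))
    print(agg.shape, loss.item())
```

In this sketch, hard-negative mining within the batch is one common way to realize "compacting intra-class samples and separating inter-class samples"; the paper's actual CD loss may differ.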
