
A Light-weighted Fusion Vision Mamba for Multimodal Remote Sensing Data Classification
Author(s) -
Xin He,
Xiao Han,
Yushi Chen,
Lingbo Huang
Publication year - 2025
Publication title -
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 1.246
H-Index - 88
eISSN - 2151-1535
pISSN - 1939-1404
DOI - 10.1109/jstars.2025.3598755
Subject(s) - geoscience , signal processing and analysis , power, energy and industry applications
Recent studies have shown that Vision Mamba (VMamba) excels at long-sequence modeling and offers efficient visual representation learning. However, existing VMamba-based methods focus primarily on a single modality and are not readily adaptable to multimodal data. In this study, we investigate a light-weighted fusion VMamba for multimodal remote sensing data classification. First, to integrate information from different modalities, we propose a spatial and channel fusion VMamba. For spatial fusion, a two-branch state space model is constructed based on VMamba, in which the parameters of the branches interact to merge the spatial information from the different modalities. For channel fusion, a channel fusion VMamba is introduced that employs a specific eigenvalue computation in the frequency domain, based on the fast Fourier transform, for more effective feature fusion. Second, to reduce the computational cost of the fusion VMamba, we explore a light-weighted variant. Specifically, information from the different modalities is reconstructed by a skip sampling scanning scheme, which replaces the standard scanning scheme in VMamba and reduces the number of parameters. Extensive experiments on three public multimodal remote sensing datasets demonstrate that the proposed light-weighted fusion VMamba surpasses state-of-the-art methods in both classification accuracy and computational cost. For instance, it achieves a 20% reduction in FLOPs compared with the standard VMamba on the Houston dataset.
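The abstract does not specify how the skip sampling scanning scheme is implemented. As a rough illustration of the general idea only, the sketch below assumes a strided variant: an H x W feature grid is split into stride x stride interleaved sub-grids, each sub-grid is flattened row by row into a shorter sequence, and the sequences are later merged back into the grid. The function names `skip_sample_scan` and `skip_sample_merge` and the `stride` parameter are hypothetical and are not taken from the paper.

```python
from typing import List

Grid = List[List[float]]


def skip_sample_scan(grid: Grid, stride: int = 2) -> List[List[float]]:
    """Split an H x W grid into stride*stride interleaved sub-grids and
    flatten each sub-grid row by row into a 1-D sequence.

    Assumption: this strided layout is an illustrative guess, not the
    scanning scheme defined in the paper.
    """
    height, width = len(grid), len(grid[0])
    if height % stride or width % stride:
        raise ValueError("height and width must be divisible by stride")
    sequences = []
    for row_offset in range(stride):
        for col_offset in range(stride):
            # Visit every stride-th row and column, starting at the offsets.
            sequence = [
                grid[r][c]
                for r in range(row_offset, height, stride)
                for c in range(col_offset, width, stride)
            ]
            sequences.append(sequence)
    return sequences


def skip_sample_merge(
    sequences: List[List[float]], height: int, width: int, stride: int = 2
) -> Grid:
    """Inverse of skip_sample_scan: write each sequence back to its
    interleaved positions in an H x W grid."""
    grid = [[0.0] * width for _ in range(height)]
    sub_width = width // stride
    for index, sequence in enumerate(sequences):
        row_offset, col_offset = divmod(index, stride)
        for position, value in enumerate(sequence):
            sub_row, sub_col = divmod(position, sub_width)
            grid[row_offset + sub_row * stride][col_offset + sub_col * stride] = value
    return grid
```

Under this assumption, a stride of 2 turns one scan over H*W positions into four scans over H*W/4 positions each, and merging the four sequences recovers the original grid exactly. In a real model, each sequence would pass through a state space layer between the scan and the merge; that step is omitted here.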