Weighted feature fusion network based on large kernel convolution and Transformer for multi-modal remote sensing image segmentation
Author(s) - Jianxia Wang, Shaozu Qiu, Jia Cai, Xiaoming Zhang
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3598116
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
The heterogeneity and complexity of multi-modal data in high-resolution remote sensing images pose a severe challenge to existing cross-modal networks, which aim to fuse the complementary information in high-resolution optical images and elevation data (digital surface model, DSM) to achieve accurate semantic segmentation. To address this problem, a weighted feature fusion network based on large kernel convolution and Transformer (LTFCNet) is proposed. The model uses two parallel encoders to extract features from the different modalities, an improved cross-fusion module to enhance the encoders' feature extraction capability, and a gate module based on large kernel convolution and Transformer to achieve multi-modal fusion. Finally, a Difference information Feature Fusion Module (DFFM), which applies attention to differential regions, is used to achieve cross-level feature fusion and improve small-object detection. To evaluate the network, we compare it with several state-of-the-art (SOTA) models on the Potsdam and Vaihingen datasets. The experimental results show that the proposed model outperforms the other SOTA models by approximately 2% in the mIoU metric, validating its effectiveness in multi-modal feature fusion.
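The abstract reports results in mIoU (mean intersection over union) without defining it. As a minimal sketch of how this metric is commonly computed for semantic segmentation, the following pure-Python example builds a confusion matrix from flattened label sequences and averages the per-class IoU. The function names `confusion_matrix` and `mean_iou`, and the choice to skip classes absent from both the ground truth and the prediction, are assumptions made for illustration. They are not taken from the paper, whose exact evaluation protocol may differ (for example, in which classes are excluded).

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int],
             num_classes: int) -> float:
    """Average IoU = TP / (TP + FP + FN) over classes that occur at all."""
    cm = confusion_matrix(y_true, y_pred, num_classes)
    ious = []
    for c in range(num_classes):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                                  # missed pixels of class c
        fp = sum(cm[r][c] for r in range(num_classes)) - tp   # wrongly labelled as c
        denom = tp + fp + fn
        if denom > 0:  # assumption: skip classes absent from both label sets
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy example: four pixels, two classes (labels flattened to 1-D).
    truth = [0, 0, 1, 1]
    pred = [0, 1, 1, 1]
    print(mean_iou(truth, pred, num_classes=2))
```

In the toy example, class 0 has an IoU of 1/2 and class 1 an IoU of 2/3, so the mean is 7/12.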