Open Access
RFTransUNet: Res-Feature cross Vision Transformer-Based UNet for Building Extraction from High Resolution Remote Sensing Images
Author(s) -
Xiufang Zhou,
Zhuotao Liu,
Xunqiang Gong,
Shunan Qin,
Tieding Lu,
Yuting Wan,
Ailong Ma,
Yanfei Zhong
Publication year - 2025
Publication title -
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.246
H-Index - 88
eISSN - 2151-1535
pISSN - 1939-1404
DOI - 10.1109/jstars.2025.3618110
Subject(s) - geoscience, signal processing and analysis, power, energy and industry applications
As the core carriers of human activities, buildings are not only fundamental components of urban spatial structures but also serve critical functions in global resource management, urban planning, disaster risk assessment, and the monitoring of sustainable development. Consequently, they are features of substantial value in the analysis and application of remote sensing imagery. To address the false and missed extractions and blurred building margins caused by UNet's insufficient use of features at different scales, an improved UNet, RFTransUNet, is proposed, built around a Feature Cross Transformer (FTrans) block that combines a residual network with the Vision Transformer. The network keeps the UNet layout but uses residual blocks as its backbone, applies the FTrans block in the skip connections to perform multi-scale feature fusion, and adopts a Feature Pyramid Network (FPN) for deep supervision during training. Specifically, the residual encoder and decoder retain semantic information while extracting fine image details, the FTrans block fuses shallow detail information with deep semantic information, and the FPN supplies reference labels to each layer of the network during training. Comparative experiments verifying the proposed method are conducted on two publicly available datasets and a self-built dataset. Compared with the other methods, the proposed method produces clearer and more accurate extraction results, with fewer false extractions and better boundary preservation. The Intersection over Union (IoU) on the public satellite and aerial imagery datasets and the self-built UAV imagery dataset reaches 71.7862%, 90.6190%, and 84.7210%, respectively. The code and dataset are available at https://github.com/RSIDEA-ECUT/RFTransUNet.
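For readers who want a concrete picture of the pipeline the abstract describes, the following is a minimal PyTorch sketch, not the authors' released implementation (that is linked above). The module names (ResBlock, FTransBlock, RFTransUNet), the channel widths, the attention head count, and the 32x32 token-grid cap inside the fusion block are all illustrative assumptions; the actual FTrans block and FPN supervision details are in the paper and repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResBlock(nn.Module):
    """Residual conv block used as the encoder/decoder backbone."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
        )
        self.skip = nn.Conv2d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):
        return F.relu(self.body(x) + self.skip(x))


class FTransBlock(nn.Module):
    """Stand-in for the FTrans block: cross-attention that lets shallow
    detail features (queries) attend to deep semantic features (keys/values)."""
    def __init__(self, skip_ch, deep_ch, heads=4, token_hw=32):
        super().__init__()
        self.token_hw = token_hw                        # assumed cap: full-res attention is O((HW)^2)
        self.proj = nn.Conv2d(deep_ch, skip_ch, 1)      # match channel widths
        self.norm_q = nn.LayerNorm(skip_ch)
        self.norm_kv = nn.LayerNorm(skip_ch)
        self.attn = nn.MultiheadAttention(skip_ch, heads, batch_first=True)

    def forward(self, shallow, deep):
        b, c, h, w = shallow.shape
        th, tw = min(h, self.token_hw), min(w, self.token_hw)
        q = F.adaptive_avg_pool2d(shallow, (th, tw)).flatten(2).transpose(1, 2)
        kv = F.interpolate(self.proj(deep), size=(th, tw), mode="bilinear", align_corners=False)
        kv = kv.flatten(2).transpose(1, 2)
        fused, _ = self.attn(self.norm_q(q), self.norm_kv(kv), kv)
        fused = F.interpolate(
            fused.transpose(1, 2).reshape(b, c, th, tw), size=(h, w),
            mode="bilinear", align_corners=False)
        return shallow + fused                          # residual add at full resolution


class RFTransUNet(nn.Module):
    def __init__(self, in_ch=3, n_classes=1, widths=(64, 128, 256, 512)):
        super().__init__()
        self.encs = nn.ModuleList()
        ch = in_ch
        for w in widths:
            self.encs.append(ResBlock(ch, w))
            ch = w
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = ResBlock(widths[-1], 2 * widths[-1])
        rev = tuple(reversed(widths))
        self.fuses = nn.ModuleList(FTransBlock(w, 2 * widths[-1]) for w in rev)
        self.ups = nn.ModuleList(
            nn.ConvTranspose2d(p, w, 2, stride=2)
            for p, w in zip((2 * widths[-1],) + rev[:-1], rev))
        self.decs = nn.ModuleList(ResBlock(2 * w, w) for w in rev)
        # FPN-style deep supervision: one 1x1 prediction head per decoder stage,
        # so each stage can be trained against (resized) reference labels.
        self.heads = nn.ModuleList(nn.Conv2d(w, n_classes, 1) for w in rev)

    def forward(self, x):
        skips = []
        for enc in self.encs:
            x = enc(x)
            skips.append(x)
            x = self.pool(x)
        x = self.bottleneck(x)
        deep, outs = x, []
        for up, fuse, dec, head, skip in zip(
                self.ups, self.fuses, self.decs, self.heads, reversed(skips)):
            x = up(x)                                   # upsample decoder features
            x = dec(torch.cat([x, fuse(skip, deep)], dim=1))
            outs.append(head(x))                        # side output for deep supervision
        return outs                                     # outs[-1] is the full-resolution logits


model = RFTransUNet()
preds = model(torch.randn(1, 3, 256, 256))
print([tuple(p.shape) for p in preds])
# [(1, 1, 32, 32), (1, 1, 64, 64), (1, 1, 128, 128), (1, 1, 256, 256)]
```

In this reading of the abstract, each side output in outs would be compared against reference labels resized to its scale, which is the deep-supervision role the FPN plays during training; at inference only the full-resolution map outs[-1] is needed.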

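The reported metric, Intersection over Union, is the standard region-overlap score. A minimal sketch for binary building masks, assuming pred and gt are same-shape 0/1 NumPy arrays:

```python
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over Union for binary masks (1 = building)."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / float(union) if union else 1.0  # both empty: treat as perfect match
```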