Open Access
SFobNet: An Improved Swin Transformer Integrating Urban Functional Zones for Object-Level Building Height Estimation from Sentinel-2 Images
Author(s) - Wenxuan Bao, Yinyin Dou, Wenhui Kuang, Changqing Guo, Zhishou Wei, Yali Hou, Zherui Yin, Hongger
Publication year - 2025
Publication title - IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 1.246
H-Index - 88
eISSN - 2151-1535
pISSN - 1939-1404
DOI - 10.1109/jstars.2025.3619085
Subject(s) - geoscience , signal processing and analysis , power, energy and industry applications
Building height is a critical parameter in modeling urban spatial structures and an essential basis for interpreting urban form and evaluating spatial efficiency. Despite remarkable progress in building height estimation, most existing methods still rely on Convolutional Neural Networks (CNNs) and perform pixel-level estimation using physical features extracted from remote sensing imagery. These approaches often struggle to capture global structural patterns, fail to represent height heterogeneity at the individual building level, and overlook the intrinsic relationship between building functional types and height. To address these issues, this study proposes a novel deep learning method, the improved Swin Transformer integrated with urban Functional zones for Object-level Building height estimation Network (SFobNet). The proposed method uses an improved Swin Transformer to extract the local and global features of buildings and integrates urban functional zones to reduce systematic bias, thereby achieving a consistent representation of height information at the individual building level. In the experiments, SFobNet achieved superior validation accuracy in Beijing, with R² = 0.7155 and RMSE = 8.0889 m, reducing error by 9.4% compared with the state-of-the-art SEASONet and showing clear advantages over other baseline models. Cross-city evaluations on Tianjin and Shijiazhuang further confirmed its generalization performance, achieving R² = 0.5058 and RMSE = 11.1161 m while consistently outperforming SEASONet. Ablation experiments further verified the effectiveness of the proposed method in addressing the aforementioned challenges. In conclusion, SFobNet significantly enhances the precision and robustness of object-level building height estimation, offering a solid methodological foundation for future large-scale urban 3D morphological reconstruction.
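The accuracy figures above are reported as R² and RMSE, together with a relative error reduction against a baseline. As a minimal sketch of how such metrics are conventionally computed for per-building height predictions, the following uses the standard definitions; the function names and the sample heights are hypothetical and are not taken from the paper, and the abstract does not state which error measure the 9.4% figure refers to.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error, in the same unit as the heights (metres)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_reduction(baseline_error, new_error):
    """Fractional error reduction of a new model relative to a baseline."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical reference and predicted heights (metres) for four buildings.
reference = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

print(f"RMSE = {rmse(reference, predicted):.4f} m")
print(f"R2   = {r2_score(reference, predicted):.4f}")
```

In an object-level setting, each list entry would correspond to one building footprint rather than one pixel, so every building contributes equally to both metrics regardless of its footprint area.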
