Constructing a long‐term monthly climate data set in central Asia
Author(s) - Zhou Hang, Aizen Elena, Aizen Vladimir
Publication year - 2017
Publication title - International Journal of Climatology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.58
H-Index - 166
eISSN - 1097-0088
pISSN - 0899-8418
DOI - 10.1002/joc.5259
Subject(s) - precipitation, data set, climatology, environmental science, mean squared error, meteorology, range (aeronautics), principal component analysis, mathematics, statistics, geography, geology, materials science, composite material
ABSTRACT We compiled and merged in situ observation data from several sources, creating a comprehensive unified monthly air temperature and precipitation data set with 457 stations in central Asia (CA). Stations with a valid data rate higher than 80% were selected, and the remaining gaps in the selected station time series were filled with an iterative‐principal component analysis (PCA) gap‐filling method. The result is a gap‐filled station data set for the period 1951–2010, with 369 and 381 stations for air temperature and precipitation, respectively. Cross‐validation shows that the iterative‐PCA gap‐filling algorithm provides stable and trustworthy estimates of the gaps, with a mean root mean squared error (RMSE) of 0.03 °C (range 0.01–0.13 °C) for air temperature and a mean RMSE of 0.60 mm (range 0.10–1.99 mm) for precipitation. A gridded data set was created by interpolating the gap‐filled station data set with the geographically weighted regression method. Comparison of the gridded data set with the National Centers for Environmental Prediction (NCEP) reanalysis data set shows that, although both data sets present the long‐term mean climate similarly, the gridded data set exhibits less annual and monthly variability. The gridded data set also correlates more strongly with station time series than the reanalysis data set does (mean correlations of 0.994 vs 0.975 for air temperature, and 0.787 vs 0.515 for precipitation), especially for precipitation at high mountain stations. The gridded data set is more suitable for climate and hydrological studies in CA, especially in high mountain regions.
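
The abstract names an iterative‐PCA gap‐filling method and an RMSE‐based cross‐validation but does not spell out either. The sketch below shows one common generic form of the idea, not necessarily the authors' algorithm: gaps start at the station mean, then are repeatedly replaced by a truncated‐SVD (PCA) reconstruction until the filled values stop changing. The function names, `n_components`, `max_iter`, `tol`, and the synthetic seasonal data are all assumptions made for illustration.

```python
import numpy as np


def iterative_pca_fill(data, n_components=2, max_iter=200, tol=1e-8):
    """Fill NaN gaps in a (months x stations) array.

    Generic iterative PCA imputation (an assumption, not the paper's code):
    gaps are re-estimated from a low-rank reconstruction on each pass.
    """
    x = np.asarray(data, dtype=float)
    missing = np.isnan(x)
    # First guess: each station's mean over its observed months.
    filled = np.where(missing, np.nanmean(x, axis=0), x)
    for _ in range(max_iter):
        mean = filled.mean(axis=0)
        u, s, vt = np.linalg.svd(filled - mean, full_matrices=False)
        # Reconstruct from the leading principal components only.
        recon = (u[:, :n_components] * s[:n_components]) @ vt[:n_components] + mean
        if not missing.any():
            break
        change = np.sqrt(np.mean((recon[missing] - filled[missing]) ** 2))
        filled[missing] = recon[missing]  # observed values are never altered
        if change < tol:
            break
    return filled


def rmse(estimate, truth):
    """Root mean squared error between two arrays of equal shape."""
    diff = np.asarray(estimate, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic check (hypothetical data): 120 months x 8 stations sharing a
# seasonal cycle, with known values hidden so the fill can be scored.
months = np.arange(120)
season = np.column_stack([np.sin(2 * np.pi * months / 12),
                          np.cos(2 * np.pi * months / 12)])
amplitudes = np.array([[10.0, 12.0, 8.0, 15.0, 9.0, 11.0, 13.0, 7.0],
                       [2.0, -1.0, 3.0, 0.5, -2.0, 1.5, 2.5, -0.5]])
offsets = np.array([5.0, 3.0, 8.0, -2.0, 0.0, 6.0, 1.0, 4.0])
truth = season @ amplitudes + offsets

missing = np.zeros(truth.shape, dtype=bool)
missing[5::10, 2] = True   # withhold every 10th month at station 2
missing[3::9, 5] = True    # withhold every 9th month at station 5

observed = truth.copy()
observed[missing] = np.nan
filled = iterative_pca_fill(observed, n_components=2)
cv_rmse = rmse(filled[missing], truth[missing])
```

Withholding known values and scoring the re‐estimated ones, as in the last lines, is the basic pattern behind a cross‐validated RMSE. On real station records the number of retained components and the stopping rule would need to be chosen from the data.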