
A Machine Learning Technique for Spatial Interpolation of Solar Radiation Observations
Author(s) - Leirvik Thomas, Yuan Menghan
Publication year - 2021
Publication title - Earth and Space Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.843
H-Index - 23
ISSN - 2333-5084
DOI - 10.1029/2020ea001527
Subject(s) - standard deviation, interpolation (computer graphics), multivariate interpolation, random forest, data set, variable (mathematics), set (abstract data type), standard error, mathematics, statistics, computer science, environmental science, remote sensing, artificial intelligence, geology, mathematical analysis, motion (physics), bilinear interpolation, programming language
This study applies statistical methods to interpolate missing values in a data set of radiative energy fluxes at the Earth's surface. We apply Random Forest (RF) and seven conventional spatial interpolation models to a global Surface Solar Radiation (SSR) data set. We use three categories of predictors: climatic, spatial, and time series variables. Although the first category is the most common in research, our study shows that the last two categories are in fact better suited to predicting the response: the best spatial variable is almost 40 times more important than the best climatic variable in predicting SSR. Furthermore, 10‐fold cross validation shows that RF has a Mean Absolute Error (MAE) of 10.2 W m⁻² with a standard deviation of 1.5 W m⁻². By contrast, the conventional interpolation methods have an average MAE of 21.3 W m⁻², more than twice that of RF, and an average standard deviation of 6.4 W m⁻², more than four times that of RF. This highlights the benefits of using machine learning in environmental research.
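The evaluation described in the abstract (10‐fold cross validation, reporting the mean and the standard deviation of the MAE across folds) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' code or data: the interpolator is inverse distance weighting (a common conventional method, though the abstract does not name the seven it compares), the station values are synthetic, distances are plain Euclidean distances in degrees, and the function names `idw_predict`, `kfold_mae`, and `make_synthetic` are hypothetical.

```python
import math
import random
import statistics


def idw_predict(train, lat, lon, power=2.0):
    """Inverse distance weighting: distance-weighted mean of training values.

    Distances are Euclidean in degrees (a simplification for this sketch).
    """
    num = den = 0.0
    for t_lat, t_lon, t_val in train:
        d = math.hypot(lat - t_lat, lon - t_lon)
        if d == 0.0:
            return t_val  # query coincides with a training point
        w = 1.0 / d ** power
        num += w * t_val
        den += w
    return num / den


def kfold_mae(points, k=10, seed=0):
    """Return (mean, sample standard deviation) of the per-fold MAE."""
    shuffled = points[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # k roughly equal folds
    fold_mae = []
    for i, test in enumerate(folds):
        # Train on every fold except the held-out one.
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        errors = [abs(idw_predict(train, lat, lon) - val)
                  for lat, lon, val in test]
        fold_mae.append(sum(errors) / len(errors))
    return statistics.mean(fold_mae), statistics.stdev(fold_mae)


def make_synthetic(n=200, seed=1):
    """Hypothetical stations: value depends on latitude plus Gaussian noise."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        lat = rng.uniform(-60.0, 60.0)
        lon = rng.uniform(-180.0, 180.0)
        value = 250.0 * math.cos(math.radians(lat)) + rng.gauss(0.0, 5.0)
        points.append((lat, lon, value))
    return points


if __name__ == "__main__":
    mean_mae, sd_mae = kfold_mae(make_synthetic())
    print(f"MAE mean = {mean_mae:.1f}, sd = {sd_mae:.1f} (synthetic units)")
```

Under this scheme, a lower mean MAE indicates better average accuracy and a lower standard deviation indicates more stable performance across folds, which are the two quantities the abstract compares between RF and the conventional methods. The numbers this sketch prints come from synthetic data and are not comparable to the reported values.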