Modeling Depth of the Redox Interface at High Resolution at National Scale Using Random Forest and Residual Gaussian Simulation
Author(s) -
Koch Julian,
Stisen Simon,
Refsgaard Jens C.,
Ernstsen Vibeke,
Jakobsen Peter R.,
Højberg Anker L.
Publication year - 2019
Publication title -
Water Resources Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.863
H-Index - 217
eISSN - 1944-7973
pISSN - 0043-1397
DOI - 10.1029/2018wr023939
Subject(s) - kriging , random forest , residual , variance (accounting) , gaussian , gaussian process , computer science , environmental science , spatial correlation , statistics , data mining , mathematics , algorithm , artificial intelligence , machine learning , physics , business , accounting , quantum mechanics
Abstract The management of water resources needs robust methods to efficiently reduce nitrate loads. Knowledge of where natural denitrification takes place in the subsurface is therefore essential. Nitrate is naturally reduced in anoxic environments, and high‐resolution information on the redox interface, that is, the depth of the uppermost reduced zone, is crucial for understanding the variability of the denitrification potential. In this study we explore the opportunity to use random forest (RF) regression to model redox depth across Denmark at 100‐m resolution based on ~13,000 boreholes as training data. We highlight the importance of expert knowledge, added in the form of artificial observations, to guide the RF model in areas where our conceptual understanding is not represented correctly in the training data set. We apply random forest regression kriging, in which sequential Gaussian simulation models the RF residuals. The RF model reaches an R² score of 0.48 in an independent validation test. Including sequential Gaussian simulation honors observations through local conditioning, and the spread of 800 realizations can be used to map uncertainty. Emphasis is put on adequate handling of nonstationarities in the variance and spatial correlation of the RF residuals. The RF residuals show no spatial correlation for large parts of the modeling domain, and a local variance scaling method is applied to account for the nonstationary variance. Moreover, we present and exemplify a framework in which newly acquired field data can easily be integrated into random forest regression kriging to quickly update local models.
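The regression kriging idea summarized in the abstract (a trend prediction plus simulated residuals that are conditioned on observations) can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the trend values stand in for RF predictions, and the exponential covariance model, the correlation length, the borehole values, the `local_sd` scaling field, and all function names are assumptions made up for illustration. The abstract does not describe how the local variance scaling is done, so the sketch simply standardizes residuals by an assumed local standard deviation before simulation and rescales afterwards.

```python
import math
import random
import statistics

def cov(h, sill=1.0, corr_len=3.0):
    """Exponential covariance model (assumed for illustration)."""
    return sill * math.exp(-abs(h) / corr_len)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def sgs_residuals(n_cells, obs, rng, k=4):
    """One sequential Gaussian simulation of zero-mean residuals on a 1-D grid.

    obs maps cell index -> standardized residual at that cell. Cells are
    visited in random order; each is drawn from the simple-kriging
    distribution given the k nearest known or already simulated cells.
    """
    known = dict(obs)
    path = [i for i in range(n_cells) if i not in known]
    rng.shuffle(path)
    for i in path:
        near = sorted(known, key=lambda j: abs(j - i))[:k]
        if near:
            a = [[cov(p - q) for q in near] for p in near]
            b = [cov(i - p) for p in near]
            w = solve(a, b)
            mean = sum(wi * known[p] for wi, p in zip(w, near))
            var = max(cov(0) - sum(wi * bi for wi, bi in zip(w, b)), 0.0)
        else:
            mean, var = 0.0, cov(0)
        known[i] = rng.gauss(mean, math.sqrt(var))
    return [known[i] for i in range(n_cells)]

# Hypothetical inputs: a trend standing in for RF predictions (m), a local
# residual standard deviation per cell, and three borehole observations.
n = 20
trend = [5.0 + 0.2 * i for i in range(n)]
local_sd = [1.0 if i < 10 else 2.0 for i in range(n)]
boreholes = {3: 6.5, 11: 6.0, 16: 9.0}  # cell index -> observed depth (m)

# Standardize observed residuals by the local standard deviation.
resid_obs = {i: (z - trend[i]) / local_sd[i] for i, z in boreholes.items()}

rng = random.Random(42)
realizations = []
for _ in range(200):
    sim = sgs_residuals(n, resid_obs, rng)
    # Rescale residuals and add them back to the trend.
    realizations.append([t + s * r for t, s, r in zip(trend, local_sd, sim)])

# Per-cell spread across realizations as a simple uncertainty measure.
spread = [statistics.pstdev(col) for col in zip(*realizations)]
```

In this sketch every realization reproduces the borehole values at the conditioned cells, so the spread there is zero up to rounding and it grows with distance from the nearest observation. A real application would work on a 2-D grid, fit the covariance model to the actual residuals, and use a fitted RF model for the trend.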