Insights Into Preferential Flow Snowpack Runoff Using Random Forest
Author(s) -
Avanzi Francesco,
Johnson Ryan Curtis,
Oroza Carlos A.,
Hirashima Hiroyuki,
Maurer Tessa,
Yamaguchi Satoru
Publication year - 2019
Publication title -
Water Resources Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.863
H-Index - 217
eISSN - 1944-7973
pISSN - 0043-1397
DOI - 10.1029/2019wr024828
Subject(s) - snowpack, snow, lysimeter, surface runoff, environmental science, snowmelt, atmospheric sciences, standard deviation, meteorology, climatology, soil science, geology, mathematics, geography, statistics, soil water, ecology, biology
Abstract -
Using 12 seasons of data from a multicompartment snow lysimeter and a statistical learning algorithm (Random Forest), we investigated to what extent preferential flow snowpack runoff can be predicted from concurrent weather and snow conditions, as well as the relative importance of the factors affecting this process. We found that preferential flow development can be partially predicted from concurrent weather and snow conditions. In this case study, where snow is generally wet and coarse, the most important predictors of the standard and maximum deviation from mean spatial snowpack runoff are related to weather inputs and their interaction with the snowpack (rainfall, longwave radiation, and snow‐surface temperature) and to more season‐specific snow properties (the number of macroscopic snow layers and the number of snowfall days to date, the latter being a feature we included to account for microstructural heterogeneity developing at smaller scales than macroscopic layers). This combination of weather and season‐specific snow factors, together with the fact that several of these important features are correlated with other processes, results in significant seasonal variability in the accuracy of the Random Forest algorithm. All versions of the Random Forest algorithm underestimated seasonal peaks in preferential flow, which suggests that these peaks are either undersampled in our data set or caused by poorly understood redistribution processes acting at larger spatial scales than the size of our multicompartment lysimeter (e.g., dimples).
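The abstract's target variables, the standard and maximum deviation from mean spatial snowpack runoff, can be illustrated with a minimal sketch. It assumes that each time step provides one runoff reading per lysimeter compartment, that "standard deviation" means the population standard deviation across compartments, and that "maximum deviation" means the largest absolute departure from the compartment mean; the paper's exact definitions may differ. The function name, the number of compartments, and the runoff values are hypothetical and are not data from the study.

```python
from statistics import mean, pstdev


def runoff_deviation(compartment_runoff):
    """Return (mean, standard deviation, maximum deviation) across compartments.

    Assumed definitions (not confirmed by the abstract):
    - standard deviation: population standard deviation across compartments
    - maximum deviation: largest absolute difference from the spatial mean
    """
    if not compartment_runoff:
        raise ValueError("need at least one compartment reading")
    avg = mean(compartment_runoff)
    std_dev = pstdev(compartment_runoff)
    max_dev = max(abs(q - avg) for q in compartment_runoff)
    return avg, std_dev, max_dev


# Hypothetical runoff readings (mm/h) for one time step; values are made up.
uniform = [2.0] * 9  # spatially uniform (matrix-like) flow
preferential = [0.5, 0.5, 0.5, 0.5, 10.0, 0.5, 0.5, 0.5, 4.0]  # concentrated flow

uniform_stats = runoff_deviation(uniform)
preferential_stats = runoff_deviation(preferential)
```

Under these assumptions, spatially uniform runoff gives zero for both deviation metrics, while runoff concentrated in a few compartments raises both; values of this kind would serve as the quantities a regression model such as Random Forest is trained to predict from weather and snow features.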