Machine‐Learning Variables at Different Scales vs. Knowledge‐based Variables for Mapping Multiple Soil Properties
Author(s) -
Shi Jingjing,
Yang Lin,
Zhu A-Xing,
Qin Chengzhi,
Liang Peng,
Zeng Canying,
Pei Tao
Publication year - 2018
Publication title -
Soil Science Society of America Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.836
H-Index - 168
eISSN - 1435-0661
pISSN - 0361-5995
DOI - 10.2136/sssaj2017.11.0392
Subject(s) - scale (ratio), silt, topsoil, soil map, digital soil mapping, mathematics, variables, soil science, environmental science, statistics, soil water, geography, cartography, geology, paleontology
Core Ideas -
Explored the impact of variables selected by different means on mapping at a watershed scale area.
Identified influential topographic variables with appropriate scales for mapping soil properties.
Finding the appropriate scales of local attributes is helpful for mapping soil properties accurately.
Using local attributes at appropriate scales improved map accuracy more than using local and regional attributes at a single scale.

The influential environmental variables and their appropriate scales for mapping different soil properties are usually different. Comparisons between variables selected using machine learning and knowledge‐based approaches, and of their impacts on soil mapping, are necessary to provide guidelines on selecting environmental variables. This study compared topographic variables with single and multiple scales selected for five soil properties (topsoil clay content, sand content, silt content, topsoil organic matter content (SOM), and soil depth) using a recursive feature elimination (RFE) algorithm with variables selected based on expert knowledge of mapping. One hundred and seventy‐three variables were generated including five single‐scale variables derived with a 3 × 3 neighborhood size and seven multi‐scale variables with various neighborhood sizes. Subsets 1 and 2 were selected from single‐scale variables (Pool 1) and all variables (Pool 2), respectively. A reference Subset 3 was selected based on soil pedogenesis knowledge. Results showed that variables in Subset 1 included both local and regional terrain attributes, but were considerably different from the reference variables. When scales were considered, local variables at specific scales became the most important variables. Mapping accuracies using Subset 2 were the highest for all soil properties. This indicates that finding the appropriate scales of local attributes, and using common local attributes at those scales, is more helpful for mapping soil properties accurately than using both local and regional attributes at a single scale. The RFE approach is efficient for selection of variables. The pool of input environmental variables offered for selection also determines whether the selected variables perform better than all input variables without selection for predicting soil properties.
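The elimination loop behind RFE can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which model ranks the variables, so ordinary least squares on standardized variables is assumed here for simplicity, and the variable names, neighborhood sizes, and synthetic data are all hypothetical. At each step the model is refit on the remaining variables and the one with the smallest absolute coefficient is dropped.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def standardize(column):
    """Rescale a variable to zero mean and unit variance."""
    mean = sum(column) / len(column)
    sd = (sum((v - mean) ** 2 for v in column) / len(column)) ** 0.5
    return [(v - mean) / sd for v in column]


def ols_coefficients(columns, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j]))
            for j in range(k)] for i in range(k)]
    xty = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def recursive_feature_elimination(features, y, n_keep):
    """Drop the weakest variable one at a time until n_keep remain.

    features: dict mapping variable name -> list of values.
    Returns (selected names, names in the order they were eliminated).
    """
    y_mean = sum(y) / len(y)
    y_centered = [v - y_mean for v in y]
    cols = {name: standardize(vals) for name, vals in features.items()}
    remaining = list(cols)
    eliminated = []
    while len(remaining) > n_keep:
        coefs = ols_coefficients([cols[n] for n in remaining], y_centered)
        weakest = min(range(len(remaining)), key=lambda i: abs(coefs[i]))
        eliminated.append(remaining.pop(weakest))
    return remaining, eliminated


# Synthetic example: hypothetical terrain variables at hypothetical scales.
rng = random.Random(0)
n_samples = 200
names = ["slope_3x3", "slope_15x15", "curvature_3x3", "curvature_31x31", "twi"]
features = {name: [rng.gauss(0, 1) for _ in range(n_samples)] for name in names}

# The synthetic "soil property" depends only on two coarser-scale variables.
y = [3 * s - 2 * c + rng.gauss(0, 0.1)
     for s, c in zip(features["slope_15x15"], features["curvature_31x31"])]

selected, eliminated = recursive_feature_elimination(features, y, n_keep=2)
print("selected:", selected)
print("eliminated (weakest first):", eliminated)
```

In this toy setting the candidate pool plays the role of Pool 1 or Pool 2: restricting `features` to the 3 × 3 variables would make it impossible to select the scales that actually drive the response, which is the kind of pool dependence the abstract describes. Real terrain variables computed at neighboring scales are strongly correlated, so a practical ranking model would need to cope with that; this sketch does not.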