Evaluation of calibration subsetting and new chemometric methods on the spectral prediction of key soil properties in a data‐limited environment
Author(s) - Clingensmith C. M., Grunwald S., Wani S. P.
Publication year - 2019
Publication title - European Journal of Soil Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.244
H-Index - 111
eISSN - 1365-2389
pISSN - 1351-0754
DOI - 10.1111/ejss.12753
Subject(s) - chemometrics, partial least squares regression, calibration, soil texture, soil test, environmental science, soil science, heteroscedasticity, computer science, mathematics, statistics, soil water, machine learning
Summary
Few studies have systematically investigated the effects of subsetting strategies on soil modelling or explored the potential of emergent methods from other fields not previously applied to pedometrics. This study considers smallholder agricultural villages in southern India that have been understudied in terms of chemometric modelling intended to support soil health, fertility and management. Therefore, the objective was to investigate the application of visible near‐infrared spectroscopy and chemometrics to predict soil properties in this setting. In addition, this study evaluated the effects of methods of calibration subsetting and new parametric models on the prediction of soil properties. These novel methods were transferred from the genomics field to soil science. Three strategic subsetting methods were used to produce calibration subsets that consider the variation in the soil properties, the spectra and both together; this is in addition to standard random calibration subsetting. Partial least squares regression (PLSR) and two methods from genomics that impose variable reduction were used for modelling; the latter were sparse PLSR (SPLSR) and the heteroscedastic effects model (HEM). Soil samples were collected from two villages and analysed for texture, soil carbon and available macro‐ and micro‐nutrients. The results showed that soil texture and carbon could be predicted moderately to strongly, whereas plant nutrient properties were predicted poorly to moderately. Random subsetting and subsetting by property distribution were more appropriate when spectra varied less overall, whereas subsetting that incorporates variation in spectra and properties improved results when spectral variation increased. The SPLSR and HEM models improved results over PLSR in some cases, or at least maintained prediction strength while using fewer predictors.
This study filled an important research gap by systematically studying local subsetting behaviour under different degrees of spectral and attribute variation.
Highlights
Explored new calibration subsetting methods and chemometric models in soil spectral modelling.
Compared the methods and models for 17 soil properties in an understudied area of India.
Random subsetting was not always optimal; subsetting matters and depends on data characteristics.
Sparse models from genomics performed better in 75% of cases than a standard method.