Diffuse reflectance infrared spectroscopy estimates for soil properties using multiple partitions: Effects of the range of contents, sample size, and algorithms
Author(s) -
Ludwig Bernard,
Greenberg Isabel,
Sawallisch Anja,
Vohland Michael
Publication year - 2021
Publication title -
Soil Science Society of America Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.836
H-Index - 168
eISSN - 1435-0661
pISSN - 0361-5995
DOI - 10.1002/saj2.20205
Subject(s) - mean squared error, diffuse reflectance infrared fourier transform, mathematics, soil water, coefficient of determination, partial least squares regression, calibration, linear regression, soil test, soil science, analytical chemistry (journal), statistics, mineralogy, chemistry, environmental science, environmental chemistry, biochemistry, photocatalysis, catalysis
The RMSE of validation (RMSE_V) and the ratio of the interquartile range to RMSE_V (RPIQ_V) are key quality parameters in diffuse reflectance infrared (IR) spectroscopy studies, but the effects of different factors on these parameters are often not sufficiently considered. Our objectives were to reveal the effects of range of contents, sample size, data pretreatment, wavenumber region selection, and algorithms on estimations from IR spectra in the wavenumber range from 1,000 to 7,000 cm⁻¹ (mid- and long-wave near IR). Contents of soil organic C (SOC), N, clay, and sand and pH values were determined for surface soils of an arable field in India, and IR spectra were recorded for four samples consisting of 71–263 soils. For each of the four samples, five random partitions into calibration and validation datasets were carried out, and partial least squares regression (PLSR) or support vector machine regression was performed. A plot of the RMSE_V values against the interquartile ranges of measured values for the validation samples (IQR_V) indicated that the IQR_V was a key parameter for all soil properties: a sufficiently high IQR_V (which is affected by sample size and random partitioning) resulted in generally good estimation accuracies (RPIQ_V ≥ 2.70). Optimized data pretreatment and wavenumber region selection improved estimation accuracy for SOC and pH. Support vector machine regression was superior to PLSR for the estimation of SOC, clay, and sand, but worse for pH. Overall, this study indicates that multiple partitioning of the data is essential in IR studies and suggests that RPIQ_V and RMSE_V need to be interpreted in the context of the respective IQR_V values.
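
The relationship between RMSE_V, IQR_V, and RPIQ_V across several random partitions can be illustrated with a minimal sketch. It is not the authors' workflow: the data are synthetic, a one-predictor least-squares line stands in for PLSR or support vector machine regression, and the validation fraction (0.25), the quartile method, and all function names are assumptions chosen for illustration. Only the definitions (RPIQ_V = IQR_V / RMSE_V), the use of five random partitions, and the 2.70 reference value come from the abstract.

```python
import math
import random
import statistics


def rmse(measured, predicted):
    """Root mean squared error between measured and predicted values."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def iqr(values):
    """Interquartile range (Q3 - Q1); the 'inclusive' quartile method is an assumption."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def rpiq(measured, predicted):
    """Ratio of the interquartile range of measured values to the RMSE."""
    return iqr(measured) / rmse(measured, predicted)


def random_partitions(n_samples, n_partitions=5, validation_fraction=0.25, seed=0):
    """Yield (calibration_indices, validation_indices) for each random partition.

    The validation fraction is a hypothetical choice, not taken from the study.
    """
    rng = random.Random(seed)
    n_val = max(2, round(n_samples * validation_fraction))
    for _ in range(n_partitions):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        yield indices[n_val:], indices[:n_val]


def fit_line(x, y):
    """Ordinary least-squares line; a simple stand-in for PLSR or SVM regression."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic data: one "spectral" predictor and one "soil property" (illustrative only).
rng = random.Random(42)
x = [rng.uniform(0.0, 1.0) for _ in range(71)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.2) for xi in x]

results = []
for cal, val in random_partitions(len(x), n_partitions=5):
    slope, intercept = fit_line([x[i] for i in cal], [y[i] for i in cal])
    measured = [y[i] for i in val]
    predicted = [intercept + slope * x[i] for i in val]
    results.append((iqr(measured), rmse(measured, predicted), rpiq(measured, predicted)))

# Report each partition so RMSE_V and RPIQ_V can be read against the respective IQR_V.
for k, (iqr_v, rmse_v, rpiq_v) in enumerate(results, start=1):
    flag = "meets" if rpiq_v >= 2.70 else "below"
    print(f"partition {k}: IQR_V={iqr_v:.3f} RMSE_V={rmse_v:.3f} RPIQ_V={rpiq_v:.2f} ({flag} 2.70)")
```

Reporting all three values per partition, rather than a single RPIQ_V, makes it visible when a low or high RPIQ_V stems from the spread of the validation subset rather than from the model error.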
