Soil Organic Carbon Predictions by Airborne Imaging Spectroscopy: Comparing Cross‐Validation and Validation
Author(s) -
Stevens Antoine,
Miralles Isabel,
Wesemael Bas
Publication year - 2012
Publication title -
Soil Science Society of America Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.836
H-Index - 168
eISSN - 1435-0661
pISSN - 0361-5995
DOI - 10.2136/sssaj2012.0054
Subject(s) - cross validation , mean squared error , calibration , extrapolation , environmental science , remote sensing , field (mathematics) , statistics , soil carbon , soil science , computer science , mathematics , soil water , geology , pure mathematics
Soil organic carbon (SOC) is considered to influence important processes affecting soil, air, and water quality. The management of this valuable resource could be assisted by remote sensing techniques able to provide high‐resolution spatial estimates of SOC. Such estimates are usually based on empirical regressions that are likely to extrapolate poorly, so it is important to estimate their accuracy in unsampled fields properly. Based on an imaging spectroscopy image acquired over Luxembourg (c. 420 km²), several multivariate calibration models (partial least squares [PLSR], penalized‐spline signal [PSR], and support vector machine [SVMR] regressions) were developed to predict the SOC content of the topsoil of bare agricultural fields and were compared. The performance of the models was evaluated by means of cross‐validation (k‐fold [KFO], leave‐one‐out [LOO], leave‐one‐group‐out [LOGO], and leave‐one‐field‐out [LOFO]), and these estimates were compared with the model performance obtained by validation. The validation set excluded the fields used in the training set, to provide realistic measures of prediction error in unsampled fields. All cross‐validation techniques except LOFO strongly underestimate the validation error. In large areas, training samples are often not a representative subset of the soil and spectral variation. Leave‐one‐field‐out cross‐validation, by repeatedly leaving the samples belonging to one field out of the calibration, simulates model error at unknown locations better than the other cross‐validation strategies. The root mean square error (RMSE) of the best models, obtained with a stringent validation procedure (leave‐fields‐out), was 4.7 g C kg⁻¹. This is higher than the errors reported in most previous studies using imaging spectroscopy for SOC prediction, suggesting that measures of accuracy obtained by KFO, LOO, and LOGO are likely over‐optimistic in large areas.
Finally, a SOC content map for the topsoil of croplands was produced that may assist soil monitoring and/or management efforts in this region in the future.
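The contrast between random k‐fold and leave‐one‐field‐out cross‐validation described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and none of it comes from the study: the data are synthetic, a one‐feature nearest‐neighbour predictor stands in for the PLSR, PSR, and SVMR models, and the function names and parameter values are hypothetical. The only point is that when samples from the same field resemble each other, random folds let the model see the test field during training, while LOFO does not.

```python
import math
import random


def make_synthetic_fields(n_fields=20, samples_per_field=8, seed=0):
    """Simulate samples clustered by field (all values are made up)."""
    rng = random.Random(seed)
    x, y, field_ids = [], [], []
    for field in range(n_fields):
        # Hypothetical field-level "spectral" feature and field-level SOC.
        field_signal = rng.uniform(0.0, 1.0)
        field_soc = 20.0 + 10.0 * field_signal + rng.gauss(0.0, 4.0)
        for _ in range(samples_per_field):
            # Samples within a field differ only slightly from the field mean.
            x.append(field_signal + rng.gauss(0.0, 0.002))
            y.append(field_soc + rng.gauss(0.0, 0.5))
            field_ids.append(field)
    return x, y, field_ids


def predict_1nn(train_x, train_y, query):
    """Stand-in model: return the target of the nearest training sample."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return train_y[nearest]


def k_fold_splits(n_samples, k=5, seed=0):
    """Random k-fold: one field can contribute to both train and test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    for fold in range(k):
        test = indices[fold::k]
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


def leave_one_field_out_splits(field_ids):
    """LOFO: every sample of one field is held out together."""
    for field in sorted(set(field_ids)):
        test = [i for i, f in enumerate(field_ids) if f == field]
        train = [i for i, f in enumerate(field_ids) if f != field]
        yield train, test


def cross_validated_rmse(x, y, splits):
    """Pool squared errors over all held-out samples and return the RMSE."""
    squared_errors = []
    for train, test in splits:
        train_x = [x[i] for i in train]
        train_y = [y[i] for i in train]
        for i in test:
            error = predict_1nn(train_x, train_y, x[i]) - y[i]
            squared_errors.append(error ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


x, y, field_ids = make_synthetic_fields()
rmse_kfo = cross_validated_rmse(x, y, k_fold_splits(len(x)))
rmse_lofo = cross_validated_rmse(x, y, leave_one_field_out_splits(field_ids))
print(f"k-fold RMSE: {rmse_kfo:.2f}")
print(f"leave-one-field-out RMSE: {rmse_lofo:.2f}")
```

With real data, scikit‐learn's `LeaveOneGroupOut` produces the same kind of split when field identifiers are passed as `groups`; the class name is unrelated to the LOGO strategy evaluated in the study.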