Open Access
Nearest Neighbour vs. Regression approach: Effect of performance measures, calibration set size, and sampling method on Soil Organic Carbon Prediction using VNIR lab spectroscopy
Author(s) -
Chirag Rajendra Ternikar,
Cecile Gomez,
Debsunder Dutta,
D. Nagesh Kumar
Publication year - 2025
Publication title -
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 1.246
H-Index - 88
eISSN - 2151-1535
pISSN - 1939-1404
DOI - 10.1109/jstars.2025.3615516
Subject(s) - geoscience, signal processing and analysis, power, energy and industry applications
Soil Organic Carbon (SOC) plays a critical role in soil health, agricultural productivity, and ecosystem functioning, making accurate SOC estimation essential for sustainable land management and climate change mitigation. Visible and Near-Infrared (VNIR) spectroscopy has emerged as a promising, non-destructive, and cost-effective method for SOC estimation. This study evaluates the performance of nine Nearest Neighbour (NN) models and the Partial Least Squares Regression (PLSR) model in estimating SOC from the global Open Soil Spectral Library (OSSL) data. Detailed error analyses and the use of Mean Absolute Error (MAE) as a performance metric revealed differences in model performance that traditional metrics such as R², RMSE, and RPD alone fail to capture. Error correlation analysis further indicated that o_plsd (optimised partial least squares distance, one of the NN models) and PLSR provide structurally independent insights, while certain pairs of NN models (pcad – plsd and o_plsd – o_pcad) yield redundant information. Among the ten models tested, the o_plsd model outperformed PLSR by leveraging local data density, exhibiting a lower MAE (1.79% vs. 2.36%), but it was more sensitive to reductions in calibration set size. In contrast, PLSR demonstrated better generalizability, with less sensitivity to variation in calibration set size but relatively higher sensitivity to the choice of sampling method. Future research should focus on strategies to improve the computational efficiency of NN models. The findings highlight the importance of performance metric selection and calibration strategy in large-scale SOC modeling. These results have practical implications for improving SOC prediction models and designing efficient hybrid approaches for large, heterogeneous soil datasets.
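The performance measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `regression_metrics` and `error_correlation` are hypothetical, RPD is assumed to follow the common convention of the standard deviation of the observed values divided by RMSE, and the error correlation is assumed to be a Pearson correlation between the residuals of two models. The abstract does not state the exact definitions used in the paper.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, R2 and RPD for one set of predictions.

    Assumption: RPD = sample standard deviation of the observations / RMSE.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    mae = np.mean(np.abs(residuals))          # weights every error linearly
    rmse = np.sqrt(np.mean(residuals ** 2))   # penalises large errors more
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rpd = np.std(y_true, ddof=1) / rmse

    return {"MAE": mae, "RMSE": rmse, "R2": r2, "RPD": rpd}


def error_correlation(y_true, pred_a, pred_b):
    """Pearson correlation between the residuals of two models (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    err_a = y_true - np.asarray(pred_a, dtype=float)
    err_b = y_true - np.asarray(pred_b, dtype=float)
    return np.corrcoef(err_a, err_b)[0, 1]
```

Because RMSE squares the residuals, a few large errors can dominate it, whereas MAE weights all errors equally; this is one way two models can rank differently under the two metrics. A residual correlation near 1 would suggest that two models make largely the same mistakes, while a value near 0 would suggest they err on different samples.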
