
Predictive ability of machine learning methods for massive crop yield prediction
Author(s) -
Alberto González-Sanchez,
Juan Frausto-Solís,
Waldo Ojeda-Bustamante
Publication year - 2014
Publication title -
Spanish Journal of Agricultural Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.337
H-Index - 36
eISSN - 2171-9292
pISSN - 1695-971X
DOI - 10.5424/sjar/2014122-4439
Subject(s) - mean squared error, regression, statistics, linear regression, support vector machine, perceptron, crop yield, yield (engineering), mathematics, regression analysis, machine learning, artificial neural network, multilayer perceptron, computer science, artificial intelligence, agronomy, materials science, metallurgy, biology
Accurate yield estimation for the numerous crops involved is an important issue in agricultural planning. Machine learning (ML) is an essential approach for achieving practical and effective solutions to this problem. Many comparisons of ML methods for yield prediction have been made in search of the most accurate technique. Generally, however, the number of evaluated crops and techniques is too low to provide enough information for agricultural planning purposes. This paper compares the predictive accuracy of ML and linear regression techniques for crop yield prediction on ten crop datasets. Multiple linear regression, M5-Prime regression trees, multilayer perceptron neural networks, support vector regression, and k-nearest neighbor methods were ranked. Four accuracy metrics were used to validate the models: the root mean square error (RMSE), root relative square error (RRSE), normalized mean absolute error (MAE), and correlation factor (R). Real data from an irrigation zone in Mexico were used to build the models, which were tested with samples from two consecutive years. The results show that the M5-Prime and k-nearest neighbor techniques obtain the lowest average RMSE (5.14 and 4.91), the lowest RRSE (79.46% and 79.78%), the lowest average MAE (18.12% and 19.42%), and the highest average correlation factors (0.41 and 0.42). Since M5-Prime achieves the largest number of crop yield models with the lowest errors, it is a very suitable tool for massive crop yield prediction in agricultural planning.
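
The four accuracy metrics named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so the definitions below are assumptions: RMSE and the Pearson correlation follow their usual definitions, RRSE is taken relative to a predictor that always outputs the mean observed yield, and the MAE is normalized by the mean observed yield, which may differ from the normalization used in the paper. The function name `accuracy_metrics` and the sample values are hypothetical.

```python
import math


def accuracy_metrics(observed, predicted):
    """Return RMSE, RRSE (%), normalized MAE (%) and Pearson R.

    Assumed definitions (not taken from the paper):
      RMSE = sqrt(mean((p - o)^2))
      RRSE = 100 * sqrt(sum((p - o)^2) / sum((o - mean(o))^2))
      MAE  = 100 * mean(|p - o|) / |mean(o)|
      R    = Pearson correlation between observed and predicted
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally sized samples with at least 2 values")

    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    sq_err = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    abs_err = sum(abs(p - o) for o, p in zip(observed, predicted))
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))

    # RRSE, normalized MAE and R are undefined for these degenerate inputs.
    if ss_obs == 0 or ss_pred == 0 or mean_obs == 0:
        raise ValueError("metrics undefined for constant or zero-mean data")

    return {
        "rmse": math.sqrt(sq_err / n),
        "rrse": 100.0 * math.sqrt(sq_err / ss_obs),
        "mae": 100.0 * (abs_err / n) / abs(mean_obs),
        "r": cov / math.sqrt(ss_obs * ss_pred),
    }


# Hypothetical yields (e.g. t/ha) for four test samples.
observed = [4.0, 5.0, 6.0, 7.0]
predicted = [4.5, 5.0, 5.5, 7.5]
for name, value in accuracy_metrics(observed, predicted).items():
    print(f"{name}: {value:.3f}")
```

Under these assumed definitions, an RRSE below 100% means the model beats the mean-yield baseline, which gives one way to read the RRSE values of roughly 79% reported in the abstract.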