Can statistical learning models make early selection among sugarcane families easier and still efficient?

Author(s) - Moreira Édimo Fernando Alves, Barbosa Marcio Henrique Pereira, Peternelli Luiz Alexandre

Publication year - 2020

Publication title - Crop Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.76

H-Index - 147

eISSN - 1435-0653

pISSN - 0011-183X

DOI - 10.1002/csc2.20334

Subject(s) - hectare, selection (genetic algorithm), support vector machine, artificial neural network, artificial intelligence, random forest, machine learning, saccharum, logistic regression, statistics, biology, yield (engineering), mathematics, computer science, agronomy, ecology, materials science, metallurgy, agriculture

The selection of genotypes at the early stages is one of the main challenges facing sugarcane (Saccharum officinarum L.) breeding programs. The present work aimed to compare classification techniques, namely, logistic regression (LR), k-nearest neighbor (KNN), random forests (RF), and support vector machine (SVM) against the selection among families of sugarcane via artificial neural networks (ANN) and via a procedure based on the weighing of the plots. The data used in this work were obtained from 110 families. In the families, the number of stalks (NS), stalk diameter (SD), and stalk height (SH) were collected, in addition to the actual yield, expressed in tons of cane per hectare (TCH). We considered the NS, SD, and SH as explanatory variables for the training of the classifiers. The response used was the indicator Y = 0 if the family is not selected via TCH or Y = 1 otherwise. To increase the efficiency in training, we produced synthetic data based on the simulation of NS, SD, SH, and TCH values. Two models were also considered: a full model with all the predictors and a reduced model without the SH. We used the apparent error rate (AER) and the true positive rate (TPR) for the evaluation of the classifiers. All classifiers present low values for the AER and high values for the TPR in both models. The best performance was observed in the SVM. The reduced model should be preferred, since its performance is very close to that of the full model and its operation is more straightforward.
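The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the common definitions: AER as the proportion of families the classifier labels incorrectly, and TPR as the share of truly selected families (Y = 1) that the classifier also selects. The abstract does not spell out the authors' exact formulas, so these definitions are an assumption. The function name aer_tpr and the ten labels are hypothetical and made up for illustration; they are not data from the study.

```python
def aer_tpr(y_true, y_pred):
    """Return (AER, TPR) for binary labels, where 1 means 'family selected'.

    Assumed definitions (not taken from the paper):
      AER = misclassified families / all families
      TPR = true positives / (true positives + false negatives)
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # selected and predicted selected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # selected but missed
    errors = sum(1 for t, p in pairs if t != p)         # any disagreement

    aer = errors / len(pairs)
    # TPR is undefined when no family is truly selected
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    return aer, tpr


# Made-up labels for ten hypothetical families (1 = selected via TCH)
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0]

aer, tpr = aer_tpr(y_true, y_pred)
print(f"AER = {aer:.2f}, TPR = {tpr:.2f}")  # prints "AER = 0.20, TPR = 0.75"
```

In this toy case two of the ten families are misclassified and three of the four truly selected families are recovered. A good classifier in the sense of the abstract would combine a low AER with a high TPR.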
