
Multitrait machine‐ and deep‐learning models for genomic selection using spectral information in a wheat breeding program
Author(s) -
Sandhu Karansher,
Patil Shruti Sunil,
Pumphrey Michael,
Carter Arron
Publication year - 2021
Publication title -
The Plant Genome
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.403
H-Index - 41
ISSN - 1940-3372
DOI - 10.1002/tpg2.20119
Subject(s) - artificial intelligence, machine learning, breeding program, multilayer perceptron, deep learning, random forest, plant breeding, selection (genetic algorithm), biology, genetic gain, bayesian probability, model selection, computer science, artificial neural network, agronomy, genetic variation, genetics, gene, cultivar
Prediction of breeding values is central to plant breeding and has been revolutionized by the adoption of genomic selection (GS). Applying machine‐ and deep‐learning algorithms to complex traits in plants can improve prediction accuracies. Because breeding programs now collect far more data while the rate of genetic gain increases only slowly, the potential of artificial intelligence for analyzing these data needs to be explored. The main objectives of this study included optimizing multitrait (MT) machine‐ and deep‐learning models for predicting grain yield and grain protein content in wheat (Triticum aestivum L.) using spectral information. The study compared the performance of four machine‐ and deep‐learning‐based unitrait (UT) and MT models with that of traditional genomic best linear unbiased predictor (GBLUP) and Bayesian models. The dataset consisted of 650 recombinant inbred lines (RILs) from a spring wheat breeding program grown for three years (2014–2016), with spectral data collected at the heading and grain‐filling stages. The MT‐GS models performed 0–28.5% and −0.04 to 15% better than the UT‐GS models. Random forest and multilayer perceptron were the best‐performing machine‐ and deep‐learning models for predicting both traits. The four Bayesian models explored gave similar accuracies to one another; these accuracies were lower than those of the machine‐ and deep‐learning‐based models, and the Bayesian models required more computational time. Green normalized difference vegetation index (GNDVI) best predicted grain protein content in seven out of the nine MT‐GS models. Overall, this study concluded that machine‐ and deep‐learning‐based MT‐GS models increased prediction accuracy and should be employed in large‐scale breeding programs.
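
The unitrait versus multitrait comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes scikit-learn's RandomForestRegressor, synthetic marker and reflectance data invented for illustration, GNDVI computed with its standard definition (NIR − green) / (NIR + green), and GNDVI treated as a secondary trait predicted jointly with grain protein content. All variable names and simulated effect sizes are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split


def gndvi(nir, green):
    """Green normalized difference vegetation index: (NIR - green) / (NIR + green)."""
    nir = np.asarray(nir, dtype=float)
    green = np.asarray(green, dtype=float)
    return (nir - green) / (nir + green)


def accuracy(observed, predicted):
    """Prediction accuracy as the Pearson correlation of observed and predicted values."""
    return float(np.corrcoef(observed, predicted)[0, 1])


# Synthetic stand-ins only; these are NOT the study's data.
rng = np.random.default_rng(0)
n_lines, n_markers = 200, 50
markers = rng.integers(0, 2, size=(n_lines, n_markers)).astype(float)  # 0/1 marker codes
genetic = markers @ rng.normal(size=n_markers)  # simulated additive genetic value

# Simulated reflectance loosely tied to the genetic value (assumption for illustration).
nir = 0.5 + 0.01 * genetic + rng.normal(scale=0.02, size=n_lines)
green = rng.uniform(0.08, 0.12, size=n_lines)
index = gndvi(nir, green)

# Simulated grain protein content, correlated with the same genetic value.
protein = 12.0 + 0.3 * genetic + rng.normal(scale=1.0, size=n_lines)

# Column 0 is the target trait, column 1 the secondary (spectral) trait.
Y = np.column_stack([protein, index])
X_tr, X_te, Y_tr, Y_te = train_test_split(markers, Y, test_size=0.2, random_state=0)

# Unitrait model: markers -> protein only.
ut = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, Y_tr[:, 0])
ut_pred = ut.predict(X_te)

# Multitrait model: one forest fit jointly on protein and GNDVI (multi-output regression).
mt = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, Y_tr)
mt_pred = mt.predict(X_te)

ut_acc = accuracy(Y_te[:, 0], ut_pred)
mt_acc = accuracy(Y_te[:, 0], mt_pred[:, 0])
print(f"UT accuracy for protein: {ut_acc:.3f}")
print(f"MT accuracy for protein: {mt_acc:.3f}")
```

On simulated data like this, the relative ranking of the two models carries no biological meaning. The sketch only shows the shape of the comparison: the same predictors, one response versus several, and accuracy measured on held-out lines.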