
Training Population Optimization for Genomic Selection
Author(s) - Berro Inés, Lado Bettina, Nalin Rafael S., Quincke Martin, Gutiérrez Lucía
Publication year - 2019
Publication title - The Plant Genome
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.403
H-Index - 41
ISSN - 1940-3372
DOI - 10.3835/plantgenome2019.04.0028
Subject(s) - biology , selection (genetic algorithm) , genomic selection , population , sampling (signal processing) , microbiology and biotechnology , computational biology , evolutionary biology , machine learning , computer science , genetics , genotype , demography , gene , single nucleotide polymorphism , computer vision , filter (signal processing) , sociology
Core Ideas
Training populations can be optimized for specific testing populations.
Optimized training populations are smaller, more related, and more predictive.
Stratified sampling with a relationship matrix weighted by marker effect is optimal.

The effectiveness of genomic selection in breeding programs depends on the phenotypic quality and depth, the prediction model, the number and type of molecular markers, and the size and composition of the training population (TR). Furthermore, population structure and diversity play a key role in the composition of optimal training sets. Our goal was to compare strategies for optimizing the TR for specific testing populations (TE). A total of 1353 wheat (Triticum aestivum L.) and 644 rice (Oryza sativa L.) advanced lines were evaluated for grain yield in multiple environments. Several within‐TR optimization strategies were compared to identify groups of individuals with increased predictive ability. Additionally, optimization strategies for choosing individuals from the TR with higher predictive ability for a specific TE were compared. There is a benefit in considering both the population structure and the relationship between the TR and the TE when designing an optimal TR for genomic selection. A weighted relationship matrix with stratified sampling is the best strategy for forward predictions of quantitative traits in populations several generations apart.
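The abstract does not spell out how the weighted relationship matrix or the stratified sampling were computed, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a relationship matrix of the form G = Z D Zᵀ / ΣD, where Z holds centred marker scores and D holds per-marker weights (for example, squared estimated marker effects), and it assumes strata are given as precomputed group labels. The function names, the proportional allocation rule, and the ranking by mean relationship to the TE are all hypothetical choices made for illustration.

```python
def weighted_relationship(markers, weights):
    """Marker-weighted relationship matrix G = Z D Z^T / sum(D) (assumed form).

    markers: one list of marker scores per individual (e.g. coded -1/0/1).
    weights: one non-negative weight per marker (e.g. squared effect estimates).
    """
    n_ind, n_mark = len(markers), len(weights)
    # Centre each marker column on its sample mean.
    means = [sum(row[j] for row in markers) / n_ind for j in range(n_mark)]
    z = [[row[j] - means[j] for j in range(n_mark)] for row in markers]
    total = sum(weights)
    return [[sum(w * a * b for w, a, b in zip(weights, zi, zk)) / total
             for zk in z] for zi in z]


def stratified_training_set(g, strata, tr_ids, te_ids, size):
    """Pick roughly `size` TR individuals for a given TE (illustrative rule).

    Each stratum gets a quota proportional to its share of the TR; within a
    stratum, individuals with the highest mean relationship to the TE are kept.
    Because quotas are rounded, the result can differ slightly from `size`.
    """
    by_stratum = {}
    for i in tr_ids:
        by_stratum.setdefault(strata[i], []).append(i)

    def mean_rel_to_te(i):
        return sum(g[i][t] for t in te_ids) / len(te_ids)

    chosen = []
    for members in by_stratum.values():
        quota = round(size * len(members) / len(tr_ids))
        ranked = sorted(members, key=mean_rel_to_te, reverse=True)
        chosen.extend(ranked[:quota])
    return sorted(chosen)


# Toy data: six lines, three markers; lines 4 and 5 form the TE.
markers = [[1, 1, 0], [-1, -1, 0], [1, 0, 1], [-1, 0, -1], [1, 1, 1], [1, 1, 1]]
strata = [0, 0, 1, 1, 0, 1]
g = weighted_relationship(markers, [1.0, 1.0, 1.0])
selected = stratified_training_set(g, strata, [0, 1, 2, 3], [4, 5], size=2)
```

In this toy case one line is drawn from each stratum, and in both strata the line that shares more marker alleles with the TE is the one retained. With real data the weights would come from a fitted prediction model and the strata from a population-structure analysis, both of which are outside this sketch.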