Genomic Selection using Multiple Populations
Author(s) - Schulz‐Streeck T., Ogutu J. O., Karaman Z., Knaak C., Piepho H. P.
Publication year - 2012
Publication title - Crop Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.76
H-Index - 147
eISSN - 1435-0653
pISSN - 0011-183X
DOI - 10.2135/cropsci2012.03.0160
Subject(s) - biology, best linear unbiased prediction, lasso (programming language), elastic net regularization, population, selection (genetic algorithm), regression, linear regression, ancestry informative marker, statistics, predictive modelling, genetics, genotype, allele frequency, machine learning, mathematics, computer science, demography, sociology, world wide web, gene
Using different populations in genomic selection raises the possibility that marker effects vary across populations. However, common models for genomic selection account only for the main marker effects, assuming that they are consistent across populations. We present an approach in which the main and population‐specific marker effects are estimated simultaneously in a single mixed model. Cross‐validation is used to compare the predictive ability of this model with that of the ridge regression best linear unbiased prediction (RR‐BLUP) method involving either only the main marker effects or only the population‐specific marker effects. We used a maize (Zea mays L.) data set with 312 genotypes derived from five biparental populations, which were genotyped with 39,339 markers. A combined analysis incorporating genotypes from all the populations, and hence using a larger training set, was better than separate analyses for each population. Modeling the main plus the population‐specific marker effects simultaneously improved predictive ability only slightly compared with modeling only the main marker effects. The RR‐BLUP method performed comparably to two regularization methods, namely ridge regression and the elastic net, and was more accurate than the least absolute shrinkage and selection operator (LASSO). Overall, combining information from related populations and increasing the number of genotypes improved predictive ability, but further allowing for population‐specific marker effects yielded only a minor improvement.
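
The idea of estimating main and population‐specific marker effects together can be illustrated with a small ridge‐type sketch. This is not the authors' implementation: the abstract describes a mixed model, whose variance components would normally be estimated from the data, whereas the sketch below assumes two fixed, user‐supplied penalties (`lam_main`, `lam_pop`) and takes the intercept to be the training mean. All function and parameter names are hypothetical. The marker matrix is augmented with one copy per population, zeroed outside that population, so each genotype's prediction is the sum of a shared effect and an effect specific to its own population.

```python
import numpy as np

def augment_by_population(Z, pop, n_pops):
    """Return [Z | Z*1(pop==0) | ... | Z*1(pop==n_pops-1)].

    Z   : (n, m) marker matrix (e.g. coded -1/0/1)
    pop : (n,) integer population label per genotype, in 0..n_pops-1
    """
    blocks = [Z]
    for k in range(n_pops):
        mask = (pop == k).astype(Z.dtype)[:, None]  # 1 for rows in population k
        blocks.append(Z * mask)
    return np.hstack(blocks)

def fit_ridge_main_plus_specific(Z, pop, y, n_pops, lam_main, lam_pop):
    """Ridge fit with separate penalties for main and population-specific effects.

    Assumption: lam_main and lam_pop are fixed by the user; a mixed model
    would instead derive them from estimated variance components.
    """
    n, m = Z.shape
    X = augment_by_population(Z, pop, n_pops)
    penalties = np.concatenate([np.full(m, lam_main),
                                np.full(m * n_pops, lam_pop)])
    mu = y.mean()  # simplification: intercept taken as the training mean
    # With far more markers than genotypes, solve the n x n dual system:
    # beta = D^-1 X' (X D^-1 X' + I)^-1 (y - mu), where D = diag(penalties)
    XD = X / penalties                 # X D^-1, penalties broadcast over columns
    K = XD @ X.T + np.eye(n)
    alpha = np.linalg.solve(K, y - mu)
    beta = XD.T @ alpha
    return mu, beta

def predict(Z_new, pop_new, n_pops, mu, beta):
    """Predict genotypic values for new genotypes with known population labels."""
    return mu + augment_by_population(Z_new, pop_new, n_pops) @ beta
```

Setting `lam_pop` very large shrinks the population‐specific effects toward zero, which approaches the main‐effects‐only model; in a cross‐validation comparison like the one described above, the two variants would be fitted on the same training folds and compared on held‐out genotypes.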