Prediction of breed composition in an admixed cattle population
Author(s) - Frkonja A., Gredler B., Schnyder U., Curik I., Sölkner J.
Publication year - 2012
Publication title - Animal Genetics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.756
H-Index - 81
eISSN - 1365-2052
pISSN - 0268-9146
DOI - 10.1111/j.1365-2052.2012.02345.x
Subject(s) - biology , lasso (programming language) , snp , single nucleotide polymorphism , breed , partial least squares regression , population , statistics , selection (genetic algorithm) , genetics , regression , mathematics , artificial intelligence , computer science , genotype , gene , demography , sociology , world wide web
Summary Swiss Fleckvieh was established in 1970 as a composite of Simmental (SI) and Red Holstein Friesian (RHF) cattle. Breed composition is currently reported based on pedigree information. Information on a large number of molecular markers potentially provides more accurate estimates. For the analysis, we used Illumina Bovine SNP50 Genotyping Beadchip data for 90 pure SI, 100 pure RHF and 305 admixed bulls. The scope of the study was to compare the performance of hidden Markov models, as implemented in structure software, with methods conventionally used in genomic selection [BayesB, partial least squares regression (PLSR), least absolute shrinkage and selection operator (LASSO) variable selection] for predicting breed composition. We checked the performance of the algorithms for a set of 40 492 single nucleotide polymorphisms (SNPs), subsets of evenly distributed SNPs and subsets with different allele frequencies in the pure populations, using FST as an indicator. Key results are correlations of admixture levels estimated with the various algorithms with admixture based on pedigree information. For the full set, PLSR, BayesB and structure performed in a very similar manner (correlations of 0.97), whereas the correlation of LASSO and pedigree admixture was lower (0.93). With decreasing number of SNPs, correlations decreased substantially only for 5% or 1% of all SNPs. With SNPs chosen according to FST, results were similar to results obtained with the full set. Only when using the 96 and 48 SNPs with the highest FST did correlations drop to 0.92 and 0.90, respectively. Reducing the number of pure animals in training sets to 50, 20 and 10 each did not cause a drop in the correlation with pedigree admixture.
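As a rough illustration of two steps mentioned in the summary, ranking SNPs by FST between the two pure breeds and correlating estimated admixture with pedigree admixture, the following Python sketch may help. It is not the authors' code. It assumes genotypes coded as 0/1/2 copies of one allele, two equally weighted populations, and the simple heterozygosity-based form of FST; the summary does not state which FST estimator was used. All function names and the toy data are hypothetical.

```python
from math import sqrt

def allele_freq(genotypes):
    """Frequency of the counted allele, given genotypes coded 0/1/2."""
    return sum(genotypes) / (2 * len(genotypes))

def fst_two_pops(p1, p2):
    """Heterozygosity-based FST for one biallelic SNP in two populations.

    Assumption: both populations get equal weight; the source does not
    specify its estimator.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0                          # SNP is fixed for the same allele in both
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_t - h_s) / h_t

def top_snps_by_fst(pop_a, pop_b, k):
    """Return indices of the k SNPs with the highest FST.

    pop_a, pop_b: lists of animals; each animal is a list of 0/1/2 genotypes.
    Ties are broken by the lower SNP index.
    """
    n_snps = len(pop_a[0])
    scores = []
    for j in range(n_snps):
        p1 = allele_freq([animal[j] for animal in pop_a])
        p2 = allele_freq([animal[j] for animal in pop_b])
        scores.append((fst_two_pops(p1, p2), j))
    scores.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in scores[:k]]

def pearson(x, y):
    """Pearson correlation, e.g. genomic vs. pedigree admixture estimates."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Toy data (made up): two animals per pure breed, three SNPs.
pure_si = [[2, 1, 0], [2, 1, 1]]
pure_rhf = [[0, 1, 0], [0, 1, 1]]
best = top_snps_by_fst(pure_si, pure_rhf, k=1)
```

In this toy data only the first SNP differs in allele frequency between the breeds, so it is the one selected. With real data, `pearson` would be applied to the per-animal admixture estimates from a prediction method and the corresponding pedigree-based values.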