The impact of truncating data on the predictive ability for single‐step genomic best linear unbiased prediction
Author(s) -
Howard Jeremy T.,
Rathje Tom A.,
Bruns Caitlyn E.,
Wilson‐Wells Danielle F.,
Kachman Stephen D.,
Spangler Matthew L.
Publication year - 2018
Publication title -
Journal of Animal Breeding and Genetics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.689
H-Index - 51
eISSN - 1439-0388
pISSN - 0931-2668
DOI - 10.1111/jbg.12334
Subject(s) - best linear unbiased prediction, selection (genetic algorithm), predictive value, data set, statistics, genomic selection, biology, mathematics, genotype, genetics, computer science, artificial intelligence, gene, single nucleotide polymorphism, medicine
Abstract Simulated and swine industry data sets were used to assess the impact of removing older data on the predictive ability of selection candidate estimated breeding values (EBV) when using single‐step genomic best linear unbiased prediction (ssGBLUP). The simulated data comprised thirty replicates designed to mimic the structure of swine data sets. For the simulated data, varying amounts of data were truncated based on the number of ancestral generations back from the selection candidates. The swine data sets consisted of phenotypic and genotypic records for three traits across two breeds on animals born from 2003 to 2017. Phenotypes and genotypes were iteratively removed 1 year at a time based on the year an animal was born. For the swine data sets, correlations between corrected phenotypes (Cp) and EBV were used to evaluate the predictive ability on young animals born in 2016–2017. In the simulated data set, keeping data from two or more generations back resulted in no statistical difference (p‐value > 0.05) in the reduction in the true breeding value at generation 15 compared to utilizing all available data. Across swine data sets, removing phenotypes from animals born prior to 2011 resulted in a negligible or slight numerical increase in the correlation between Cp and EBV. Truncating data is therefore a way to alleviate computational issues without negatively impacting the predictive ability of selection candidate EBV.
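The year-by-year truncation and the Cp–EBV correlation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not fit an ssGBLUP model: the record layout (`id`, `birth_year`, `cp`), the function names, and the assumption that training data stop before the validation birth years are all hypothetical choices made for illustration. The EBV are assumed to come from whatever evaluation software is rerun on each truncated data set.

```python
from math import sqrt


def truncate_by_birth_year(records, first_year_kept):
    """Keep only records of animals born in first_year_kept or later."""
    return [r for r in records if r["birth_year"] >= first_year_kept]


def truncation_scenarios(records, first_year, last_year):
    """Yield (cutoff, subset) pairs, dropping one more birth year per step."""
    for cutoff in range(first_year, last_year + 1):
        yield cutoff, truncate_by_birth_year(records, cutoff)


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def predictive_ability(validation_records, ebv_by_id):
    """Correlation between corrected phenotypes (Cp) and EBV on validation animals.

    ebv_by_id is a hypothetical mapping from animal id to the EBV obtained
    from an external evaluation run on the truncated training data.
    """
    cp = [r["cp"] for r in validation_records]
    ebv = [ebv_by_id[r["id"]] for r in validation_records]
    return pearson(cp, ebv)
```

For each cutoff produced by `truncation_scenarios`, one would rerun the genetic evaluation on the retained records and pass the resulting EBV for the young validation animals to `predictive_ability`, then compare the correlations across cutoffs.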