Major Errors in Data and Their Effect on Response to Selection
Author(s) - Mackay I. J., Caligari P. D. S.
Publication year - 1999
Publication title - Crop Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.76
H-Index - 147
eISSN - 1435-0653
pISSN - 0011-183X
DOI - 10.2135/cropsci1999.0011183x003900020016x
Subject(s) - selection (genetic algorithm) , statistics , outlier , word error rate , population , biology , computer science , range (aeronautics) , mathematics , machine learning , artificial intelligence , engineering , sociology , demography , aerospace engineering
Outliers in data can usually be detected by data validation routines, but some major errors escape detection because they fall within an acceptable range of values. In a plant breeding program, although these errors may be rare, they could reduce response to selection by an amount disproportionate to their frequency. We used stochastic computer simulations to assess the effect of such errors on response to selection. Combinations of high (1%) and low (0.1%) error rates were simulated, with between 1 and 10 individuals selected from populations of size 100 or 1000. Four different error types were simulated by adjusting the means and variances of the simulated major errors. Major errors caused large reductions in response to selection, especially when present at an error rate of 1% with a population of size 1000. Under such circumstances response to selection may actually increase if selection intensity is reduced. At the 0.1% error rate, and in populations of size 100, the reduction in response to selection was less marked. Data validation methods, in which the most extreme observations were rejected prior to selection, usually reduced response to selection and therefore should not be used routinely. In addition to their effect on selection programs, major errors will also reduce the efficiency of bulked segregant analysis. These results confirm that vigilance and careful experimental technique repay their time and effort. Data on the frequency and distribution of major errors are required to achieve a better understanding of their effect and define the best procedure to handle their presence.
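
The kind of stochastic simulation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' program: the function name `simulate_response`, the heritability `h2`, the use of normal distributions, and the choice to let a major error replace the recorded value entirely are all assumptions made for illustration. Only the population sizes, numbers selected, and error rates come from the abstract; `error_mean` and `error_sd` stand in for the "means and variances of the simulated major errors".

```python
import random


def simulate_response(pop_size, n_selected, error_rate, h2=0.5,
                      error_mean=0.0, error_sd=1.0, n_reps=200, seed=1):
    """Average true genetic value of the selected individuals.

    Hypothetical model (assumed, not taken from the paper):
    genetic value g ~ N(0, h2), recorded value p = g + N(0, 1 - h2),
    so error-free records have unit phenotypic variance. With
    probability error_rate a record is a major error and is replaced
    by a draw from N(error_mean, error_sd), unrelated to g.
    """
    rng = random.Random(seed)
    sd_g = h2 ** 0.5
    sd_e = (1.0 - h2) ** 0.5
    total = 0.0
    for _ in range(n_reps):
        records = []
        for _ in range(pop_size):
            g = rng.gauss(0.0, sd_g)
            if rng.random() < error_rate:
                # Major error: the recorded value carries no genetic signal.
                p = rng.gauss(error_mean, error_sd)
            else:
                p = g + rng.gauss(0.0, sd_e)
            records.append((p, g))
        # Truncation selection on the recorded (possibly erroneous) values.
        records.sort(key=lambda r: r[0], reverse=True)
        selected = records[:n_selected]
        total += sum(g for _, g in selected) / n_selected
    return total / n_reps


if __name__ == "__main__":
    clean = simulate_response(1000, 1, 0.0)
    with_errors = simulate_response(1000, 1, 0.01, error_mean=2.0)
    print(f"no errors: {clean:.2f}  1% errors: {with_errors:.2f}")
```

Under these assumptions, comparing a run with `error_rate=0.0` against runs at 0.01 or 0.001, for different values of `pop_size` and `n_selected`, gives a rough feel for how rare errors that land among the top-ranked records can dilute the genetic value of the selected group. The size of the effect depends heavily on the assumed error distribution.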