Detecting Microsatellites in Genome Data: Variance in Definitions and Bioinformatic Approaches Cause Systematic Bias
Author(s) - Angelika Merkel, Neil J. Gemmell
Publication year - 2008
Publication title - Evolutionary Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.502
H-Index - 32
ISSN - 1176-9343
DOI - 10.4137/ebo.s420
Subject(s) - microsatellite, genome, tandem repeat, computational biology, in silico, biology, variance (accounting), evolutionary biology, computer science, genetics, gene, allele, accounting, business
Microsatellites are currently among the most commonly used genetic markers, and bioinformatic tools have become common practice in the study of these short tandem repeats (STRs). However, in silico studies can suffer from study bias. Using a meta-analysis of microsatellite distribution in yeast, we show that the numbers of repeats reported by different studies can differ by several orders of magnitude, even within a single genome. These differences arise because different search paradigms use varying definitions of microsatellites, spanning repeat size, array length and array composition, with minimum array length being the main influencing factor. Structural differences in the implemented search algorithms additionally contribute to variation in the number of repeats detected. We suggest that future studies adopt a consistent approach to STR searches in order to improve the power of intra- and interspecific comparisons.
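The effect of minimum array length described in the abstract can be illustrated with a small sketch. This is not the search method of any study covered by the meta-analysis; it is a simple regular-expression scan for perfect repeats only. The function name `find_perfect_strs`, its parameters, and the toy sequence `demo` are all hypothetical and chosen purely for illustration.

```python
import re

def find_perfect_strs(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    Illustrative only: perfect (uninterrupted) arrays, motif lengths
    min_unit..max_unit, and at least min_repeats copies (min_repeats >= 2).
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A motif of unit_len bases followed by at least (min_repeats - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip motifs that are themselves repeats of a shorter motif
            # (e.g. "ACAC" is reported under "AC", not as a tetranucleotide).
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return hits

# Hypothetical toy sequence: (AC)5, (CAG)3 and (A)4 with short spacers.
demo = "GG" + "AC" * 5 + "TT" + "CAG" * 3 + "T" + "A" * 4 + "C"

# The same sequence yields different repeat counts under different thresholds.
counts = {n: len(find_perfect_strs(demo, min_repeats=n)) for n in (2, 3, 5)}
print(counts)
```

Even on this short sequence, changing only the minimum copy number changes how many arrays are reported, which is the kind of definition-dependent variation the abstract describes. Real tools also differ in how they treat imperfect repeats and overlapping hits, which this sketch does not model.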