Inferring Indel Parameters using a Simulation-based Approach
Author(s) -
Eli Levy Karin,
Avigayel Rabin,
Haim Ashkenazy,
Dafna Shkedy,
Oren Avram,
Reed A. Cartwright,
Tal Pupko
Publication year - 2015
Publication title - Genome Biology and Evolution
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.702
H-Index - 74
ISSN - 1759-6653
DOI - 10.1093/gbe/evv212
Subject(s) - indel, indel mutation, set (abstract data type), biology, mahalanobis distance, multiple sequence alignment, parametric statistics, computer science, data set, sequence (biology), data mining, computational biology, pattern recognition (psychology), algorithm, artificial intelligence, sequence alignment, statistics, mathematics, genetics, gene, genotype, single nucleotide polymorphism, peptide sequence, programming language
In this study, we present a novel simulation-based methodology to infer indel parameters from multiple sequence alignments (MSAs). Our algorithm searches for the set of evolutionary parameters describing indel dynamics that best fits a given input MSA. In each step of the search, we use parametric bootstraps and the Mahalanobis distance to estimate how well a proposed set of parameters fits the input data. Using simulations, we demonstrate that our methodology can accurately infer the indel parameters for a wide variety of plausible settings. Moreover, using our methodology, we show that indel parameters vary substantially among three genomic data sets: mammals, bacteria, and retroviruses. Finally, we demonstrate how our methodology can be used to simulate MSAs based on indel parameters inferred from real data sets.
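The fitting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the summary statistics, the simulator, or the search strategy, so everything below is an assumption made for illustration. `toy_simulate_stats` is a hypothetical stand-in for simulating MSAs under a candidate parameter set and summarizing each one (here as an indel count and a mean indel length), and `best_fit` is a plain search over a fixed candidate list. Only the scoring idea, a Mahalanobis distance between the observed statistics and the cloud of simulated ones, is taken from the text.

```python
import numpy as np


def mahalanobis_distance(observed, simulated):
    """Distance of one observed statistics vector from a set of simulated vectors."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    mean = simulated.mean(axis=0)
    # Sample covariance of the simulated statistics (columns are statistics).
    cov = np.atleast_2d(np.cov(simulated, rowvar=False))
    diff = observed - mean
    # The pseudo-inverse keeps the computation defined if the covariance is singular.
    return float(np.sqrt(diff @ np.linalg.pinv(cov) @ diff))


def toy_simulate_stats(indel_rate, length_param, n_replicates, rng):
    """Hypothetical simulator: returns (indel count, mean indel length) per replicate.

    This is a toy model, not an MSA simulator; it only mimics the shape of the
    output that a parametric bootstrap would produce.
    """
    counts = rng.poisson(lam=100.0 * indel_rate, size=n_replicates)
    mean_lengths = rng.normal(loc=1.0 / length_param, scale=0.5, size=n_replicates)
    return np.column_stack([counts, mean_lengths])


def best_fit(observed, candidates, n_replicates=200, seed=0):
    """Return the candidate parameter set whose simulations lie closest to the data."""
    rng = np.random.default_rng(seed)
    scored = []
    for params in candidates:
        sims = toy_simulate_stats(*params, n_replicates, rng)
        scored.append((mahalanobis_distance(observed, sims), params))
    return min(scored, key=lambda pair: pair[0])[1]


if __name__ == "__main__":
    observed_stats = [50.0, 2.0]  # assumed statistics of an input alignment
    grid = [(0.1, 0.5), (0.5, 0.5), (0.9, 0.5)]  # (indel_rate, length_param)
    print(best_fit(observed_stats, grid))
```

The Mahalanobis distance is a natural choice for this kind of comparison because it rescales each statistic by its simulated variance and accounts for correlation between statistics, so a statistic measured in large units does not dominate the score.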