BCFtools/RoH: a hidden Markov model approach for detecting autozygosity from next-generation sequencing data
Author(s) -
Vagheesh M. Narasimhan,
Petr Danecek,
Aylwyn Scally,
Yali Xue,
Chris Tyler-Smith,
Richard Durbin
Publication year - 2016
Publication title -
Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btw044
Subject(s) - Sanger sequencing, computer science, hidden Markov model, runs of homozygosity, software, 1000 Genomes Project, genome, exome sequencing, exome, computational biology, DNA sequencing, data mining, biology, genetics, genotype, mutation, artificial intelligence, gene, single nucleotide polymorphism, programming language
Runs of homozygosity (RoHs) are genomic stretches of a diploid genome that show identical alleles on both chromosomes. Longer RoHs are unlikely to have arisen by chance but are likely to denote autozygosity, whereby both copies of the genome descend from the same recent ancestor. Early tools to detect RoH used genotype array data, but substantially more information is available from sequencing data. Here, we present and evaluate BCFtools/RoH, an extension to the BCFtools software package, that detects regions of autozygosity in sequencing data, in particular exome data, using a hidden Markov model. By applying it to simulated data and real data from the 1000 Genomes Project we estimate its accuracy and show that it has higher sensitivity and specificity than existing methods under a range of sequencing error rates and levels of autozygosity.
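To make the hidden Markov model idea concrete, the following is a minimal illustrative sketch in Python, not the BCFtools/RoH implementation. It assumes a two-state model with hypothetical state labels "AZ" (autozygous) and "HW" (Hardy-Weinberg), hard genotype calls reduced to a heterozygous/homozygous flag per site, a fixed per-site transition probability, and a single genotyping error rate. The function name `viterbi_roh` and the parameters `p_enter`, `p_exit` and `err` are all assumptions made for illustration; a real caller would typically work from genotype likelihoods and distance-dependent transition rates.

```python
import math


def viterbi_roh(is_het, alt_freq, p_enter=0.01, p_exit=0.01, err=0.001):
    """Label each site 'AZ' or 'HW' with the Viterbi algorithm.

    is_het   -- list of bools, True where the called genotype is heterozygous
    alt_freq -- list of alternate-allele frequencies, one per site
    p_enter  -- assumed per-site probability of switching HW -> AZ
    p_exit   -- assumed per-site probability of switching AZ -> HW
    err      -- assumed probability of seeing a het call inside an AZ region
    """
    if not is_het:
        return []

    states = ("HW", "AZ")
    log_trans = {
        "HW": {"HW": math.log(1 - p_enter), "AZ": math.log(p_enter)},
        "AZ": {"HW": math.log(p_exit), "AZ": math.log(1 - p_exit)},
    }

    def log_emit(state, het, f):
        # Under HW a het occurs with probability 2f(1-f); under AZ only by error.
        p_het = err if state == "AZ" else 2 * f * (1 - f)
        p_het = min(max(p_het, 1e-9), 1 - 1e-9)  # keep the logarithm finite
        return math.log(p_het if het else 1 - p_het)

    # Initialise with a uniform prior over the two states.
    score = {s: math.log(0.5) + log_emit(s, is_het[0], alt_freq[0]) for s in states}
    back = []  # back[i][s] is the best predecessor of state s at site i + 1

    for het, f in zip(is_het[1:], alt_freq[1:]):
        new_score, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda r: score[r] + log_trans[r][s])
            new_score[s] = score[prev] + log_trans[prev][s] + log_emit(s, het, f)
            ptr[s] = prev
        score = new_score
        back.append(ptr)

    # Trace back from the best final state to recover the full path.
    path = [max(states, key=lambda s: score[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    # Toy input: five het sites, forty hom sites, five het sites, all at f = 0.5.
    calls = [True] * 5 + [False] * 40 + [True] * 5
    freqs = [0.5] * len(calls)
    print("".join("A" if s == "AZ" else "." for s in viterbi_roh(calls, freqs)))
```

The sketch works in log space to avoid numerical underflow on long chromosomes. The allele frequency matters because a homozygous call at a site with a common alternate allele is stronger evidence for autozygosity than one at a site where nearly everyone is homozygous anyway.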