Open Access
The effect of genome graph expressiveness on the discrepancy between genome graph distance and string set distance
Author(s) - Yutong Qiu, Carl Kingsford
Publication year - 2022
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btac264
Subject(s) - edit distance, string (physics), genome, graph traversal, mathematics, combinatorics, graph, string metric, discrete mathematics, computer science, algorithm, string searching algorithm, data structure, biology, genetics, gene, mathematical physics, programming language
Intra-sample heterogeneity describes the phenomenon where a genomic sample contains a diverse set of genomic sequences. In practice, the true string sets in a sample are often unknown due to limitations in sequencing technology. To compare heterogeneous samples, genome graphs can be used to represent such sets of strings. However, a genome graph can generally represent a string set universe that contains multiple sets of strings in addition to the true string set. This difference between genome graphs and string sets is not well characterized. As a result, a distance metric between genome graphs may not match the distance between the true string sets.
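
The gap between a genome graph and the string set behind it can be illustrated with a toy example. The sketch below is not the paper's construction: it assumes a small node-labeled directed acyclic graph, assumes that strings are spelled by source-to-sink paths, and assumes that a string set "fits" the graph when its paths together use every edge. The names `LABELS`, `EDGES`, `paths`, and `covers_graph` are all hypothetical.

```python
# Hypothetical toy genome graph (not taken from the paper):
# each node carries one base, and edges are directed.
LABELS = {"s": "A", "x1": "C", "x2": "G", "m": "T", "y1": "C", "y2": "G", "t": "A"}
EDGES = {
    "s": ["x1", "x2"],
    "x1": ["m"],
    "x2": ["m"],
    "m": ["y1", "y2"],
    "y1": ["t"],
    "y2": ["t"],
    "t": [],
}


def paths(node, sink):
    """Yield (string, edges_used) for every path from node to sink."""
    if node == sink:
        yield LABELS[node], frozenset()
        return
    for nxt in EDGES[node]:
        for suffix, used in paths(nxt, sink):
            yield LABELS[node] + suffix, used | {(node, nxt)}


# Map each spelled string to the edges of its path.
# In this toy graph every string corresponds to exactly one path.
PATHS = dict(paths("s", "t"))
ALL_EDGES = {(u, v) for u, targets in EDGES.items() for v in targets}


def covers_graph(string_set):
    """Assumed notion of fit: the strings' paths together use every edge."""
    used = set().union(*(PATHS[s] for s in string_set))
    return used == ALL_EDGES


true_set = {"ACTCA", "AGTGA"}   # the strings actually in the sample
other_set = {"ACTGA", "AGTCA"}  # a different set the same graph also admits

print(sorted(PATHS))
print(covers_graph(true_set), covers_graph(other_set))
```

Under these assumptions the graph spells four strings, and both two-string sets use every edge even though they share no string. A distance computed on the graph alone therefore cannot tell which of the two sets was the true one.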
