Complex Variant Discovery Using Discordant Cluster Normalization
Author(s) -
Matthew Hayes,
Derrick Mullins,
Angela Nguyen
Publication year - 2020
Publication title -
journal of computational biology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.585
H-Index - 95
eISSN - 1557-8666
pISSN - 1066-5277
DOI - 10.1089/cmb.2020.0249
Subject(s) - breakpoint , computer science , normalization (sociology) , clique , genome , computational biology , structural variation , data mining , biology , chromosome , genetics , mathematics , gene , combinatorics , sociology , anthropology
Complex genomic structural variants (CGSVs) are abnormalities that present with three or more breakpoints, making their discovery a challenge. The majority of existing algorithms for structural variant detection are only designed to find simple structural variants (SSVs) such as deletions and inversions; they fail to find more complex events such as deletion-inversions or deletion-duplications, for example. In this study, we present an algorithm named CleanBreak that employs a clique partitioning graph-based strategy to identify collections of SSV clusters and then subsequently identifies overlapping SSV clusters to examine the search space of possible CGSVs, choosing the one that is most concordant with local read depth. We evaluated CleanBreak's performance on whole genome simulated data and a real data set from the 1000 Genomes Project. We also compared CleanBreak with another algorithm for CGSV discovery. The results demonstrate CleanBreak's utility as an effective method to discover CGSVs.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom