IGD: high-performance search for large-scale genomic interval datasets
Author(s) - Jianglin Feng, Nathan C. Sheffield
Publication year - 2020
Publication title - Bioinformatics
Language(s) - Uncategorized
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btaa1062
Subject(s) - interval (graph theory), computer science, computational biology, genetics
Databases of large-scale genome projects now contain thousands of genomic interval datasets. These data are a critical resource for understanding the function of DNA. However, our ability to examine and integrate interval data of this scale is limited. Here, we introduce the integrated genome database (IGD), a method and tool for searching genome interval datasets more than three orders of magnitude faster than existing approaches, while using only one hundredth of the memory. IGD uses a novel linear binning method that allows us to scale analysis to billions of genomic regions.
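
The abstract names linear binning but gives no details, so the following is only a minimal sketch of the general idea and not the IGD implementation. It assumes half-open intervals [start, end), fixed-width bins, and copying each interval into every bin it touches. The class name `LinearBinIndex`, the bin width, and the `dataset_id` field are hypothetical and chosen for illustration.

```python
from collections import defaultdict

BIN_SIZE = 16384  # hypothetical bin width in base pairs, not taken from the paper


class LinearBinIndex:
    """Toy linear-binning index over half-open intervals [start, end)."""

    def __init__(self, bin_size=BIN_SIZE):
        self.bin_size = bin_size
        # (chromosome, bin number) -> list of (start, end, dataset_id)
        self.bins = defaultdict(list)

    def _bin_range(self, start, end):
        # Bin numbers covered by [start, end); requires start < end.
        return range(start // self.bin_size, (end - 1) // self.bin_size + 1)

    def add(self, chrom, start, end, dataset_id):
        # Store the interval in every bin it overlaps.
        for b in self._bin_range(start, end):
            self.bins[(chrom, b)].append((start, end, dataset_id))

    def search(self, chrom, start, end):
        # Scan only the bins the query overlaps. A set removes the
        # duplicates created by intervals that span several bins.
        hits = set()
        for b in self._bin_range(start, end):
            for s, e, d in self.bins.get((chrom, b), ()):
                if s < end and e > start:
                    hits.add((s, e, d))
        return sorted(hits)
```

Because the bin number is computed directly from a coordinate, a query touches only the few bins it overlaps instead of every stored interval. A production tool would also need an on-disk layout and a cheaper way to avoid duplicate hits than a set, neither of which this sketch attempts.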
