CanvasDB: a local database infrastructure for analysis of targeted- and whole genome re-sequencing projects
Author(s) - Adam Ameur, Ignas Bunikis, Stefan Enroth, Ulf Gyllensten
Publication year - 2014
Publication title - Database
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.406
H-Index - 62
ISSN - 1758-0463
DOI - 10.1093/database/bau098
Subject(s) - indel, scalability, genome, exome sequencing, computer science, identification (biology), 1000 genomes project, massive parallel sequencing, exome, whole genome sequencing, dna sequencing, computational biology, database, biology, single nucleotide polymorphism, genetics, mutation, gene, botany, genotype
CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database designed to handle very large datasets, allowing rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human whole-exome (WES) or whole-genome sequencing (WGS) projects. A built-in filtering function takes into account variant calls from all individual samples simultaneously. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses run within seconds, even when the database holds several hundred samples and millions of variants. We demonstrate the scalability of CanvasDB by importing the individual variant calls from all 1092 individuals in the 1000 Genomes Project, over 4.4 billion SNPs and indels in total. Our results show that CanvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server. Database URL: https://github.com/UppsalaGenomeCenter/CanvasDB.
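The group-wise filtering idea described in the abstract can be illustrated with a minimal sketch. This is not CanvasDB's code, schema, or R interface: the table layout, the function names `build_demo_db` and `filter_variants`, and the sample names are all hypothetical, and Python with SQLite is used here only to keep the example self-contained. The sketch stores one row per variant call and returns the variants called in every sample of an "in" group and in no sample of an "out" group, which is one plausible way to look for candidate variants within a family.

```python
import sqlite3


def build_demo_db(calls):
    """Create an in-memory table of variant calls, one row per (sample, variant).

    The schema is an assumption made for this sketch, not CanvasDB's own.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE calls (sample TEXT, chrom TEXT, pos INTEGER, ref TEXT, alt TEXT)"
    )
    conn.executemany("INSERT INTO calls VALUES (?, ?, ?, ?, ?)", calls)
    # An index on the variant key helps the grouped query on larger tables.
    conn.execute("CREATE INDEX idx_variant ON calls (chrom, pos, ref, alt)")
    return conn


def filter_variants(conn, in_group, out_group):
    """Return variants called in every in_group sample and in no out_group sample."""
    in_marks = ",".join("?" * len(in_group))
    out_marks = ",".join("?" * len(out_group))
    query = f"""
        SELECT chrom, pos, ref, alt
        FROM calls
        GROUP BY chrom, pos, ref, alt
        HAVING COUNT(DISTINCT CASE WHEN sample IN ({in_marks}) THEN sample END) = ?
           AND COUNT(DISTINCT CASE WHEN sample IN ({out_marks}) THEN sample END) = 0
        ORDER BY chrom, pos
    """
    # Parameters follow the order of the placeholders in the query text.
    params = [*in_group, len(set(in_group)), *out_group]
    return conn.execute(query, params).fetchall()


# Hypothetical family: two affected siblings and two unaffected parents.
demo_calls = [
    ("child", "1", 1000, "A", "G"),
    ("child", "1", 2000, "C", "T"),
    ("child", "2", 500, "G", "GA"),
    ("mother", "1", 1000, "A", "G"),
    ("father", "2", 500, "G", "GA"),
    ("sibling", "1", 2000, "C", "T"),
]
conn = build_demo_db(demo_calls)
shared = filter_variants(conn, ["child", "sibling"], ["mother", "father"])
print(shared)  # [('1', 2000, 'C', 'T')]
```

A single grouped query evaluates every variant against all samples at once, which mirrors the "all samples simultaneously" behaviour the abstract describes. How CanvasDB actually implements its filter is not stated in this record.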