Open Access
SeqArray—a storage-efficient high-performance data format for WGS variant calls
Author(s) - Xiuwen Zheng, Stephanie M. Gogarten, Michael C. Lawrence, Adrienne M. Stilp, Matthew P. Conomos, Bruce S. Weir, Cathy C. Laurie, David Levine
Publication year - 2017
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btx145
Subject(s) - computer science, database, operating system
Whole-genome sequencing (WGS) data are being generated at an unprecedented rate. Analysis of WGS data requires a flexible data format to store the different types of DNA variation. Variant call format (VCF) is a general text-based format developed to store variant genotypes and their annotations. However, VCF files are large and data retrieval is relatively slow. Here we introduce a new WGS variant data format, implemented in the R/Bioconductor package 'SeqArray', that stores variant calls in an array-oriented manner. It provides the same capabilities as VCF, but with multiple high-compression options and data access using high-performance parallel computing.
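
To make the contrast with a text-based, record-per-line format concrete, the following is a minimal Python sketch of array-oriented storage: each field (positions, genotypes) is kept as its own compressed binary array, so a reader can load one field without parsing the others. This is an illustration only and not SeqArray's actual on-disk layout or API. The toy data, the dosage coding (0/1/2), and the helper names `pack_columns`, `read_positions`, and `read_genotypes` are all assumptions made for this example.

```python
import zlib

# Hypothetical toy data: 4 variants x 3 samples.
# Genotypes are coded as alternate-allele dosage (0, 1 or 2); this coding
# is an assumption for illustration, not the SeqArray encoding.
positions = [101, 250, 333, 478]
genotypes = [
    [0, 1, 2],
    [0, 0, 1],
    [2, 2, 2],
    [1, 0, 0],
]


def pack_columns(positions, genotypes):
    """Store each field as a separate zlib-compressed byte array."""
    # Positions: fixed-width 4-byte little-endian integers.
    pos_bytes = b"".join(p.to_bytes(4, "little") for p in positions)
    # Genotypes: one byte per call, variant-major order.
    geno_bytes = bytes(g for row in genotypes for g in row)
    return {
        "n_sample": len(genotypes[0]),
        "position": zlib.compress(pos_bytes),
        "genotype": zlib.compress(geno_bytes),
    }


def read_positions(store):
    """Decode only the position array; the genotype array is not touched."""
    raw = zlib.decompress(store["position"])
    return [int.from_bytes(raw[i:i + 4], "little") for i in range(0, len(raw), 4)]


def read_genotypes(store, variant_index):
    """Return the genotype calls of one variant.

    This sketch decompresses the whole genotype array and then slices it;
    a production format would typically compress in blocks so that only
    the relevant block needs to be read.
    """
    raw = zlib.decompress(store["genotype"])
    n = store["n_sample"]
    start = variant_index * n
    return list(raw[start:start + n])


store = pack_columns(positions, genotypes)
```

Calling `read_positions(store)` recovers the position list without decoding any genotypes, and `read_genotypes(store, 2)` returns the calls for the third variant. Keeping fields in separate arrays is also what allows different compression settings per field and independent reads by parallel workers.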
