BGData - A Suite of R Packages for Genomic Analysis with Big Data
Author(s) - Alexander Grueneberg, Gustavo de los Campos
Publication year - 2019
Publication title - G3: Genes, Genomes, Genetics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.468
H-Index - 66
ISSN - 2160-1836
DOI - 10.1534/g3.119.400018
Subject(s) - suite , computer science , biobank , r package , set (abstract data type) , class (philosophy) , big data , data set , interface (matter) , binary number , data mining , operating system , programming language , bioinformatics , artificial intelligence , biology , history , arithmetic , mathematics , archaeology , bubble , maximum bubble pressure method
We created a suite of packages to enable analysis of extremely large genomic data sets (potentially millions of individuals and millions of molecular markers) within the R environment. The suite offers: a matrix-like interface for .bed files (PLINK's binary format for genotype data); a novel class of linked arrays that links data stored in multiple files into a single array accessible from the R computing environment; parallel computing methods that carry out computations on very large data sets without loading the entire data set into memory; and a basic set of methods for statistical genetic analyses. The packages are accessible through CRAN and GitHub. In this note, we describe the classes and methods implemented in each of the packages that make up the suite and illustrate their use with data from the UK Biobank.
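
The linked-array and chunked-computation ideas described above can be illustrated with a small conceptual sketch. It is written in Python purely for illustration and is not the suite's R API: the class name `ColumnLinkedMatrix`, the helper `chunked_col_means`, and the toy genotype blocks are all hypothetical, and in-memory lists stand in for the per-file matrices that the real packages would map from disk.

```python
from bisect import bisect_right


class ColumnLinkedMatrix:
    """Hypothetical sketch: expose several blocks with the same rows
    (e.g. one block per chromosome file) as a single wide matrix."""

    def __init__(self, *blocks):
        n_rows = len(blocks[0])
        if any(len(b) != n_rows for b in blocks):
            raise ValueError("all blocks must have the same number of rows")
        self.blocks = blocks
        self.n_rows = n_rows
        # starts[k] is the global index of the first column of block k
        self.starts = []
        total = 0
        for b in blocks:
            self.starts.append(total)
            total += len(b[0])
        self.n_cols = total

    def column(self, j):
        """Return global column j by locating the block that owns it."""
        if not 0 <= j < self.n_cols:
            raise IndexError(j)
        k = bisect_right(self.starts, j) - 1
        local = j - self.starts[k]
        return [row[local] for row in self.blocks[k]]


def chunked_col_means(m, chunk_size):
    """Compute column means one chunk of columns at a time, so only
    chunk_size columns need to be held at once (assumed strategy)."""
    means = []
    for start in range(0, m.n_cols, chunk_size):
        stop = min(start + chunk_size, m.n_cols)
        chunk = [m.column(j) for j in range(start, stop)]
        means.extend(sum(col) / len(col) for col in chunk)
    return means


# Toy allele counts: 3 individuals, 2 markers in block a, 1 in block b
a = [[0, 1], [2, 0], [1, 1]]
b = [[2], [2], [0]]
m = ColumnLinkedMatrix(a, b)
means = chunked_col_means(m, chunk_size=2)
```

The design point is that the blocks are never concatenated: an index lookup routes each column request to the block that owns it, and the chunk size bounds how much data is materialized at any moment.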