Open Access
RNA-SeQC 2: efficient RNA-seq quality control and quantification for large cohorts
Author(s) -
Aaron Graubert,
François Aguet,
Arvind Ravi,
Kristin Ardlie,
Gad Getz
Publication year - 2021
Publication title -
Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btab135
Subject(s) - rna , computer science , sample (material) , source code , mit license , documentation , rna seq , quality (philosophy) , data mining , computational biology , software , biology , genetics , transcriptome , programming language , gene , philosophy , chemistry , chromatography , epistemology , gene expression
Post-sequencing quality control is a crucial component of RNA sequencing (RNA-seq) data generation and analysis, as sample quality can be affected by sample storage, extraction and sequencing protocols. RNA-seq is increasingly applied to cohorts ranging in size from hundreds to tens of thousands of samples, but existing tools do not readily scale to these sizes and were not designed for a wide range of sample types and qualities. Here, we describe RNA-SeQC 2, an efficient reimplementation of RNA-SeQC (DeLuca et al., 2012) that adds multiple metrics designed to characterize sample quality across a wide range of RNA-seq protocols.