
RNA‐seq Data: Challenges in and Recommendations for Experimental Design and Analysis
Author(s) - Williams, Alexander G.; Thomas, Sean; Wyman, Stacia K.; Holloway, Alisha K.
Publication year - 2014
Publication title - Current Protocols in Human Genetics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.282
H-Index - 30
eISSN - 1934-8258
pISSN - 1934-8266
DOI - 10.1002/0471142905.hg1113s83
Subject(s) - RNA-seq, computational biology, RNA, computer science, gene, biology, data mining, gene expression, genetics, transcriptome
RNA‐seq is widely used to determine differential expression of genes or transcripts, as well as to identify novel transcripts, identify allele‐specific expression, and precisely measure translation of transcripts. Thoughtful experimental design and choice of analysis tools are critical to ensure high‐quality data and interpretable results. Important considerations for experimental design include the number of replicates, whether to collect paired‐end or single‐end reads, sequence length, and sequencing depth. Analysis steps common to all RNA‐seq experiments include quality control, read alignment, assigning reads to genes or transcripts, and estimating gene or transcript abundance. Our aims are two‐fold: to make recommendations for common components of experimental design and to assess tool capabilities for each of these steps. We also test tools designed to detect differential expression, since this is the most widespread application of RNA‐seq. We hope that these analyses will help guide those who are new to RNA‐seq and will generate discussion about remaining needs for tool improvement and development. Curr. Protoc. Hum. Genet. 83:11.13.1‐11.13.20. © 2014 by John Wiley & Sons, Inc.