Separating measurement and expression models clarifies confusion in single-cell RNA sequencing analysis
Author(s) - Abhishek Sarkar, Matthew Stephens
Publication year - 2021
Publication title - Nature Genetics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 18.861
H-Index - 573
eISSN - 1546-1718
pISSN - 1061-4036
DOI - 10.1038/s41588-021-00873-4
Subject(s) - biology, terminology, confusion, expression (computer science), computational biology, rna, rna seq, perspective (graphical), transcriptome, gene expression, genetics, gene, computer science, artificial intelligence, psychology, philosophy, linguistics, psychoanalysis, programming language
The high proportion of zeros in typical single-cell RNA sequencing datasets has led to widespread but inconsistent use of terminology such as dropout and missing data. Here, we argue that much of this terminology is unhelpful and confusing, and outline simple ideas to help to reduce confusion. These include: (1) observed single-cell RNA sequencing counts reflect both true gene expression levels and measurement error, and carefully distinguishing between these contributions helps to clarify thinking; and (2) method development should start with a Poisson measurement model, rather than more complex models, because it is simple and generally consistent with existing data. We outline how several existing methods can be viewed within this framework and highlight how these methods differ in their assumptions about expression variation. We also illustrate how our perspective helps to address questions of biological interest, such as whether messenger RNA expression levels are multimodal among cells.
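
The separation between a measurement model and an expression model can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes observed counts are Poisson draws whose mean is a per-cell size factor times a latent expression level, and it uses a Gamma distribution as one possible expression model. The function name `simulate_counts` and all parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np

def simulate_counts(n_cells, size_factor, mean_expr, dispersion, rng):
    """Draw latent expression levels, then Poisson-measured counts.

    Expression model (an assumption for this sketch): Gamma with the given
    mean and dispersion; dispersion == 0 means every cell has the same level.
    Measurement model: counts ~ Poisson(size_factor * expression).
    """
    if dispersion > 0:
        shape = 1.0 / dispersion
        lam = rng.gamma(shape, scale=mean_expr / shape, size=n_cells)
    else:
        lam = np.full(n_cells, mean_expr)
    counts = rng.poisson(size_factor * lam)
    return lam, counts

rng = np.random.default_rng(0)
n_cells, size_factor, mean_expr = 100_000, 5000.0, 1e-4

# Case 1: no expression variation, so all zeros come from Poisson sampling.
lam_const, counts_const = simulate_counts(n_cells, size_factor, mean_expr, 0.0, rng)
zero_frac_const = np.mean(counts_const == 0)
poisson_zero_frac = np.exp(-size_factor * mean_expr)  # P(X = 0) under Poisson

# Case 2: Gamma expression variation; marginally the counts are negative binomial.
dispersion = 1.0
lam_gamma, counts_gamma = simulate_counts(n_cells, size_factor, mean_expr, dispersion, rng)
zero_frac_gamma = np.mean(counts_gamma == 0)
nb_zero_frac = (1.0 + size_factor * mean_expr * dispersion) ** (-1.0 / dispersion)

print(f"constant expression: observed {zero_frac_const:.3f}, Poisson predicts {poisson_zero_frac:.3f}")
print(f"Gamma expression:    observed {zero_frac_gamma:.3f}, NB predicts {nb_zero_frac:.3f}")
```

In both cases the latent expression level is strictly positive in every cell, yet most observed counts are zero. Under these assumptions the zeros arise from sampling at low expected counts and need no separate dropout mechanism; what changes between the two cases is only the expression model.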
