A probabilistic gene expression barcode for annotation of cell types from single-cell RNA-seq data

Isabella N. Grabski, Rafael A. Irizarry

Open Access

Publication year - 2022

Publication title - Biostatistics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.493

H-Index - 82

eISSN - 1468-4357

pISSN - 1465-4644

DOI - 10.1093/biostatistics/kxac021

Subject(s) - overfitting, annotation, computer science, RNA-seq, computational biology, barcode, gene, data mining, biology, artificial intelligence, gene expression, genetics, transcriptome, artificial neural network, operating system

Single-cell RNA sequencing (scRNA-seq) quantifies gene expression for individual cells in a sample, which allows distinct cell-type populations to be identified and characterized. An important step in many scRNA-seq analysis pipelines is the annotation of cells with known cell types. While this can be achieved using experimental techniques, such as fluorescence-activated cell sorting, these approaches are impractical for large numbers of cells. This motivates the development of data-driven cell-type annotation methods. We find that current approaches are limited either by their reliance on known marker genes or by overfitting caused by systematic differences between studies, known as batch effects. Here, we present a statistical approach that leverages public data sets to combine information across thousands of genes, uses a latent variable model to define cell-type-specific barcodes and account for batch effect variation, and probabilistically annotates cell-type identity from a reference of known cell types. The barcoding approach also provides a new way to discover marker genes. Using a range of data sets, including those generated to represent imperfect real-world reference data, we demonstrate that our approach substantially outperforms current reference-based methods, particularly when predicting across studies.
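The idea of annotating a cell probabilistically from cell-type-specific barcodes can be illustrated with a small toy sketch. This is not the authors' latent variable model and it does not model batch effects. It assumes, purely for illustration, that each gene is binarized to "on" or "off", that a barcode is a vector of per-gene "on" probabilities for a cell type, and that genes are independent given the cell type. The function name `annotate`, the cell-type labels, and all numbers are hypothetical.

```python
import numpy as np


def annotate(cell_on, barcodes, prior=None, eps=1e-6):
    """Posterior probability of each cell type for one binarized cell.

    cell_on  : (n_genes,) array of 0/1 flags, 1 if the gene is "on" in this cell
    barcodes : (n_types, n_genes) array of P(gene on | cell type); toy values,
               not estimated from any real reference
    prior    : (n_types,) prior over cell types; uniform if omitted
    """
    # Clip probabilities away from 0 and 1 so the logs stay finite.
    barcodes = np.clip(np.asarray(barcodes, dtype=float), eps, 1 - eps)
    cell_on = np.asarray(cell_on, dtype=float)
    n_types = barcodes.shape[0]
    if prior is None:
        prior = np.full(n_types, 1.0 / n_types)

    # Independent Bernoulli log-likelihood of the cell under each barcode.
    log_lik = cell_on @ np.log(barcodes).T + (1 - cell_on) @ np.log1p(-barcodes).T

    # Bayes' rule in log space, normalized with the max subtracted for stability.
    log_post = np.log(prior) + log_lik
    log_post -= log_post.max()
    post = np.exp(log_post)
    return post / post.sum()


# Hypothetical reference: two cell types, four genes.
cell_types = ["type A", "type B"]
barcodes = np.array([
    [0.9, 0.9, 0.1, 0.1],  # type A: genes 0 and 1 are usually on
    [0.1, 0.1, 0.9, 0.9],  # type B: genes 2 and 3 are usually on
])

clear_cell = np.array([1, 1, 0, 0])      # matches the type A barcode
ambiguous_cell = np.array([1, 0, 1, 0])  # equally consistent with both types

clear_post = annotate(clear_cell, barcodes)
ambiguous_post = annotate(ambiguous_cell, barcodes)
print(dict(zip(cell_types, clear_post.round(4))))
print(dict(zip(cell_types, ambiguous_post.round(4))))
```

Because the output is a probability for every reference cell type rather than a single label, a cell whose expression pattern fits several barcodes equally well is reported as uncertain instead of being forced into one class.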
