Classifying next-generation sequencing data using a zero-inflated Poisson model

Yan Zhou; Xiang Wan; Baoxue Zhang; Tiejun Tong

Open Access

Publication year - 2017

Publication title - Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.599

H-Index - 390

eISSN - 1367-4811

pISSN - 1367-4803

DOI - 10.1093/bioinformatics/btx768

Subject(s) - Poisson distribution, count data, computer science, RNA, linear discriminant analysis, Poisson regression, data mining, algorithm, mathematics, pattern recognition (psychology), artificial intelligence, statistics, gene, biology, genetics, population, demography, sociology

With the development of high-throughput techniques, RNA-sequencing (RNA-seq) is becoming increasingly popular as an alternative for gene expression analysis, such as RNA profiling and classification. Identifying which disease type a new patient belongs to from RNA-seq data has been recognized as a vital problem in medical research. Because RNA-seq data are discrete counts, statistical methods developed for classifying microarray data cannot be readily applied to RNA-seq data classification. In 2011, Witten proposed Poisson linear discriminant analysis (PLDA) to classify RNA-seq data. Note, however, that count datasets from real RNA-seq or microRNA sequencing are frequently characterized by excess zeros (for example, when the sequencing depth is insufficient, or for small RNAs with a length of 18-30 nucleotides). It is therefore desirable to develop a new model for analyzing RNA-seq data with an excess of zeros.
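The abstract does not spell out the model, so the following is only a minimal sketch of the textbook zero-inflated Poisson (ZIP) distribution and of how it could be used to score a sample against classes. It assumes the standard parameterization, in which a count is a structural zero with probability `pi` and otherwise Poisson with mean `lam`. The function names, the equal-prior classification rule, and all parameter values are hypothetical illustrations, not the authors' method.

```python
import math


def poisson_pmf(k, lam):
    """Ordinary Poisson probability of observing count k with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Textbook zero-inflated Poisson probability (assumed parameterization).

    With probability pi the count is a structural zero; otherwise it is
    drawn from Poisson(lam). Setting pi = 0 recovers the plain Poisson.
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def zip_log_likelihood(counts, params):
    """Sum of per-gene log ZIP probabilities, treating genes as independent.

    counts: list of observed counts, one per gene.
    params: list of (lam, pi) pairs, one per gene (hypothetical values).
    """
    return sum(math.log(zip_pmf(k, lam, pi))
               for k, (lam, pi) in zip(counts, params))


def classify(counts, class_params):
    """Return the class label with the highest ZIP log-likelihood.

    Assumes equal class priors; class_params maps label -> per-gene params.
    """
    return max(class_params,
               key=lambda label: zip_log_likelihood(counts, class_params[label]))


if __name__ == "__main__":
    # Made-up parameters for three genes in two classes.
    class_params = {
        "A": [(5.0, 0.0), (5.0, 0.0), (5.0, 0.0)],  # no zero inflation
        "B": [(5.0, 0.6), (5.0, 0.6), (5.0, 0.0)],  # first two genes inflated
    }
    sample = [0, 0, 5]
    print(classify(sample, class_params))
```

Under a plain Poisson with mean 5, a zero count is very unlikely, so a sample with several zeros is poorly explained by class "A" in this toy setup. The zero-inflation term `pi` is what allows class "B" to account for those zeros without lowering the mean for non-zero counts.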
