Clustering distributions with the marginalized nested Dirichlet process
Author(s) - Zuanetti Daiane Aparecida, Müller Peter, Zhu Yitan, Yang Shengjie, Ji Yuan
Publication year - 2018
Publication title - Biometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.298
H-Index - 130
eISSN - 1541-0420
pISSN - 0006-341X
DOI - 10.1111/biom.12778
Subject(s) - cluster analysis, Dirichlet process, mathematics, hierarchical Dirichlet process, Dirichlet distribution, latent Dirichlet allocation, process (computing), computer science, statistics, statistical physics, artificial intelligence, topic model, mathematical analysis, physics, Bayesian probability, boundary value problem, operating system
Summary - We introduce a marginal version of the nested Dirichlet process (NDP) to cluster distributions or histograms. We apply the model to cluster genes by patterns of gene–gene interaction. The proposed approach is based on the nested partition that is implied in the original construction of the nested Dirichlet process. It allows simulation-exact inference, as opposed to inference under a truncated Dirichlet process approximation. More importantly, the construction highlights the nature of the nested Dirichlet process as a nested partition of experimental units. We apply the proposed model to cluster genes related to DNA mismatch repair (DMR) by the distribution of their gene–gene interactions with other genes. Gene–gene interactions are recorded as coefficients in an auto‐logistic model for the co‐expression of two genes, adjusting for copy number variation, methylation and protein activation. These coefficients are extracted from an online database, called Zodiac, and are computed from The Cancer Genome Atlas (TCGA) data. We compare results with three alternatives: a variation of k‐means clustering that is set up to cluster distributions, the truncated NDP, and a hierarchical clustering method. The proposed inference shows favorable performance, both under simulated conditions and in the real data sets.
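The nested partition mentioned in the summary can be illustrated with a small prior simulation. The sketch below is not the authors' sampler; it assumes the usual Chinese restaurant process (CRP) representation of a Dirichlet process partition, and the function names, the concentration parameters `alpha` and `beta`, and the `group_sizes` input are hypothetical. An outer CRP clusters the groups (for example, genes, each with its own set of observations), and within each outer cluster an inner CRP clusters the pooled observations of all groups assigned to it.

```python
import random


def crp_partition(n, concentration, rng):
    """Draw cluster labels for n items from a Chinese restaurant process."""
    labels, sizes = [], []
    for _ in range(n):
        # An existing cluster k is chosen with weight sizes[k];
        # a new cluster is opened with weight `concentration`.
        weights = sizes + [concentration]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)
        else:
            sizes[k] += 1
        labels.append(k)
    return labels


def nested_partition(group_sizes, alpha, beta, seed=0):
    """Sample a nested partition (illustrative prior draw only).

    group_sizes[j] is the number of observations in group j (hypothetical input).
    Returns (outer, inner): outer[j] is the distributional cluster of group j,
    and inner[(j, i)] = (outer cluster, observation cluster) for observation i
    of group j.
    """
    rng = random.Random(seed)
    # Outer level: cluster the groups themselves.
    outer = crp_partition(len(group_sizes), alpha, rng)
    inner = {}
    for k in sorted(set(outer)):
        # Pool the observations of every group in outer cluster k.
        members = [(j, i)
                   for j, n in enumerate(group_sizes) if outer[j] == k
                   for i in range(n)]
        # Inner level: cluster the pooled observations within cluster k.
        for unit, label in zip(members, crp_partition(len(members), beta, rng)):
            inner[unit] = (k, label)
    return outer, inner


if __name__ == "__main__":
    outer, inner = nested_partition([3, 2, 4], alpha=1.0, beta=1.0, seed=1)
    print("outer clusters per group:", outer)
    for unit in sorted(inner):
        print(unit, inner[unit])
```

Because the random measures are integrated out, a draw like this involves no truncation level; posterior inference would additionally require a likelihood and an update scheme for both levels, which this sketch does not attempt.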
