Open Access
Item response theory modeling for microarray gene expression data
Author(s) - Andrej Kastrin
Publication year - 2009
Publication title - Metodološki zvezki
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.127
H-Index - 7
eISSN - 1854-0031
pISSN - 1854-0023
DOI - 10.51936/mpqj3248
Subject(s) - latent variable, microarray analysis techniques, dimensionality reduction, latent variable model, gene chip analysis, local independence, cluster analysis, latent class model, feature selection, data mining, computer science, mathematics, statistics, microarray, artificial intelligence, gene, gene expression, biology, genetics
The high dimensionality of global gene expression profiles, where the number of variables (genes) is very large compared to the number of observations (samples), presents challenges that affect the generalizability and applicability of microarray analysis. Latent variable modeling offers a promising approach to dealing with high-dimensional microarray data. The latent variable model is based on a few latent variables that capture most of the gene expression information. Here, we describe how to reduce dimension with a latent variable methodology, which can greatly reduce the number of features used to characterize microarray data. We propose a general latent variable framework for predicting predefined classes of samples from gene expression profiles obtained in microarray experiments. The framework consists of (i) selection of a smaller number of genes that are most differentially expressed between samples, (ii) dimension reduction using hierarchical clustering, where each cluster partition is identified as a latent variable, (iii) discretization of the gene expression matrix, (iv) fitting the Rasch item response model to the genes in each cluster partition to estimate the expression of the latent variable, and (v) construction of a prediction model with the latent variables as covariates to study the relationship between latent variables and phenotype. Two different microarray data sets are used to illustrate the framework. We show that the predictive performance of our method is comparable to that of the current best approach based on the all-gene space. The method is general and can be applied to other high-dimensional data problems.
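Step (iv) can be illustrated with a minimal sketch. Under the Rasch model, the probability that a discretized gene (the "item") is "on" in a sample is 1 / (1 + exp(-(theta - b))), where theta is the sample's latent variable value and b is the gene's difficulty. The code below estimates theta for one sample by Newton-Raphson maximum likelihood. It assumes binary discretization and item difficulties that are already known; the abstract does not state how the paper discretizes the data or estimates the parameters, so the function names `rasch_prob` and `estimate_theta` and the example values are hypothetical.

```python
import math


def rasch_prob(theta, b):
    """Rasch probability that an item is 'on' given ability theta and difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate_theta(responses, difficulties, n_iter=50, tol=1e-8):
    """Maximum-likelihood estimate of theta for one sample.

    Assumption for illustration: item difficulties are fixed and known.
    responses: list of 0/1 values, one per gene in the cluster.
    """
    score = sum(responses)
    if score == 0 or score == len(responses):
        # The likelihood has no finite maximum for all-0 or all-1 patterns.
        raise ValueError("ML estimate is not finite for all-0 or all-1 patterns")
    theta = 0.0
    for _ in range(n_iter):
        probs = [rasch_prob(theta, b) for b in difficulties]
        grad = score - sum(probs)               # derivative of the log-likelihood
        info = sum(p * (1.0 - p) for p in probs)  # Fisher information
        step = grad / info
        theta += step
        if abs(step) < tol:
            break
    return theta


# Hypothetical cluster of three genes with made-up difficulties.
difficulties = [-1.0, 0.0, 1.0]
theta_hat = estimate_theta([1, 1, 0], difficulties)
```

In the framework described above, one such estimate per cluster and per sample would form the covariates of the prediction model in step (v).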
