Observation‐based missing data methods for exploratory data analysis to unveil the connection between observations and variables in latent subspace models
Author(s) -
Camacho José
Publication year - 2011
Publication title -
Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.1405
Subject(s) - subspace topology, latent variable, missing data, computer science, data set, exploratory data analysis, probabilistic latent semantic analysis, latent variable model, data mining, latent class model, set (abstract data type), statistics, mathematics, artificial intelligence, machine learning, programming language
This paper introduces a class of methods to infer the relationship between observations and variables in latent subspace models. The approach is a modification of the recently proposed missing data methods for exploratory data analysis (MEDA). MEDA is useful for identifying the structure in the data and for interpreting the contribution of each latent variable. In this paper, MEDA is augmented with dummy variables to find the data variables related to a given deviation detected among observations, for instance, the difference between one cluster of observations and the bulk of the data. The MEDA extension, referred to as observation‐based MEDA or oMEDA, can be performed in several ways, one of which is theoretically shown to be equivalent to a comparison of means between groups. The use of the proposed approach is demonstrated with a number of examples with simulated data and a real data set of archeological artifacts. Copyright © 2011 John Wiley & Sons, Ltd.
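The comparison of means mentioned in the abstract can be illustrated with a small sketch. This is not the oMEDA algorithm from the paper, which the abstract does not specify; it only shows how a dummy variable that codes group membership turns a per-variable difference of group means into a single vector product. The function name `dummy_mean_contrast`, the weighting of the dummy variable, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def dummy_mean_contrast(X, in_group):
    """Per-variable difference of means (group minus rest) via a dummy variable.

    Hypothetical helper for illustration only; not the paper's oMEDA procedure.
    X        : (n_observations, n_variables) data matrix
    in_group : boolean mask marking the observations of interest
    """
    X = np.asarray(X, dtype=float)
    in_group = np.asarray(in_group, dtype=bool)
    n_in = in_group.sum()
    n_out = (~in_group).sum()
    if n_in == 0 or n_out == 0:
        raise ValueError("both groups need at least one observation")
    # Dummy variable: +1/n_in for the group, -1/n_out for the remaining rows,
    # so that d @ X equals mean(group) - mean(rest) for every variable.
    d = np.where(in_group, 1.0 / n_in, -1.0 / n_out)
    return d @ X


# Simulated data (assumed): 30 observations, 4 variables,
# with a cluster of 8 observations shifted on variable index 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
in_group = np.zeros(30, dtype=bool)
in_group[:8] = True
X[in_group, 2] += 5.0

contrast = dummy_mean_contrast(X, in_group)
direct = X[in_group].mean(axis=0) - X[~in_group].mean(axis=0)
```

Here `contrast` and `direct` agree up to floating-point error, and the variable with the largest absolute contrast is the one on which the cluster was shifted. In a latent subspace setting one would presumably apply such a contrast to the data as reconstructed by the model rather than to the raw data, but that step is left out because the abstract gives no details.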