Principal axes analysis of symbolic histogram variables
Author(s) -
Makosso-Kallyth Sun
Publication year - 2016
Publication title -
Statistical Analysis and Data Mining: The ASA Data Science Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.381
H-Index - 33
eISSN - 1932-1872
pISSN - 1932-1864
DOI - 10.1002/sam.11270
Subject(s) - histogram , mathematics , principal component analysis , estimator , symbolic data analysis , quantile , pattern recognition (psychology) , histogram matching , statistics , artificial intelligence , computer science , image (mathematics)
We present a new method for principal axes analysis of symbolic histogram variables. Within the symbolic data analysis framework, several Histogram Principal Component Analysis (Histogram PCA) methods have been proposed. Some focus on the relationships between specific features of histograms, such as the means or the quantiles. Others use an association measure for distributional variables based on the squared Wasserstein distance. In this paper, we propose two new approaches. The first uses new correlation measures based on Fisher's z scores between corresponding bins of histogram variables; we also suggest using the estimator proposed by Olkin and Pratt. This first approach requires the histogram variables to have the same number of bins. The second approach, by contrast, extends these correlations by considering corresponding quantiles, and can therefore be used when histograms do not have the same number of bins.
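The abstract does not give formulas, so the following is only a rough sketch of the ingredients it names, not the paper's actual procedure. It assumes each histogram variable is stored as an array of bin relative frequencies (one row per unit, one column per bin), that per-bin Pearson correlations are pooled by averaging their Fisher z transforms, and that the Olkin and Pratt correction is applied in its common approximate form r[1 + (1 - r^2) / (2(n - 3))]. The quantile helper assumes mass is spread uniformly within each bin. All function names are hypothetical.

```python
import numpy as np

def fisher_z_pooled_correlation(X, Y):
    """Pool per-bin Pearson correlations through Fisher's z transform.

    X, Y: arrays of shape (n_units, n_bins), same shape (assumed layout).
    Returns tanh(mean(arctanh(r_k))) over the bins k.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape != Y.shape:
        raise ValueError("both histogram variables need the same number of bins")
    # Pearson correlation between corresponding bins across units.
    rs = np.array([np.corrcoef(X[:, k], Y[:, k])[0, 1] for k in range(X.shape[1])])
    # Keep |r| < 1 so that arctanh stays finite.
    rs = np.clip(rs, -0.999999, 0.999999)
    return float(np.tanh(np.mean(np.arctanh(rs))))

def olkin_pratt_approx(r, n):
    """Approximate Olkin-Pratt bias correction of a sample correlation r
    computed from n observations (requires n > 3)."""
    return r * (1.0 + (1.0 - r ** 2) / (2.0 * (n - 3)))

def histogram_quantiles(edges, freqs, probs):
    """Quantiles of one histogram, assuming uniform mass within each bin.

    edges: bin edges (length n_bins + 1); freqs: relative frequencies
    summing to 1; probs: probability levels in [0, 1].
    """
    cdf = np.concatenate([[0.0], np.cumsum(freqs)])
    # Invert the piecewise-linear CDF by linear interpolation.
    return np.interp(probs, cdf, edges)
```

Under these assumptions, histograms with different numbers of bins could first be reduced to a common set of quantile levels with `histogram_quantiles`, after which the pooled correlation can be computed on the quantile columns instead of the bin columns.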