Distance analysis of large data sets of categorical variables using object weights
Author(s) -
Groenen Patrick J. F.,
Commandeur Jacques J. F.,
Meulman Jacqueline J.
Publication year - 1998
Publication title -
British Journal of Mathematical and Statistical Psychology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.157
H-Index - 51
eISSN - 2044-8317
pISSN - 0007-1102
DOI - 10.1111/j.2044-8317.1998.tb00678.x
Subject(s) - categorical variable, mathematics, homogeneity (statistics), resampling, dimension (graph theory), object (grammar), metric (unit), stability (learning theory), contingency table, representation (politics), algorithm, computer science, statistics, artificial intelligence, combinatorics, politics, political science, law, operations management, machine learning, economics
Categorical variables are often analysed by multiple correspondence analysis (or homogeneity analysis), which places great emphasis on graphical representation. A drawback of this method is that sometimes only minor aspects of the data are displayed, or, if a dominant first dimension exists, the horseshoe effect occurs. Here, we elaborate on a competing approach to multiple correspondence analysis based on distance approximation. This method emphasizes the distance between objects; they are graphically displayed as points, and objects close together are considered more similar than objects farther apart. A limiting factor of this method is that the number of objects cannot be very large (say, no more than 500). We show how the majorization algorithm for distance approximation can be extended using frequency counts as object weights such that much larger data sets can be analysed without a significant amount of additional computational effort. A second advantage of the use of object weights is that resampling methods, such as the bootstrap, are easily implemented. We present two illustrative examples, and investigate the stability in one of them through the bootstrap.
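
The role of frequency counts as object weights can be illustrated with a minimal sketch. It is not the authors' majorization algorithm; it only shows the bookkeeping the abstract describes. The sketch rests on several assumptions that are not taken from the abstract: identical response profiles are collapsed into one object, the target dissimilarity is the number of variables on which two profiles differ, each pair of unique profiles is weighted by the product of their counts, and a bootstrap replicate is drawn as a multinomial sample of new counts. All function names are hypothetical.

```python
import numpy as np


def collapse_profiles(data):
    """Collapse identical rows into unique profiles with frequency counts."""
    profiles, counts = np.unique(np.asarray(data), axis=0, return_counts=True)
    return profiles, counts


def mismatch_dissimilarity(profiles):
    """Assumed target dissimilarity: number of variables on which two profiles differ."""
    return (profiles[:, None, :] != profiles[None, :, :]).sum(axis=2).astype(float)


def weighted_stress(target, coords, weights):
    """Raw stress over pairs i < j, with assumed pair weight weights[i] * weights[j]."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    pair_w = np.outer(weights, weights)
    iu = np.triu_indices(len(weights), k=1)
    return float((pair_w[iu] * (target[iu] - dist[iu]) ** 2).sum())


def bootstrap_weights(counts, rng):
    """One bootstrap replicate expressed as new frequency counts (multinomial draw)."""
    n = int(counts.sum())
    return rng.multinomial(n, counts / n)


# Six respondents, two categorical variables, only three distinct profiles.
data = np.array([[0, 1], [0, 1], [1, 0], [1, 1], [1, 0], [0, 1]])
profiles, counts = collapse_profiles(data)

rng = np.random.default_rng(0)
coords = rng.normal(size=(len(profiles), 2))  # arbitrary 2-D configuration
stress = weighted_stress(mismatch_dissimilarity(profiles), coords, counts)
replicate = bootstrap_weights(counts, rng)
```

Under these assumptions, the loss computed on the three weighted profiles equals the loss computed on all six respondents when identical respondents share one point, so the cost depends on the number of distinct profiles and not on the number of respondents. A bootstrap replicate only changes the weights, so the dissimilarities do not need to be recomputed.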
