Classification trees as an alternative to linear discriminant analysis
Author(s) - Feldesman Marc R.
Publication year - 2002
Publication title - American Journal of Physical Anthropology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.146
H-Index - 119
eISSN - 1096-8644
pISSN - 0002-9483
DOI - 10.1002/ajpa.10102
Subject(s) - linear discriminant analysis, missing data, optimal discriminant analysis, nonparametric statistics, data set, homogeneity (statistics), statistics, computer science, decision tree, artificial intelligence, mathematics, data mining, machine learning
Linear discriminant analysis (LDA) is frequently used for classification/prediction problems in physical anthropology, but it is unusual to find examples where researchers consider the statistical limitations and assumptions required for this technique. In these instances, it is difficult to know whether the predictions are reliable. This paper considers a nonparametric alternative to predictive LDA: binary, recursive (or classification) trees. This approach has the advantage that data transformation is unnecessary, cases with missing predictor variables do not require special treatment, prediction success does not depend on the data meeting normality conditions or covariance homogeneity, and variable selection is intrinsic to the methodology. Here I compare the efficacy of classification trees with LDA, using typical morphometric data. With data from modern hominoids, the results show that both techniques perform nearly equally. With complete data sets, LDA may be a better choice, as is shown in this example. With missing observations, however, classification trees perform outstandingly well, whereas commercial discriminant analysis programs do not predict classifications for cases with incompletely measured predictor variables and generally are not designed to address the problem of missing data. Testing of data prior to analysis is necessary, and classification trees are recommended either as a replacement for LDA or as a supplement whenever data do not meet relevant assumptions. They are highly recommended as an alternative to LDA whenever the data set contains important cases with missing predictor variables. Am J Phys Anthropol 119:257–275, 2002. © 2002 Wiley‐Liss, Inc.
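
To make the idea of a binary, recursive classification tree concrete, the following is a minimal sketch in plain Python. It is not the software or procedure used in the paper. The measurements and class labels are invented for illustration, and the handling of missing values is a simplifying assumption: cases missing a variable are skipped when that variable's splits are scored, and a case that lacks the split variable at prediction time receives the majority class of the node it has reached. Full CART implementations typically use surrogate splits instead.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (score, feature, threshold) with the lowest weighted Gini impurity.
    Cases missing a feature (None) are skipped when scoring that feature."""
    best = None
    for j in range(len(rows[0])):
        known = [(r[j], y) for r, y in zip(rows, labels) if r[j] is not None]
        values = sorted({v for v, _ in known})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [y for v, y in known if v <= t]
            right = [y for v, y in known if v > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(known)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best

def build(rows, labels, depth=0, max_depth=3):
    """Recursively grow a binary tree; each node stores its majority class."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return {"leaf": majority}
    split = best_split(rows, labels)
    if split is None:  # no feature offers two distinct known values
        return {"leaf": majority}
    _, j, t = split
    # Simplification: training cases missing feature j are not passed down.
    left = [(r, y) for r, y in zip(rows, labels) if r[j] is not None and r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] is not None and r[j] > t]
    return {
        "feature": j, "threshold": t, "majority": majority,
        "left": build([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict(node, row):
    """Classify one case; a missing split variable stops the descent
    and returns the current node's majority class."""
    while "leaf" not in node:
        v = row[node["feature"]]
        if v is None:
            return node["majority"]
        node = node["left"] if v <= node["threshold"] else node["right"]
    return node["leaf"]

# Hypothetical two-variable measurements; None marks a missing predictor.
rows = [(10.0, 5.0), (11.0, None), (12.0, 6.0), (13.0, 7.0),
        (20.0, 15.0), (21.0, 14.0), (None, 16.0)]
labels = ["A", "A", "A", "A", "B", "B", "B"]

tree = build(rows, labels)
print(predict(tree, (19.0, None)))  # second variable missing, still classified: B
```

In this toy example the tree splits on the first variable only, so a case missing the second variable is classified exactly as a complete case would be. A standard LDA function, by contrast, needs every predictor to compute a discriminant score, which is the contrast the abstract draws.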
