Comparison of misclassification rates of search partition analysis and other classification methods
Author(s) - Marshall Roger J.
Publication year - 2005
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.2488
Subject(s) - statistics, partition (number theory), ranking (information retrieval), logistic regression, data set, computer science, word error rate, data mining, mathematics, artificial intelligence, combinatorics
Search partition analysis (SPAN) is a method to develop classification rules based on Boolean expressions. The performance of SPAN is compared against the trials reported by Lim et al. of 33 other methods of classification, including tree, neural network and regression methods on 16 data sets, most of which were health related. Each data set was augmented with noise variables in further trials. Lim et al. assessed the performance of the methods by estimates of misclassification rate, either cross‐validated or test sample based. In this paper, the same data sets are analysed by SPAN and misclassification rates of the SPAN classifiers are estimated. Comparison is made of the performance of SPAN against the other methods that were considered by Lim et al. In terms of average misclassification error rate, taken over all data sets, SPAN was among the best five methods. In terms of average ranking of misclassification, that is, for each data set ranking the misclassification rates from lowest to highest, SPAN was second only to polyclass logistic regression. Copyright © 2005 John Wiley & Sons, Ltd.
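The two summary criteria named in the abstract, average misclassification rate over all data sets and average rank of that rate within each data set, can give different orderings of the same methods. The following is a minimal sketch of how each might be computed. The method names and error rates are made up for illustration and are not taken from the paper or from Lim et al.; the use of mid-ranks for ties is also an assumption, since the abstract does not say how ties were handled.

```python
from statistics import mean

# Hypothetical misclassification rates (NOT from the paper):
# one list per method, one entry per data set.
error_rates = {
    "method_a": [0.10, 0.30, 0.20],
    "method_b": [0.12, 0.25, 0.22],
    "method_c": [0.20, 0.35, 0.15],
}


def average_error(rates):
    """Mean misclassification rate of each method over all data sets."""
    return {method: mean(values) for method, values in rates.items()}


def average_rank(rates):
    """Mean rank of each method, ranking within each data set.

    Rank 1 is the lowest error rate. Tied methods share the mid-rank
    (an assumption made for this sketch).
    """
    methods = list(rates)
    n_sets = len(next(iter(rates.values())))
    ranks = {method: [] for method in methods}
    for j in range(n_sets):
        column = {method: rates[method][j] for method in methods}
        for method in methods:
            lower = sum(1 for v in column.values() if v < column[method])
            ties = sum(1 for v in column.values() if v == column[method])
            ranks[method].append(lower + (ties + 1) / 2)
    return {method: mean(r) for method, r in ranks.items()}


avg_err = average_error(error_rates)
avg_rank = average_rank(error_rates)
best_by_error = min(avg_err, key=avg_err.get)
best_by_rank = min(avg_rank, key=avg_rank.get)
```

With these illustrative numbers, one method has the lowest average error while a different method has the best average rank, which is why a comparison of this kind may report both criteria.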