Z‐BAG: A CLASSIFICATION ENSEMBLE SYSTEM WITH POSTERIOR PROBABILISTIC OUTPUTS
Author(s) -
Xu Zhonghui,
Voichiţa Călin,
Drăghici Sorin,
Romero Roberto
Publication year - 2013
Publication title -
Computational Intelligence
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.353
H-Index - 52
eISSN - 1467-8640
pISSN - 0824-7935
DOI - 10.1111/j.1467-8640.2012.00432.x
Subject(s) - posterior probability, computer science, probabilistic logic, artificial intelligence, probabilistic classification, ensemble learning, statistic, support vector machine, pattern recognition (psychology), machine learning, random subspace method, classifier (uml), data mining, mathematics, statistics, bayesian probability, naive bayes classifier
Ensemble systems improve the generalization of single classifiers by aggregating the predictions of a set of base classifiers. Assessing classification reliability (posterior probability) is crucial in a number of applications, such as biomedical and diagnostic applications, where the cost of a misclassified input vector can be unacceptably high. Available methods are limited to either calibrating the posterior probability on an aggregated decision value or obtaining a posterior probability for each base classifier and aggregating the results. We propose a method that uses the distribution of the decision values from the base classifiers to compute a summary statistic, which is subsequently used to generate the posterior probability. Three approaches are considered to fit the probabilistic output to the statistic: the standard Gaussian CDF, isotonic regression, and linear logistic. Even though this study focuses on a bagged support vector machine ensemble (Z‐bag), our approach is not limited by the aggregation method selected, the choice of base classifiers, or the statistic used. Performance is assessed on one artificial and 12 real‐world data sets from the UCI Machine Learning Repository. Our approach achieves comparable or better generalization in accuracy and posterior estimation than existing ensemble calibration methods while lowering computational cost.
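As a rough illustration of the idea described in the abstract, the sketch below summarizes the decision values of the base classifiers for one input into a single statistic and maps it to a posterior probability with the standard Gaussian CDF. The abstract does not define the statistic, so the z-like form used here (mean of the decision values divided by its standard error) is an assumption, as are the function names `z_statistic` and `gaussian_cdf_posterior`. The isotonic regression and linear logistic fits are not shown.

```python
import math
import statistics


def z_statistic(decision_values):
    """Hypothetical summary statistic (an assumption, not taken from the paper):
    mean of the base classifiers' decision values over its standard error."""
    n = len(decision_values)
    mean = statistics.fmean(decision_values)
    sd = statistics.stdev(decision_values)  # sample standard deviation, needs n >= 2
    if sd == 0.0:
        # All base classifiers returned the same value: treat as full agreement.
        return math.copysign(math.inf, mean) if mean != 0.0 else 0.0
    return mean / (sd / math.sqrt(n))


def gaussian_cdf_posterior(z):
    """Map the statistic to a value in [0, 1] with the standard Gaussian CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Decision values for one input from four hypothetical bagged base classifiers.
values = [0.8, 1.1, 0.4, 0.9]
posterior_positive = gaussian_cdf_posterior(z_statistic(values))
```

In this sketch, decision values that are consistently positive across the ensemble give a posterior near 1, values that are evenly split around zero give a posterior near 0.5, and negating every decision value gives the complementary probability.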