Effect of data discretization on the classification accuracy in a high‐dimensional framework


Author(s) - Tillander Annika

Publication year - 2012

Publication title - International Journal of Intelligent Systems

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.291

H-Index - 87

eISSN - 1098-111X

pISSN - 0884-8173

DOI - 10.1002/int.21527

Subject(s) - discretization, computer science, data mining, artificial intelligence, pattern recognition (psychology), algorithm, mathematics, mathematical analysis

We investigate discretization of continuous variables for classification problems in a high‐dimensional framework. As the goal of classification is to correctly predict the class membership of an observation, we suggest a discretization method that optimizes the discretization procedure using the misclassification probability as a measure of the classification accuracy. Our method is compared to several other discretization methods as well as to results for continuous data. To compare performance we consider three supervised classification methods, and to capture the effect of high dimensionality we investigate a number of feature variables for a fixed number of observations. Since discretization is a data transformation procedure, we also investigate how it affects the dependence structure. Our method performs well, and lower misclassification can be obtained in a high‐dimensional framework for both simulated and real data if the continuous feature variables are first discretized. The dependence structure is well maintained for some discretization methods. © 2012 Wiley Periodicals, Inc.
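
The abstract does not spell out the optimization procedure, so the following is only a minimal sketch of the general idea of choosing a discretization by its misclassification rate, not the paper's algorithm. It assumes a single continuous feature, a two-bin split, and the empirical error of a majority-class rule as a stand-in for the misclassification probability; the names `bin_error` and `best_cut_point` are hypothetical.

```python
from collections import Counter


def bin_error(labels):
    """Count observations misclassified when a bin predicts its majority class."""
    if not labels:
        return 0
    return len(labels) - Counter(labels).most_common(1)[0][1]


def best_cut_point(values, labels):
    """Return (cut, error_rate) for the two-bin split of one feature that
    minimizes the empirical misclassification rate (illustrative criterion)."""
    pairs = sorted(zip(values, labels))
    distinct = sorted(set(values))
    # Candidate cuts: midpoints between consecutive distinct feature values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    # Baseline: no cut at all, every observation gets the overall majority class.
    best = (None, bin_error(list(labels)) / len(pairs))
    for cut in candidates:
        left = [y for x, y in pairs if x <= cut]
        right = [y for x, y in pairs if x > cut]
        rate = (bin_error(left) + bin_error(right)) / len(pairs)
        if rate < best[1]:
            best = (cut, rate)
    return best


# Toy data (made up for illustration): class 0 at low values, class 1 at high values.
values = [0.1, 0.4, 0.5, 1.2, 1.5, 1.9]
labels = [0, 0, 0, 1, 1, 1]
cut, rate = best_cut_point(values, labels)
print(cut, rate)
```

On this toy data the selected cut falls between 0.5 and 1.2, where the two classes separate. In a high-dimensional setting the same search would be repeated per feature, and the error would normally be estimated on held-out data rather than on the training sample.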
