Measurement of data complexity for classification problems with unbalanced data
Author(s) - Anwar Nafees, Jones Geoff, Ganesh Siva
Publication year - 2014
Publication title - Statistical Analysis and Data Mining: The ASA Data Science Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.381
H-Index - 33
eISSN - 1932-1872
pISSN - 1932-1864
DOI - 10.1002/sam.11228
Subject(s) - computer science, measure (data warehouse), data mining, classifier (uml), visualization, metric (unit), data visualization, pattern recognition (psychology), artificial intelligence, machine learning, operations management, economics
We introduce a complexity measure for classification problems that accounts for the deterioration in classifier performance caused by class imbalance. The measure is based on k-nearest neighbors. We explore the choice of k and of the distance metric through a simulation study, and illustrate the use of our measure, together with related data visualization techniques, on real datasets from the literature.
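
The abstract does not give the definition of the measure, so the following is only a rough sketch of the general idea of a k-nearest-neighbor complexity score that is not dominated by the majority class. It is not the authors' measure. The function names `knn_class_complexity` and `balanced_complexity`, the Euclidean distance, the majority-vote rule, and the default `k=3` are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def knn_class_complexity(points, labels, k=3):
    """Per-class fraction of points whose k nearest neighbours
    (Euclidean, excluding the point itself) mostly belong to other classes.
    Illustrative assumption only; not the measure defined in the paper."""
    misfit = defaultdict(int)
    counts = defaultdict(int)
    for i, (p, y) in enumerate(zip(points, labels)):
        # Distance from point i to every other point, nearest first.
        dists = sorted(
            (math.dist(p, q), labels[j])
            for j, q in enumerate(points) if j != i
        )
        neighbours = [lab for _, lab in dists[:k]]
        other = sum(1 for lab in neighbours if lab != y)
        counts[y] += 1
        if other > k / 2:
            misfit[y] += 1
    return {c: misfit[c] / counts[c] for c in counts}

def balanced_complexity(points, labels, k=3):
    """Unweighted mean of the per-class scores, so a small class
    counts as much as a large one (an assumed way to handle imbalance)."""
    per_class = knn_class_complexity(points, labels, k)
    return sum(per_class.values()) / len(per_class)

# Six majority points with a single minority point sitting among them.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 2), (2, 0), (0.5, 0.5)]
labels = [0, 0, 0, 0, 0, 0, 1]
per_class = knn_class_complexity(points, labels, k=3)
overall = balanced_complexity(points, labels, k=3)
```

In this toy dataset only one of seven points is surrounded by the other class, so a pooled error rate would look small, while the per-class view shows that the minority class is entirely overlapped. Averaging over classes rather than over points is one simple way to keep that visible.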
