Effect of separate sampling on classification accuracy
Author(s) - Mohammad Shahrokh Esfahani, Edward R. Dougherty
Publication year - 2013
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btt662
Subject(s) - minimax, classifier (uml), computer science, matlab, sample size determination, naive bayes classifier, bayes' theorem, sampling (signal processing), population, bayes classifier, statistics, bayes error rate, artificial intelligence, data mining, pattern recognition (psychology), mathematics, bayesian probability, mathematical optimization, support vector machine, sociology, demography, filter (signal processing), computer vision, operating system
Measurements are commonly taken from two phenotypes to build a classifier, where the number of data points from each class is predetermined, not random. In this 'separate sampling' scenario, the data cannot be used to estimate the class prior probabilities. Moreover, predetermined class sizes can severely degrade classifier performance, even for large samples.
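The effect described above can be illustrated with a small simulation. The following is a minimal sketch under assumptions that are not taken from the paper: a one-dimensional model with class 0 drawn from N(0, 1) and class 1 from N(mu1, 1), a known unit variance, and a plug-in threshold rule. The names `simulate`, `plug_in_threshold`, `true_error`, and all parameter values are hypothetical and chosen only for illustration. It compares a classifier that treats the predetermined sampling ratio n0/n as the class-0 prior with one that is given the true prior c0 from outside the data.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def true_error(threshold, c0, mu1):
    """True error of the rule 'label 1 if x > threshold' under the assumed
    mixture: class 0 ~ N(0, 1) with prior c0, class 1 ~ N(mu1, 1)."""
    err0 = 1.0 - normal_cdf(threshold)   # class 0 points labelled 1
    err1 = normal_cdf(threshold - mu1)   # class 1 points labelled 0
    return c0 * err0 + (1.0 - c0) * err1


def plug_in_threshold(x0, x1, prior0):
    """Decision threshold from the sample means and a supplied class-0 prior.
    Assumes unit variance and that the class-1 sample mean exceeds the
    class-0 sample mean."""
    m0 = sum(x0) / len(x0)
    m1 = sum(x1) / len(x1)
    return 0.5 * (m0 + m1) + math.log(prior0 / (1.0 - prior0)) / (m1 - m0)


def simulate(n, r, c0, mu1, rng):
    """Separate sampling: n0 = round(r * n) is fixed in advance, so n0 / n
    carries no information about the true prior c0.
    Returns (error using n0/n as the prior, error using the true prior)."""
    n0 = round(r * n)
    n1 = n - n0
    x0 = [rng.gauss(0.0, 1.0) for _ in range(n0)]
    x1 = [rng.gauss(mu1, 1.0) for _ in range(n1)]
    t_ratio = plug_in_threshold(x0, x1, n0 / n)  # sampling ratio used as prior
    t_prior = plug_in_threshold(x0, x1, c0)      # prior supplied externally
    return true_error(t_ratio, c0, mu1), true_error(t_prior, c0, mu1)


if __name__ == "__main__":
    rng = random.Random(0)
    for r in (0.1, 0.5, 0.9):
        e_ratio, e_prior = simulate(2000, r, 0.9, 1.5, rng)
        print(f"r={r:.1f}  error with n0/n as prior={e_ratio:.4f}  "
              f"error with true prior={e_prior:.4f}")
```

In this sketch the two error values coincide only when the sampling ratio r happens to equal the true prior c0. Increasing n makes the class means more accurate but does not close the gap caused by a mismatched ratio, which is consistent with the statement that the degradation can persist for large samples.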