Effects of Pooling Samples on the Performance of Classification Algorithms: A Comparative Study



Open Access


Author(s) - Kanthida Kusonmano, Michael Netzer, Christian Baumgärtner, Matthias Dehmer, Klaus R. Liedl, Armin Graber

Publication year - 2012

Publication title - The Scientific World Journal

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.453

H-Index - 93

eISSN - 2356-6140

pISSN - 1537-744X

DOI - 10.1100/2012/278352

Subject(s) - pooling, feature selection, computer science, random forest, support vector machine, machine learning, logistic regression, data mining, classifier (uml), artificial intelligence, pattern recognition (psychology)

A pooling design can be a powerful strategy to compensate for a limited number of samples or for high biological variation. In this paper, we perform a comparative study to model and quantify the effects of virtual pooling on the performance of widely applied classifiers: support vector machines (SVMs), random forest (RF), k-nearest neighbors (k-NN), penalized logistic regression (PLR), and prediction analysis for microarrays (PAMs). We evaluate a variety of experimental designs using mock omics datasets with varying pool sizes, and we also consider the effects of feature selection. Our results show that feature selection significantly improves classifier performance for both non-pooled and pooled data. All investigated classifiers yield lower misclassification rates with smaller pool sizes. RF mostly outperforms the other investigated algorithms, while accuracy levels are comparable among the remaining ones. Guidelines are derived to identify an optimal pooling scheme that provides adequate predictive power and, hence, to motivate a study design that best meets experimental objectives and budgetary conditions, including time constraints.
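The abstract does not spell out how the virtual pools are formed, so the following is only a minimal sketch of one plausible reading: samples from a single class are split at random into disjoint groups of a given pool size, and each group is averaged feature by feature. The function name `virtual_pool`, the averaging rule, the dropping of leftover samples, and the toy data are all assumptions made for illustration, not the authors' procedure.

```python
import random
from statistics import fmean


def virtual_pool(samples, pool_size, seed=0):
    """Average disjoint random groups of `pool_size` samples, feature by feature.

    `samples` is a list of equal-length feature vectors from ONE class.
    Samples left over after the last complete pool are dropped (an assumption).
    """
    if pool_size < 1:
        raise ValueError("pool_size must be >= 1")
    rng = random.Random(seed)
    order = list(range(len(samples)))
    rng.shuffle(order)  # random assignment of samples to pools
    pools = []
    for start in range(0, len(order) - pool_size + 1, pool_size):
        group = [samples[i] for i in order[start:start + pool_size]]
        # zip(*group) iterates over features; fmean averages across the pool
        pools.append([fmean(col) for col in zip(*group)])
    return pools


# Hypothetical toy data: 6 samples with 2 features each, all from one class.
controls = [[1.0, 10.0], [3.0, 14.0], [2.0, 12.0],
            [4.0, 16.0], [5.0, 18.0], [3.0, 14.0]]
pooled = virtual_pool(controls, pool_size=3)
print(len(pooled), "pooled samples from", len(controls), "individual samples")
```

Under this sketch, a larger pool size leaves fewer training examples per class, which is one simple way to see why the choice of pool size can affect classifier performance.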
