Open Access
Flow Cytometry-Based Classification in Cancer Research: A View on Feature Selection
Author(s) - Sakira Hassan, Pekka Ruusuvuori, Leena Latonen, Heikki Huttunen
Publication year - 2015
Publication title - Cancer Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.606
H-Index - 31
ISSN - 1176-9351
DOI - 10.4137/cin.s30795
Subject(s) - feature selection, computer science, artificial intelligence, machine learning, regularization (linguistics), elastic net regularization, feature (linguistics), selection (genetic algorithm), lasso (programming language), stability (learning theory), data mining, estimator, logistic regression, pattern recognition (psychology), mathematics, statistics, philosophy, linguistics, world wide web
In this paper, we study the problem of feature selection in cancer-related machine learning tasks. In particular, we study the accuracy and stability of different feature selection approaches within simplistic machine learning pipelines. Earlier studies have shown that in certain cases the detection accuracy can easily reach 100% given enough training data. Here, however, we concentrate on simplifying the classification models and seek feature selection approaches that remain reliable even with extremely small sample sizes. We show that as many as 50% of the features can be discarded without compromising the prediction accuracy. Moreover, we study the model selection problem along the ℓ1 regularization path of logistic regression classifiers. To this end, we compare a more traditional cross-validation approach with a recently proposed Bayesian error estimator.
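To make the idea of model selection along an ℓ1 regularization path concrete, the following is a minimal sketch and not the authors' pipeline. It assumes synthetic Gaussian data in place of flow cytometry measurements, fits ℓ1-regularized logistic regression by proximal gradient descent, and picks the regularization strength by plain k-fold cross-validation. All function names, the data generator, and the grid of lambda values are hypothetical choices made for illustration; the Bayesian error estimator mentioned in the abstract is not implemented here.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of t * |v|: shrinks v toward zero, exactly zero if |v| <= t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def predict_proba(x, w, b):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_l1_logreg(X, y, lam, steps=200, lr=0.1):
    # Proximal gradient descent on mean log-loss + lam * ||w||_1 (bias unpenalized).
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = (predict_proba(xi, w, b) - yi) / n
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def accuracy(X, y, w, b):
    hits = sum((predict_proba(xi, w, b) >= 0.5) == (yi == 1) for xi, yi in zip(X, y))
    return hits / len(y)


def cv_accuracy(X, y, lam, folds=5):
    # Plain k-fold cross-validation: sample i is held out in fold i % folds.
    scores = []
    for k in range(folds):
        train = [i for i in range(len(y)) if i % folds != k]
        test = [i for i in range(len(y)) if i % folds == k]
        w, b = fit_l1_logreg([X[i] for i in train], [y[i] for i in train], lam)
        scores.append(accuracy([X[i] for i in test], [y[i] for i in test], w, b))
    return sum(scores) / folds


def make_data(n, d, informative, rng):
    # Synthetic stand-in data: only the first `informative` features carry class signal.
    X, y = [], []
    for i in range(n):
        label = i % 2
        shift = 1.5 if label else -1.5
        X.append([rng.gauss(shift if j < informative else 0.0, 1.0) for j in range(d)])
        y.append(label)
    return X, y


rng = random.Random(0)
X, y = make_data(n=40, d=8, informative=2, rng=rng)
lambdas = [2.0, 0.3, 0.1, 0.03, 0.01]  # regularization path, strongest first
results = []
for lam in lambdas:
    w, _ = fit_l1_logreg(X, y, lam)
    kept = sum(1 for wj in w if wj != 0.0)
    results.append((lam, kept, cv_accuracy(X, y, lam)))
    print(f"lambda={lam:<5} features kept={kept} cv accuracy={results[-1][2]:.2f}")

best_lam = max(results, key=lambda r: r[2])[0]
print("lambda selected by cross-validation:", best_lam)
```

Each value of lambda along the path yields a different sparsity pattern, so choosing lambda is simultaneously choosing a feature subset. With very small samples the cross-validation estimate itself becomes noisy, which is the setting in which the abstract compares it against a Bayesian error estimator.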
