Differential privacy-based evaporative cooling feature selection and classification with relief-F and random forests
Author(s) - Trang T. Le, W. Kyle Simmons, Masaya Misaki, Jerzy Bodurka, Bill C. White, Jonathan Savitz, Brett A. McKinney
Publication year - 2017
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btx298
Subject(s) - overfitting, random forest, differential privacy, feature selection, computer science, feature (linguistics), machine learning, artificial intelligence, data mining, evaporative cooler, engineering, linguistics, philosophy, artificial neural network, mechanical engineering
Classification of individuals into disease or clinical categories from high-dimensional biological data with low prediction error is an important challenge of statistical learning in bioinformatics. Feature selection can improve classification accuracy but must be incorporated carefully into cross-validation to avoid overfitting. Recently, feature selection methods based on differential privacy, such as differentially private random forests and reusable holdout sets, have been proposed. However, for domains such as bioinformatics, where the number of features is much larger than the number of observations (p ≫ n), these differential privacy methods are susceptible to overfitting.
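The point about incorporating feature selection into cross-validation can be illustrated with a minimal sketch. It is not the method of this paper: it assumes a simple class-mean-difference filter in place of Relief-F, a nearest-centroid classifier in place of random forests, and pure-noise synthetic data with p ≫ n. The names `select_top_k` and `cv_accuracy` are hypothetical helpers introduced only for this illustration.

```python
import random

def class_means(X, y, rows, j):
    """Mean of feature j within class 0 and class 1, over the given rows."""
    sums, counts = [0.0, 0.0], [0, 0]
    for i in rows:
        sums[y[i]] += X[i][j]
        counts[y[i]] += 1
    return sums[0] / counts[0], sums[1] / counts[1]

def select_top_k(X, y, rows, k):
    """Assumed stand-in filter: rank features by absolute class-mean difference."""
    def score(j):
        m0, m1 = class_means(X, y, rows, j)
        return abs(m0 - m1)
    return sorted(range(len(X[0])), key=score, reverse=True)[:k]

def cv_accuracy(X, y, k, folds, select_inside):
    """Cross-validated accuracy of a nearest-centroid classifier.

    select_inside=True  -> features are chosen from each training fold only.
    select_inside=False -> features are chosen once from all samples (leaky).
    """
    all_rows = list(range(len(X)))
    leaked = None if select_inside else select_top_k(X, y, all_rows, k)
    correct = 0
    for f in range(folds):
        test = [i for i in all_rows if i % folds == f]
        train = [i for i in all_rows if i % folds != f]
        feats = select_top_k(X, y, train, k) if select_inside else leaked
        cents = [class_means(X, y, train, j) for j in feats]
        for i in test:
            d0 = sum((X[i][j] - c[0]) ** 2 for j, c in zip(feats, cents))
            d1 = sum((X[i][j] - c[1]) ** 2 for j, c in zip(feats, cents))
            pred = 0 if d0 <= d1 else 1
            correct += int(pred == y[i])
    return correct / len(X)

# Synthetic noise: 40 samples, 2000 features, labels unrelated to the features.
rng = random.Random(0)
n, p = 40, 2000
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [i % 2 for i in range(n)]  # alternating labels keep every fold balanced

leaky = cv_accuracy(X, y, k=10, folds=5, select_inside=False)
honest = cv_accuracy(X, y, k=10, folds=5, select_inside=True)
print("selection outside CV:", leaky)
print("selection inside CV: ", honest)
```

Because the labels here carry no signal, any accuracy well above chance in the first estimate comes from the held-out samples having influenced which features were chosen. Selecting within each training fold removes that leak.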