A simple model‐based approach to variable selection in classification and clustering
Author(s) - Partovi Nia, Vahid; Davison, Anthony C.
Publication year - 2015
Publication title - Canadian Journal of Statistics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.804
H-Index - 51
eISSN - 1708-945X
pISSN - 0319-5724
DOI - 10.1002/cjs.11241
Subject(s) - cluster analysis , computer science , data mining , simple (philosophy) , feature selection , selection (genetic algorithm) , variable (mathematics) , machine learning , artificial intelligence , mathematics , mathematical analysis , philosophy , epistemology
Clustering and classification of replicated data is often performed using classical techniques that inappropriately treat the data as unreplicated, or using complex modern ones that are computationally demanding. In this paper, we introduce a simple approach based on a “spike‐and‐slab” mixture model that is fast, automatic, allows classification, clustering and variable selection in a single framework, and can handle replicated or unreplicated data. Simulation shows that our approach compares well with other recently proposed methods. The ideas are illustrated by application to microarray and metabolomic data.
The Canadian Journal of Statistics 43: 157–175; 2015 © 2015 Statistical Society of Canada
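To give a feel for how a spike-and-slab mixture can drive variable selection, the following is a minimal generic sketch, not the model from the paper. It assumes a single variable whose effect is either exactly zero (the spike) or drawn from a zero-mean Gaussian (the slab), with known Gaussian noise. The function names `normal_pdf` and `inclusion_probability`, and the parameters `p_slab`, `noise_var` and `slab_var`, are hypothetical and chosen only for illustration. A single measurement (unreplicated data) is handled as a list of length one.

```python
import math


def normal_pdf(x, var):
    """Density of a zero-mean Gaussian with variance `var`, evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def inclusion_probability(replicates, p_slab=0.5, noise_var=1.0, slab_var=4.0):
    """Posterior probability that a variable's effect is nonzero.

    Illustrative assumptions (not taken from the paper):
      effect = 0                    with probability 1 - p_slab  (spike)
      effect ~ N(0, slab_var)       with probability p_slab      (slab)
      each replicate ~ N(effect, noise_var), independently
    """
    n = len(replicates)
    mean = sum(replicates) / n  # sufficient statistic for the effect

    # Marginal variance of the replicate mean under each component.
    var_spike = noise_var / n
    var_slab = slab_var + noise_var / n

    slab = p_slab * normal_pdf(mean, var_slab)
    spike = (1.0 - p_slab) * normal_pdf(mean, var_spike)
    return slab / (slab + spike)


if __name__ == "__main__":
    # Hypothetical measurements, made up for illustration only.
    print(inclusion_probability([0.0]))            # unreplicated, no signal
    print(inclusion_probability([3.0, 3.2, 2.9]))  # replicated, clear signal
```

Under these assumptions, a variable would be retained when its inclusion probability is high. Averaging replicates shrinks the noise variance of the mean, so the same average signal yields stronger evidence when it is backed by more replicates.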