Feature selection in feature network models: Finding predictive subsets of features with the Positive Lasso
Author(s) - Frank Laurence E., Heiser Willem J.
Publication year - 2008
Publication title - British Journal of Mathematical and Statistical Psychology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.157
H-Index - 51
eISSN - 2044-8317
pISSN - 0007-1102
DOI - 10.1348/000711006x119365
Subject(s) - feature selection , lasso (programming language) , univariate , feature (linguistics) , a priori and a posteriori , representation (politics) , mathematics , binary number , set (abstract data type) , measure (data warehouse) , variable (mathematics) , computer science , regression analysis , artificial intelligence , data mining , machine learning , mathematical analysis , linguistics , philosophy , arithmetic , epistemology , multivariate statistics , politics , world wide web , political science , law , programming language
A set of features is the basis for the network representation of proximity data achieved by feature network models (FNMs). Features are binary variables that characterize the objects in an experiment, with some measure of proximity as the response variable. Sometimes features are provided by theory and play an important role in the construction of the experimental conditions. In some research settings, however, the features are not known a priori. This paper shows how to generate features in this situation and how to select an adequate subset of features that strikes a good compromise between model fit and model complexity, using a new version of least angle regression that restricts coefficients to be non-negative, called the Positive Lasso. It is shown that features can be generated efficiently with Gray codes, which are naturally linked to the FNMs. The model selection strategy makes use of the fact that an FNM can be considered a univariate multiple regression model. A simulation study shows that the proposed strategy leads to satisfactory results if the number of objects is less than or equal to 22. If the number of objects is larger than 22, the number of features selected by the method exceeds the true number of features in some conditions.
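
The abstract does not spell out how the Gray-code generation works, so the following is only a minimal sketch of the general idea, assuming the standard binary reflected Gray code. Each candidate feature is treated as a binary membership vector over the objects, and consecutive candidates differ in exactly one object. The function name `gray_code_features` and the tuple representation are illustrative assumptions, not the authors' implementation.

```python
def gray_code_features(n_objects):
    """Yield all binary membership vectors of length n_objects in Gray-code order.

    Assumption: a candidate feature is represented as a tuple of 0/1 values,
    one per object (1 = the object possesses the feature).
    """
    for i in range(2 ** n_objects):
        g = i ^ (i >> 1)  # binary reflected Gray code of the integer i
        # Position k of the tuple holds bit k of g (least significant bit first).
        yield tuple((g >> k) & 1 for k in range(n_objects))


# Enumerate every candidate feature for three objects.
features = list(gray_code_features(3))
for f in features:
    print(f)
```

Because neighbouring vectors differ in a single position, any quantity derived from a candidate feature could in principle be updated incrementally rather than recomputed from scratch, which is one plausible reason such an ordering is efficient. Note that the number of candidates grows as 2^n, so full enumeration is only practical for a small number of objects.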
