Full model selection in the space of data mining operators
Author(s) - Quan Sun, Bernhard Pfahringer, Michael Mayo
Publication year - 2012
Publication title - Research Commons (The University of Waikato)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/2330784.2331014
Subject(s) - particle swarm optimization , computer science , data mining , directed acyclic graph , selection (genetic algorithm) , feature selection , genetic algorithm , graph , set (abstract data type) , transformation (genetics) , mathematical optimization , algorithm , artificial intelligence , mathematics , machine learning , theoretical computer science , programming language , biochemistry , chemistry , gene
We propose a framework and a novel algorithm for the full model selection (FMS) problem. The proposed algorithm combines genetic algorithms (GA) and particle swarm optimization (PSO) and is named GPS (which stands for GAPSO-FMS): a GA searches for the optimal structure of a data mining solution, and PSO searches for the optimal parameter set for a particular structure instance. Given a classification or regression problem, GPS outputs an FMS solution as a directed acyclic graph consisting of diverse data mining operators that are applicable to the problem, including data cleansing, data sampling, feature transformation/selection and algorithm operators. The solution can also be represented graphically in a human-readable form. Experimental results demonstrate the benefit of the algorithm.
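The division of labour described in the abstract (a GA over structures, PSO over the parameters of each structure) can be illustrated with a minimal sketch. This is not the authors' GPS implementation. It assumes that PSO runs inside the GA's fitness evaluation, which the abstract does not state, and it simplifies the directed acyclic graph to a bitmask over a hypothetical operator pool. The names `OPERATORS`, `toy_score`, `pso` and `ga`, the toy objective, and all constants are illustrative assumptions.

```python
import random

# Hypothetical operator pool; the abstract does not list the actual operators.
OPERATORS = ["clean", "sample", "transform", "select"]

def toy_score(structure, params):
    """Stand-in for validation performance (higher is better).
    Assumption: a real system would train and evaluate the pipeline here."""
    # Reward 'clean' and 'select', penalise pipeline size, prefer params near 0.3.
    bonus = structure[0] + structure[3] - 0.25 * sum(structure)
    penalty = sum((p - 0.3) ** 2 for p in params)
    return bonus - penalty

def pso(structure, rng, n_particles=8, n_iters=20, dim=2):
    """Inner search: PSO over parameters in [0, 1] for one fixed structure."""
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_score = [toy_score(structure, p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            s = toy_score(structure, pos[i])
            if s > pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], s
                if s > gbest_score:
                    gbest, gbest_score = pos[i][:], s
    return gbest, gbest_score

def ga(rng, pop_size=6, n_gens=10):
    """Outer search: GA over structures; fitness is the best PSO score found."""
    pop = [[rng.randint(0, 1) for _ in OPERATORS] for _ in range(pop_size)]
    best = None
    for _ in range(n_gens):
        scored = []
        for s in pop:
            params, score = pso(s, rng)
            scored.append((score, s, params))
        scored.sort(key=lambda t: t[0], reverse=True)
        if best is None or scored[0][0] > best[0]:
            best = scored[0]
        # Truncation selection: keep the top half as parents.
        parents = [s for _, s, _ in scored[: pop_size // 2]]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, len(OPERATORS) - 1)
            child = a[:cut] + b[cut:]          # one-point crossover
            if rng.random() < 0.2:             # occasional bit-flip mutation
                j = rng.randrange(len(child))
                child[j] = 1 - child[j]
            children.append(child)
        pop = parents + children
    return best  # (score, structure bitmask, parameter vector)
```

Calling `ga(random.Random(0))` returns a `(score, structure, params)` tuple, where `structure[i] == 1` means `OPERATORS[i]` is included. In a real setting, `toy_score` would be replaced by an evaluation of the assembled pipeline on held-out data, and the bitmask by a graph encoding.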