Association Rule Mining for Continuous Attributes using Genetic Network Programming
Author(s) -
Taboada Karla,
Gonzales Eloy,
Shimada Kaoru,
Mabu Shingo,
Hirasawa Kotaro,
Hu Jinglu
Publication year - 2008
Publication title -
IEEJ Transactions on Electrical and Electronic Engineering
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.254
H-Index - 30
eISSN - 1931-4981
pISSN - 1931-4973
DOI - 10.1002/tee.20256
Subject(s) - discretization, association rule learning, data mining, preprocessor, computer science, heuristic, genetic algorithm, genetic programming, data pre processing, apriori algorithm, discretization of continuous features, mathematics, algorithm, machine learning, artificial intelligence, discretization error, mathematical analysis
Most existing association rule mining algorithms extract knowledge from databases whose attributes take binary values. In real‐world applications, however, databases usually contain continuous values such as height, length, or weight. When attributes are continuous, these algorithms are commonly combined with a discretization method that transforms them into discrete attributes. Discretization partitions the range of a continuous attribute into a finite number of intervals and assigns each interval a discrete numerical value. However, the user must usually specify the number of intervals or supply heuristic rules to be used during discretization, which makes it difficult to achieve the highest attribute interdependency and the lowest number of intervals at the same time. In this paper we present an association rule mining algorithm suited to the continuous‐valued attributes commonly found in scientific and statistical databases. We propose a method using a new graph‐based evolutionary algorithm named ‘genetic network programming (GNP)’ that can deal with continuous values directly, that is, without any discretization method as a preprocessing step. GNP represents its individuals as graph structures and evolves them in order to find a solution; this feature contributes to creating very compact programs and implicitly memorizing past action sequences. In the proposed method, the significance of the extracted association rules is measured with the χ² test, and only important association rules are accumulated in a common pool across generations. Results of experiments conducted on a real‐life database suggest that the proposed method provides an effective technique for handling continuous attributes. Copyright © 2008 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
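The abstract states that rule significance is measured with a χ² test but gives no formula. As a rough illustration only, the sketch below computes the standard 2×2 χ² statistic for a rule X ⇒ Y from tuple counts and compares it with the 5% critical value for one degree of freedom. The function names, the sample records, and the use of simple threshold predicates on continuous attributes are assumptions made for this example, not the authors' method.

```python
# Illustrative sketch: chi-squared significance of a rule X => Y.
# All names, thresholds, and data here are hypothetical.

CRITICAL_5_PERCENT = 3.841  # chi-squared critical value, 1 degree of freedom, alpha = 0.05


def chi_squared(n, count_x, count_y, count_xy):
    """Return the 2x2 chi-squared statistic for X => Y.

    n        -- total number of tuples
    count_x  -- tuples satisfying the antecedent X
    count_y  -- tuples satisfying the consequent Y
    count_xy -- tuples satisfying both X and Y
    """
    x = count_x / n   # support of X
    y = count_y / n   # support of Y
    z = count_xy / n  # support of X and Y together
    denom = x * y * (1 - x) * (1 - y)
    if denom == 0:
        # X or Y holds for all or none of the tuples; the test is undefined,
        # so treat the rule as carrying no evidence of association.
        return 0.0
    return n * (z - x * y) ** 2 / denom


def is_significant(n, count_x, count_y, count_xy, critical=CRITICAL_5_PERCENT):
    """True when the statistic exceeds the chosen critical value."""
    return chi_squared(n, count_x, count_y, count_xy) > critical


def count_rule(records, antecedent, consequent):
    """Count tuples matching X, Y, and both, given two predicates."""
    cx = sum(1 for r in records if antecedent(r))
    cy = sum(1 for r in records if consequent(r))
    cxy = sum(1 for r in records if antecedent(r) and consequent(r))
    return len(records), cx, cy, cxy


if __name__ == "__main__":
    # Hypothetical continuous-valued records: (height in cm, weight in kg).
    records = [(180, 82), (175, 78), (160, 55), (158, 52),
               (172, 74), (165, 58), (185, 90), (155, 50)]
    n, cx, cy, cxy = count_rule(
        records,
        antecedent=lambda r: r[0] > 170,  # assumed threshold on height
        consequent=lambda r: r[1] > 65,   # assumed threshold on weight
    )
    print(chi_squared(n, cx, cy, cxy), is_significant(n, cx, cy, cxy))
```

The predicates test the continuous values directly instead of first mapping them to interval labels, which loosely mirrors the abstract's point about avoiding a discretization step; how GNP itself chooses such conditions is not described in the abstract.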