Effective database processing for classification and regression with continuous variables
Author(s) - Di Tomaso E., Baldwin J.F.
Publication year - 2007
Publication title - International Journal of Intelligent Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.291
H-Index - 87
eISSN - 1098-111X
pISSN - 0884-8173
DOI - 10.1002/int.7020
Subject(s) - computer science, data mining, database, partition (number theory), naive Bayes classifier, relational database, database design, Bayesian network, artificial intelligence, database theory, machine learning, mathematics, support vector machine, combinatorics
This article proposes a method for manipulating a database of instances relative to discrete and continuous variables. A fuzzy partition is used to discretize continuous domains. A reorganized form of representing a relational database is proposed. The new form of representation is called an effective database. The effective database is tested on classification and regression problems using general Bayesian networks and Naïve Bayes classifiers. The structures and the parameters of the classifiers are estimated from the effective database. An algorithm for updating with soft evidence is used to test the induced models when continuous variables are present. The experiments show that the effective database procedure produces a selection of relevant information from data, which in some cases improves the prediction accuracy of the classifiers. © 2007 Wiley Periodicals, Inc. Int J Int Syst 22: 1271–1285, 2007.
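
As a rough illustration of discretizing a continuous domain with a fuzzy partition, the sketch below maps a continuous value to membership degrees over a small number of fuzzy sets. The abstract does not specify the shape or placement of the fuzzy sets, so the evenly spaced triangular sets used here, along with the function names `triangular_partition` and `memberships`, are assumptions made purely for illustration and are not the authors' procedure.

```python
def triangular_partition(lo, hi, n_sets):
    """Return centres of n_sets evenly spaced triangular fuzzy sets on [lo, hi].

    Assumption: uniform spacing; the article may use a different partition.
    """
    if n_sets < 2:
        raise ValueError("a partition needs at least two fuzzy sets")
    step = (hi - lo) / (n_sets - 1)
    return [lo + i * step for i in range(n_sets)]


def memberships(x, centres):
    """Membership degree of x in each triangular set.

    Each set peaks at its centre and falls linearly to zero at the
    neighbouring centres, so the degrees sum to 1 for any x in range.
    Values outside the range are clamped to the nearest end point.
    """
    width = centres[1] - centres[0]
    x = min(max(x, centres[0]), centres[-1])
    return [max(0.0, 1.0 - abs(x - c) / width) for c in centres]


centres = triangular_partition(0.0, 10.0, 5)   # centres at 0, 2.5, 5, 7.5, 10
degrees = memberships(3.0, centres)            # non-zero only for the 2nd and 3rd sets
```

Under these assumptions, a continuous observation becomes a distribution over discrete labels rather than a single bin, which is the kind of graded input that an update with soft evidence could consume.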