This article proposes a method for manipulating a database of instances relative to discrete and continuous variables. A fuzzy partition is used to discretize continuous domains. A reorganized form of representing a relational database is proposed. The new form of representation is called an effective database. The effective database is tested on classification and regression problems using general Bayesian networks and Naive Bayes classifiers. The structures and the parameters of the classifiers are estimated from the effective database. An algorithm for updating with soft evidence is used to test the induced models, when continuous variables are present. The experiments show that the effective database procedure produces a selection of relevant information from data, which improves in some cases the prediction accuracy of the classifiers.
Effective database processing for classification and regression with continuous variables
2007
Abstract
This article proposes a method for manipulating a database of instances relative to discrete and continuous variables. A fuzzy partition is used to discretize continuous domains. A reorganized form of representing a relational database is proposed. The new form of representation is called an effective database. The effective database is tested on classification and regression problems using general Bayesian networks and Naive Bayes classifiers. The structures and the parameters of the classifiers are estimated from the effective database. An algorithm for updating with soft evidence is used to test the induced models, when continuous variables are present. The experiments show that the effective database procedure produces a selection of relevant information from data, which improves in some cases the prediction accuracy of the classifiers.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.


