This paper aims to investigate and compare the accuracy of different data mining classification schemes, employing Ensemble Machine Learning Techniques, for the prediction of heart disease. The Cleveland data set for heart diseases, containing 303 instances, has been used as the main database for the training and testing of the developed system. 10-Fold Cross-Validation has been applied in order to increase the amount of data, which would otherwise have been limited. Different classifiers, namely Decision Tree (DT), Naïve Bayes (NB), Multilayer Perceptron (MLP), K-Nearest Neighbor (K-NN), Single Conjunctive Rule Learner (SCRL), Radial Basis Function (RBF) and Support Vector Machine (SVM), have been employed. Moreover, the ensemble prediction of classifiers, bagging, boosting and stacking, has been applied to the dataset. The results of the experiments indicate that the SVM method using the boosting technique outperforms the other aforementioned methods.
A comprehensive investigation and comparison of Machine Learning Techniques in the domain of heart disease
Giovanna Sannino;Giuseppe De Pietro;
2017
Abstract
This paper aims to investigate and compare the accuracy of different data mining classification schemes, employing Ensemble Machine Learning Techniques, for the prediction of heart disease. The Cleveland data set for heart diseases, containing 303 instances, has been used as the main database for the training and testing of the developed system. 10-Fold Cross-Validation has been applied in order to increase the amount of data, which would otherwise have been limited. Different classifiers, namely Decision Tree (DT), Naïve Bayes (NB), Multilayer Perceptron (MLP), K-Nearest Neighbor (K-NN), Single Conjunctive Rule Learner (SCRL), Radial Basis Function (RBF) and Support Vector Machine (SVM), have been employed. Moreover, the ensemble prediction of classifiers, bagging, boosting and stacking, has been applied to the dataset. The results of the experiments indicate that the SVM method using the boosting technique outperforms the other aforementioned methods.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.