Improving Prediction of Distance-Based Outliers
Basta, Stefano; Pizzuti, Clara
2004
Abstract
An unsupervised distance-based outlier detection method is presented that finds the top n outliers of a large, high-dimensional data set D. The method provides a subset R of the data set, called the robust solving set, that contains the top n outliers and can be used to predict whether a new, unseen object p is an outlier by computing the distances from p to only the objects in R. Experimental results show that the prediction accuracy of the robust solving set is comparable to that obtained by using the entire data set.
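The prediction step described in the abstract, scoring a new object p using only its distances to the objects in R, might be sketched as follows. This is a minimal illustration, not the authors' algorithm: the abstract does not define the outlier score, so the sketch assumes a common distance-based choice (the sum of the distances from p to its k nearest objects in the reference set). The names `knn_weight` and `is_outlier`, the parameters `k` and `threshold`, and the toy data are all hypothetical.

```python
import math


def knn_weight(p, reference, k):
    """Sum of the Euclidean distances from p to its k nearest objects in reference.

    Assumed scoring function; the abstract does not specify one.
    """
    dists = sorted(math.dist(p, q) for q in reference)
    return sum(dists[:k])


def is_outlier(p, reference, k, threshold):
    """Flag p as an outlier if its score against the reference set reaches threshold.

    `reference` plays the role of the subset R: only distances to its
    members are computed, never distances to the full data set D.
    """
    return knn_weight(p, reference, k) >= threshold


# Toy reference set standing in for R (hypothetical values).
R = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (10.0, 10.0)]

print(is_outlier((0.5, 0.5), R, k=2, threshold=5.0))    # point inside the dense cluster
print(is_outlier((20.0, 20.0), R, k=2, threshold=5.0))  # point far from every object in R
```

Because the score depends only on the objects in `reference`, the cost of classifying one new object grows with the size of R rather than with the size of D, which is the practical appeal of predicting from a small subset.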