Fast Outlier Detection Using a GPU

S. Basta
2013

Abstract

The availability of cost-effective data collection and storage hardware has allowed organizations to accumulate very large data sets, which are a potential source of previously unknown, valuable information. The process of discovering interesting patterns in such large data sets is referred to as data mining. Outlier detection is a data mining task that consists of discovering observations which deviate substantially from the rest of the data, and it has many important practical applications. Outlier detection in very large data sets is, however, computationally very demanding and currently requires high-performance computing facilities. We propose a family of parallel algorithms for Graphics Processing Units (GPUs), derived from two distance-based outlier detection algorithms: BruteForce and SolvingSet. We analyze their performance with an extensive set of experiments, comparing the GPU implementations with the base CPU versions and obtaining significant speedups.
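As a rough illustration of what a brute-force, distance-based detector computes, the sketch below scores every point against every other point on the CPU. It is not the paper's implementation: the abstract does not define the outlier score, so the choice here (sum of the distances to the `k` nearest neighbors) is an assumption, and the names `knn_outlier_scores` and `top_n_outliers` are hypothetical.

```python
import math


def knn_outlier_scores(points, k):
    """Score each point by the sum of distances to its k nearest neighbors.

    Assumption: this scoring rule is illustrative; the abstract does not
    state which distance-based definition the paper uses.
    """
    scores = []
    for i, p in enumerate(points):
        # Brute force: compare point i against every other point.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]))
    return scores


def top_n_outliers(points, k, n):
    """Return the indices of the n points with the largest scores."""
    scores = knn_outlier_scores(points, k)
    order = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return order[:n]


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
print(top_n_outliers(points, k=2, n=1))  # prints [4]
```

Each point's score depends only on the read-only data set, so the outer loop has no cross-iteration dependencies. The quadratic number of distance computations is what makes the task expensive on large data sets, and the independence of the per-point work is what makes it a natural candidate for parallel hardware.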
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
978-1-4799-0836-3
Data mining exploiting GPUs
outlier detection
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/241554