CNR Institutional Research Information System

Scheduling high performance data mining tasks on a data grid environment

Orlando S; Palmerini P; Perego R; Silvestri F

2002

Abstract

The datasets used for data mining are increasingly large and physically distributed. Since the distributed knowledge discovery process is both data- and computation-intensive, the Grid is a natural platform for deploying a high performance data mining service. The focus of this paper is on the core services of such a Grid infrastructure. In particular, we concentrate on the design and implementation of a specialized broker that is aware of data source locations and of the resource needs of data mining tasks. Allocation and scheduling decisions are taken on the basis of performance cost metrics and models that exploit knowledge about previous executions and use sampling to acquire estimates of execution behavior.

Year: 2002
Organizational units: Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Keywords: High performance; Data Mining
Appears in types: 04.01 Conference proceedings contribution

Files in this item:

File	Description	Type	Access	Size	Format
prod_91595-doc_123024.pdf	Scheduling high performance data mining tasks on a data grid environment	Publisher's version (PDF)	Authorized users only	335.67 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/101848
