WebCat

Giannotti, F; Nanni, M; Samaritani, F

WebCat is a versatile system that reorganizes search results into a partition of homogeneous document clusters using data mining techniques. Its purpose is to help users browse the set of retrieved documents by focusing on the clusters whose characterizing keywords are directly pertinent to the search. WebCat submits a user-specified query to the Google search engine and retrieves a large number of snippets, i.e., answers. The snippets are then modelled as sets of (cleaned, stemmed) terms and partitioned into clusters by the Transactional K-means algorithm. Each cluster is presented to the user through its centroid (i.e., a set of terms that represents the content of the cluster well), which serves as a fast access path to the answers contained in that cluster. The overall system is computationally light, very fast, and can run on the client side as an Internet Explorer toolbar (similar to the Google Toolbar).
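The clustering step described above can be illustrated with a minimal Python sketch. It is not WebCat's code: the record does not specify the distance measure, the centroid rule, or how unmatched snippets are handled, so the Jaccard distance, the "terms present in more than half of the members" centroid, the leftover group `-1`, and all function names below are assumptions made for illustration. Snippets are assumed to be already cleaned and stemmed into term sets.

```python
from collections import Counter


def jaccard_distance(a, b):
    """Jaccard distance between two term sets (assumed measure)."""
    union = a | b
    if not union:
        # Two empty sets: treated as maximally distant in this sketch.
        return 1.0
    return 1.0 - len(a & b) / len(union)


def centroid(members, min_share=0.5):
    """Terms occurring in more than min_share of the cluster's snippets."""
    counts = Counter(term for snippet in members for term in snippet)
    threshold = min_share * len(members)
    return frozenset(t for t, c in counts.items() if c > threshold)


def cluster_snippets(snippets, seeds, max_iter=20):
    """K-means-style loop over term sets; returns (centroids, assignment)."""
    centroids = [frozenset(s) for s in seeds]
    assignment = None
    for _ in range(max_iter):
        new_assignment = []
        for s in snippets:
            dists = [jaccard_distance(s, c) for c in centroids]
            best = min(range(len(centroids)), key=dists.__getitem__)
            # A snippet sharing no term with any centroid goes to group -1.
            new_assignment.append(best if dists[best] < 1.0 else -1)
        if new_assignment == assignment:
            break  # assignments are stable, so the centroids are too
        assignment = new_assignment
        for i in range(len(centroids)):
            members = [s for s, a in zip(snippets, assignment) if a == i]
            if members:
                centroids[i] = centroid(members)
    return centroids, assignment


# Hypothetical snippets for an ambiguous query, already reduced to term sets.
snippets = [
    frozenset({"jaguar", "car", "dealer"}),
    frozenset({"jaguar", "car", "price"}),
    frozenset({"jaguar", "cat", "habitat"}),
    frozenset({"jaguar", "cat", "wildlife"}),
    frozenset({"python", "tutorial"}),
]
centroids, assignment = cluster_snippets(snippets, seeds=[snippets[0], snippets[2]])
for i, c in enumerate(centroids):
    print(i, sorted(c))
print(assignment)
```

In this toy run the two centroids reduce to the terms shared by each group, which is the kind of short keyword label a user could click to reach the answers in a cluster; the last snippet matches neither centroid and lands in the leftover group.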

2003

Year: 2003
Organizational units: Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Keywords: Web mining; Clustering; Search engines
Appears in types: 05.11 Software


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/180200


CNR Institutional Research Information System
