WebCat: Automatic Categorization of Web Search Results
Giannotti F; Nanni M; Pedreschi D
2003
Abstract
Finding information with Web search engines is not always successful. When search results are presented as a ranked list, users often have to sift through a long list of snippets to find the information they are looking for. This paper presents a versatile system that reorganizes search results into a partition of homogeneous document clusters using Data Mining techniques. The purpose is to help users browse the set of retrieved documents easily, by focusing on clusters whose characterizing keywords are directly pertinent to the query. We obtained interesting results by applying our techniques to snippets only, i.e., by running our application on the client side, 'outside' the search engine. For general queries, the obtained clusters are usually natural collections of homogeneous documents, and documents in the same cluster often occur in distant positions in the ranked list returned by the search engine.