Query Logs collected by a Web Search Engine (WSE) constitute a valuable source of information which can be used in several ways to enhance efficiency and efficacy of the complex process of searching. This paper surveys the results recently achieved by our group in the design of innovative solutions targeting parallel Information Retrieval (IR) systems. Our solutions exploit the knowledge deriving from the patterns of common usage of the system extracted from query logs. Such knowledge has been used: (1), to devise an effective policy for caching WSE query results; (2), to drive the partitioning of the inverted index among the nodes of a termpartitioned, parallel IR system; (3), to perform document partitioning and effective collection selection in a document-partitioned, parallel IR system. The techniques and algorithms used vary from simple statistical analysis, to frequent pattern mining, and document/query co-clustering. The have the common denominator of exploiting past usage information, and of granting remarkable improvements in efficiency or efficacy. The paper briefly describes the proposals and the framework of their application, and reports the results of experiments conducted on large query logs of real WSEs.

On the value of query logs for modern information retrieval

Perego R;Laforenza D;Puppin D
2006

Abstract

Query Logs collected by a Web Search Engine (WSE) constitute a valuable source of information which can be used in several ways to enhance efficiency and efficacy of the complex process of searching. This paper surveys the results recently achieved by our group in the design of innovative solutions targeting parallel Information Retrieval (IR) systems. Our solutions exploit the knowledge deriving from the patterns of common usage of the system extracted from query logs. Such knowledge has been used: (1), to devise an effective policy for caching WSE query results; (2), to drive the partitioning of the inverted index among the nodes of a termpartitioned, parallel IR system; (3), to perform document partitioning and effective collection selection in a document-partitioned, parallel IR system. The techniques and algorithms used vary from simple statistical analysis, to frequent pattern mining, and document/query co-clustering. The have the common denominator of exploiting past usage information, and of granting remarkable improvements in efficiency or efficacy. The paper briefly describes the proposals and the framework of their application, and reports the results of experiments conducted on large query logs of real WSEs.
2006
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
8876990437
Query log analysis
File in questo prodotto:
File Dimensione Formato  
prod_138968-doc_130449.pdf

solo utenti autorizzati

Descrizione: On the value of query logs for modern information retrieval
Tipologia: Versione Editoriale (PDF)
Dimensione 643.88 kB
Formato Adobe PDF
643.88 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/97831
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact