The query-vector document model

Puppin D;Silvestri F
2006

Abstract

Modern Web IR systems have to manage collections of billions of documents. The indexes used to represent them are very large data structures, the form of which can have a big impact on the quality and the speed of IR algorithms. Traditionally, two main models are used to represent the available documents: the bag-of-words model and the vector-space model. In the query-vector document model, documents are modeled with the list of queries they match, along with the rank they get for each. The query-vector representation of a document is built out of a query log. A reference search engine is used in the building phase: for every query in the training set, the system stores the first 100 results along with their rank. This creates a matrix, with documents on columns and queries on rows, where each entry is the rank of a document for a given query.
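The building phase described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `build_query_vectors` and `document_vector`, the `search` callable standing in for the reference search engine, the sparse dictionary layout, and ranks starting at 1 are all assumptions made for illustration. Only the cutoff of 100 results and the rows-are-queries, columns-are-documents layout come from the abstract.

```python
from typing import Callable, Dict, Iterable, List

TOP_K = 100  # the abstract states that the first 100 results are stored

# Sparse matrix: one row per query, each row maps doc_id -> rank.
QueryMatrix = Dict[str, Dict[str, int]]


def build_query_vectors(
    queries: Iterable[str],
    search: Callable[[str], List[str]],  # hypothetical reference search engine
    top_k: int = TOP_K,
) -> QueryMatrix:
    """Store, for every training query, its top_k results with their rank."""
    matrix: QueryMatrix = {}
    for query in queries:
        results = search(query)[:top_k]
        # Assumption: results are distinct doc ids, best first; ranks start at 1.
        matrix[query] = {doc: rank for rank, doc in enumerate(results, start=1)}
    return matrix


def document_vector(matrix: QueryMatrix, doc_id: str) -> Dict[str, int]:
    """Read one column: the queries a document matches and its rank for each."""
    return {q: row[doc_id] for q, row in matrix.items() if doc_id in row}


if __name__ == "__main__":
    # Toy query log and toy engine, invented purely for the example.
    toy_index = {"apple pie": ["d1", "d2"], "apple": ["d2", "d3"]}
    m = build_query_vectors(toy_index, lambda q: toy_index.get(q, []))
    print(document_vector(m, "d2"))
```

A document that never appears in the stored results of any training query gets an empty vector under this layout, since the matrix only records queries for which the document was returned.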
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Document Partitioning
Collection Selection
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/85955