In the ocean ofWeb data, Web search engines are the primary way to access content. As the data is on the order of petabytes, current search engines are very large centralized systems based on replicated clusters. Web data, however, is always evolving. The number of Web sites continues to grow rapidly and there are currently more than 20 billion indexed pages. In the near future, centralized systems are likely to become ineffective against such a load, thus suggesting the need of fully distributed search engines. Such engines need to achieve the following goals: high quality answers, fast response time, high query throughput, and scalability. In this paper we survey and organize recent research results, outlining the main challenges of designing a distributed Web retrieval system.

Challenges on distributed Web retrieval

Silvestri F
2007

Abstract

In the ocean ofWeb data, Web search engines are the primary way to access content. As the data is on the order of petabytes, current search engines are very large centralized systems based on replicated clusters. Web data, however, is always evolving. The number of Web sites continues to grow rapidly and there are currently more than 20 billion indexed pages. In the near future, centralized systems are likely to become ineffective against such a load, thus suggesting the need of fully distributed search engines. Such engines need to achieve the following goals: high quality answers, fast response time, high query throughput, and scalability. In this paper we survey and organize recent research results, outlining the main challenges of designing a distributed Web retrieval system.
2007
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
H.3 Information Storage and Retrieval
H.3.5 Online Information Services. Web-based services
H.3.5 Online Information Services. Commercial services
Web Search Engine Engineering
Distributed Search Engines
File in questo prodotto:
File Dimensione Formato  
prod_91752-doc_21534.pdf

non disponibili

Descrizione: "Challenges on distributed Web retrieval"
Tipologia: Versione Editoriale (PDF)
Dimensione 484.77 kB
Formato Adobe PDF
484.77 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/57645
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 63
  • ???jsp.display-item.citation.isi??? ND
social impact