Endorsements and rebuttals in blog distillation

Esuli A; Sebastiani F; Silvestri F
2013

Abstract

In this paper we test a new approach to blog distillation, the task in which, given a user query, the system ranks blogs in descending order of relevance to the query topic. Our approach adds a link analysis phase to the standard retrieval-by-topicality phase. Unlike other link analysis methods, however, we check whether a given hyperlink is a citation of a positive or a negative nature, i.e., whether it expresses approval or disapproval of the hyperlinked page by the hyperlinking page. This allows us to test the hypothesis that distinguishing approval from disapproval benefits the blog distillation task. We have tested our method on the Blogs08 collection used in the last two editions (2009 and 2010) of the TREC Blog Track, a collection consisting of more than one million blogs and more than 28 million blog posts. Unfortunately, the experimental results seem to disconfirm this hypothesis: the low level of connectivity of the collection severely limits the impact of a link analysis phase (and, a fortiori, of the attempt to distinguish endorsements from rebuttals). Application contexts other than the blogosphere (such as the domain of eBay transactions) are probably better suited to such an approach.
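To make the idea of a polarity-aware link analysis phase concrete, here is a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes a PageRank-style random walk that follows only endorsing links and ignores rebuttals, followed by a linear interpolation with the topicality score. The function names (`endorsement_rank`, `rerank`), the `damping` and `weight` parameters, and the toy data are all hypothetical.

```python
def endorsement_rank(nodes, links, damping=0.85, iterations=50):
    """PageRank-style walk that follows only positive (endorsing) links.

    links: iterable of (source, target, polarity), polarity +1 or -1.
    Assumption: rebuttals (polarity -1) simply pass no score.
    """
    out = {node: [] for node in nodes}
    for src, dst, polarity in links:
        if polarity > 0:
            out[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by nodes with no endorsing out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for src in nodes:
            if out[src]:
                share = damping * rank[src] / len(out[src])
                for dst in out[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


def rerank(topical_scores, link_scores, weight=0.2):
    """Blend topicality with link scores (hypothetical linear interpolation)."""
    max_link = max(link_scores.values())
    combined = {
        blog: (1.0 - weight) * topical_scores[blog]
        + weight * link_scores[blog] / max_link
        for blog in topical_scores
    }
    return sorted(combined, key=combined.get, reverse=True)


# Toy data: "a" and "c" endorse "b"; "a" rebuts "c".
blogs = ["a", "b", "c"]
links = [("a", "b", +1), ("c", "b", +1), ("a", "c", -1)]
topical = {"a": 0.5, "b": 0.5, "c": 0.5}

link_scores = endorsement_rank(blogs, links)
ranking = rerank(topical, link_scores)
```

With equal topicality scores, the endorsed blog `b` moves to the top of `ranking`. When the link graph is sparse, the link scores stay close to uniform and the reranking changes little, which is the kind of limitation the abstract attributes to the low connectivity of the collection.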
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Blog search
Sentiment analysis
Random walks

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/253456