Survival analysis for freshness in microblogging search
Carlo Gaibisso
2011
Abstract
Freshness of information in real-time search is central in social networks, news, blogs and micro-blogs. Nevertheless, there is no clear experimental evidence showing which principled approach effectively combines time and content. We introduce a novel approach that models freshness using survival analysis of relevance over time. In such models, freshness is measured by the tail probability of relevance over time. We also assume that the probability distributions for freshness are heavy-tailed. The heavy-tailed models of freshness are shown to be highly effective on the micro-blogging test collection of TREC 2011. The improvements over the state-of-the-art time-based models are statistically significant or moderately significant.
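
As an illustration of what "freshness as a tail probability" can mean, the sketch below contrasts a light-tailed and a heavy-tailed survival function S(t) = P(T > t), where t is the age of a post. Everything specific here is an assumption made for illustration and is not taken from the paper: the choice of the Lomax (Pareto type II) distribution as the heavy-tailed model, the exponential baseline, the parameter values, the hypothetical function names, and the multiplicative combination of a content score with freshness.

```python
import math


def exponential_survival(age_hours: float, rate: float = 0.1) -> float:
    """Light-tailed baseline (assumed): S(t) = exp(-rate * t)."""
    return math.exp(-rate * age_hours)


def lomax_survival(age_hours: float, shape: float = 1.5, scale: float = 10.0) -> float:
    """Heavy-tailed example (assumed): Lomax survival S(t) = (1 + t/scale)^(-shape)."""
    return (1.0 + age_hours / scale) ** (-shape)


def freshness_score(content_score: float, age_hours: float,
                    survival=lomax_survival) -> float:
    """Hypothetical combination: weight a content score by the tail probability."""
    return content_score * survival(age_hours)


if __name__ == "__main__":
    # Compare how quickly each model discounts older posts.
    for age in (0.0, 1.0, 10.0, 100.0):
        print(age, exponential_survival(age), lomax_survival(age))
```

Under these assumed parameters, both functions equal 1 for a brand-new post and decrease with age, but the heavy-tailed model keeps a non-negligible score for old posts where the exponential one has effectively dropped to zero.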