A stochastic model for the link analysis of the Web

Favati P;
2007

Abstract

The behaviour of the inlink and outlink distributions appears to be one of the most studied properties of the web structure. The literature agrees that the inlink distribution follows a power law, but no such agreement exists for the outlink distribution. Accurate observations show that in the low-degree region the link distribution fails to fit a power law, with a discrepancy that is larger for outlinks than for inlinks. Moreover, a power law, like any continuous function, cannot fit the scattered behaviour that both link distributions share at large degree values. The linking model we consider here is a mixed one, based on both the preferential attachment and the uniform attachment strategies. A new approximation technique is devised to detect the parameters of the steady-state solution that describe a real data set. A stochastic technique is suggested to describe the scattering of the data. With these techniques the model appears to be well suited for describing both inlink and outlink distributions. Experiments on subsets of the real web and of Wikipedia show that our approach produces a more adequate approximation than the power law. This approximation suggests that the two attachment strategies play different roles in the inlink and the outlink cases.
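To make the idea of a mixed linking model concrete, the following is a minimal illustrative sketch, not the model or the approximation technique of the paper. It assumes a graph grown one node at a time, where each new node emits a single link whose target is chosen preferentially (in proportion to current in-degree) with probability `alpha` and uniformly at random otherwise. The names `simulate_mixed_attachment`, `degree_histogram` and `alpha` are hypothetical and chosen only for this example.

```python
import random
from collections import Counter


def simulate_mixed_attachment(n_nodes, alpha, seed=0):
    """Grow a graph node by node; each new node emits one link.

    Assumption for this sketch: with probability `alpha` the target is
    picked preferentially (proportional to its current in-degree),
    otherwise uniformly among the nodes that already exist.
    Returns the list of in-degrees, indexed by node.
    """
    rng = random.Random(seed)
    indegree = [0]   # the graph starts with a single node
    targets = []     # one entry per existing link: the node it points to
    for new_node in range(1, n_nodes):
        if targets and rng.random() < alpha:
            # Picking the target of a random existing link selects a node
            # with probability proportional to its in-degree.
            target = rng.choice(targets)
        else:
            # Uniform choice over the nodes created so far.
            target = rng.randrange(new_node)
        indegree[target] += 1
        targets.append(target)
        indegree.append(0)
    return indegree


def degree_histogram(indegree):
    """Map each in-degree value to the number of nodes that have it."""
    return dict(sorted(Counter(indegree).items()))


if __name__ == "__main__":
    degrees = simulate_mixed_attachment(10_000, alpha=0.8, seed=42)
    for degree, count in degree_histogram(degrees).items():
        print(degree, count)
```

Running the script with different values of `alpha` and plotting the histogram on log-log axes gives a rough feel for how the balance between the two strategies shapes the low-degree region and the scattered tail.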
Istituto di informatica e telematica - IIT
Attachment Strategy
Web Graph
Link distribution

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/24780