Similarity-Based Clustering of Web Transactions
Manco, Giuseppe; Ortale, Riccardo
2003
Abstract
We introduce a measure of similarity between two sequences of Web page accesses, to be exploited in a clustering approach for grouping sessions of accesses to a Web site. The notion of sequence similarity is parameterized by the sequence topology and by the similarity among the Web pages within the sequences. In our formalization, two Web pages are similar if they can be considered synonyms not only from a content point of view but also from a usage point of view, i.e., if users exhibit the same behavior on both pages. The refined notion of page similarity, as well as the related notion of sequence similarity, is expected to be effective when a centroid-based clustering technique is applied to the personalization of the Web experience.
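This record does not give the definition of the measure, so the following is only an illustrative sketch of what a sequence similarity parameterized by a page similarity could look like, not the authors' formula. It assumes an order-preserving alignment (a weighted variant of the longest common subsequence) and a page-similarity function returning values in [0, 1]. The names `sequence_similarity` and `toy_page_sim`, and the page labels, are hypothetical.

```python
from typing import Callable, Sequence

# Assumed interface: a page similarity returning a value in [0, 1].
PageSim = Callable[[str, str], float]


def sequence_similarity(a: Sequence[str], b: Sequence[str], page_sim: PageSim) -> float:
    """Order-preserving alignment score between two sessions, normalized to [0, 1].

    Illustrative only: a weighted longest-common-subsequence, where matching
    two pages contributes their page similarity instead of a fixed 1.
    """
    if not a or not b:
        return 0.0
    n, m = len(a), len(b)
    # best[i][j] = best total page similarity when aligning a[:i] with b[:j]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j],  # skip a page of a
                best[i][j - 1],  # skip a page of b
                best[i - 1][j - 1] + page_sim(a[i - 1], b[j - 1]),  # align the two pages
            )
    # Dividing by the longer length penalizes unmatched pages.
    return best[n][m] / max(n, m)


def toy_page_sim(p: str, q: str) -> float:
    """Hypothetical page similarity: identical pages score 1, and one
    hand-picked pair of 'synonymous' pages scores 0.8."""
    if p == q:
        return 1.0
    return 0.8 if {p, q} == {"catalog", "products"} else 0.0


s1 = ["home", "catalog", "cart"]
s2 = ["home", "products", "cart"]
s3 = ["home", "cart"]
print(sequence_similarity(s1, s2, toy_page_sim))
print(sequence_similarity(s1, s3, toy_page_sim))
```

Under these assumptions, two sessions that visit synonymous pages in the same order score close to 1 even when the page identifiers differ, which is the property a centroid-based clustering of sessions would rely on.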