Developments in partitioning XML documents by content and structure based on combining multiple clusterings

Ortale Riccardo
2013

Abstract

Combining multiple clusterings is proposed as a promising method for partitioning XML documents. The aim is to decompose the inherently difficult problem of capturing structural and content relationships within an XML corpus into a number of simpler subproblems. To test this intuition, a new technique for partitioning XML documents is presented, in which conventional clustering techniques are applied to flattened representations of individual aspects of the XML documents (representations that also include some rare patterns) in order to partition the available XML corpus. The effectiveness of the devised technique is demonstrated by a comparative empirical evaluation on benchmark XML corpora. © 2013 IEEE.
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
ISBN: 9781479929719

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/287213
Citations
  • Scopus 11