Extracting Type Values From Semistructured Databases / Manghi, P. - (26/10/2001), pp. 1-156.

Extracting Type Values From Semistructured Databases

Manghi P
26/10/2001

Abstract

To date, investigations on Semi-Structured Data (SSD) have focused on query languages that operate directly on graph-based data by matching flexible path expressions against the topology of the graph database. The problem with this approach is that it gives up, in principle, the benefits typically associated with typing information. In particular, storage and query optimisation techniques, representation of user knowledge of the data, and validation of computations cannot be based on static typing information. On the other hand, the various attempts to reintroduce types for SSD, while effectively recovering some of the benefits of static typing, compromise the irregular nature of SSD databases by allowing only mild forms of irregularity. Our investigation is motivated by the observation that, despite the inherent irregularity of the structure, many or indeed most SSD databases contain one or more subsets that present a high degree of regularity and could therefore be treated as typed values of a programming language. In this thesis we lay the formal foundations of a novel query methodology based on an extraction system that, given an SSD database and a type of a target language, produces: (i) a subset of the database that is semantically equivalent to a value of the given type; (ii) a measure that informs the user about the quality of the type with respect to the original database. The extracted subset can then be converted into a value of that type and injected into the language environment, where it can be computed over with all the benefits of static typing.
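
As a rough illustration of the input/output shape described in the abstract, the following toy sketch may help. It is not the extraction system defined in the thesis: the data model (nested Python dicts standing in for an edge-labelled graph), the record-style type, the function names `extract` and `extract_all`, and the quality measure (the fraction of top-level objects that match the type) are all assumptions made for this example.

```python
from typing import Any


def extract(node: Any, ty: Any) -> tuple[bool, Any]:
    """Match one node against one type; return (matched, projected value).

    Hypothetical model: a record type is a dict from labels to types,
    and a base type is a Python class such as str or int.
    """
    if isinstance(ty, dict):
        if not isinstance(node, dict):
            return False, None
        projected = {}
        for label, field_ty in ty.items():
            if label not in node:
                return False, None
            ok, value = extract(node[label], field_ty)
            if not ok:
                return False, None
            projected[label] = value
        # Labels in the node that the type does not mention are dropped.
        return True, projected
    # Exact class check, so that True is not accepted where int is required.
    if type(node) is ty:
        return True, node
    return False, None


def extract_all(db: list, ty: Any) -> tuple[list, float]:
    """Return the typed subset of db and an assumed quality measure."""
    values = []
    for node in db:
        ok, value = extract(node, ty)
        if ok:
            values.append(value)
    quality = len(values) / len(db) if db else 0.0
    return values, quality


# An irregular collection: missing fields, wrongly typed fields, extra fields.
db = [
    {"name": "Ann", "age": 31, "email": "a@x.org"},
    {"name": "Bob"},
    {"name": "Cy", "age": "unknown"},
    {"name": "Di", "age": 40, "address": {"city": "Pisa"}},
]
person = {"name": str, "age": int}

values, quality = extract_all(db, person)
print(values)
print(quality)
```

Here two of the four objects fit the `person` type, so the sketch returns those two projected records together with a quality of 0.5. The returned records could then be handled as ordinary typed values, which is the spirit of the "inject into the language environment" step; the thesis itself defines both the semantic equivalence and the measure formally.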
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
dynamic typing
semi-structured data
Giorgio Ghelli
Files in this item:

prod_425373-doc_151740.pdf
Access: open access
Description: Extracting Type Values From Semistructured Databases
Size: 20.85 MB
Format: Adobe PDF

prod_425373-doc_151741.pdf
Access: open access
Description: Extracting Type Values From Semistructured Databases
Size: 19.23 MB
Format: Adobe PDF

prod_425373-doc_151742.pdf
Access: open access
Description: Extracting Type Values From Semistructured Databases
Size: 29.9 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/410897