Extracting Type Values From Semistructured Databases / Manghi, P. - (26/10/2001), pp. 1-156.
Extracting Type Values From Semistructured Databases
Manghi P
26/10/2001
Abstract
To date, investigations on Semi-Structured Data (SSD) have focused on query languages that operate directly on graph-based data by matching flexible path expressions against the graph-database topology. The problem with this approach is that it gives up, in principle, the benefits typically associated with typing information. In particular, storage and query optimisation techniques, representation of user knowledge of the data, and validation of computations cannot be based on static typing information. On the other hand, the various attempts to reintroduce types for SSD, while effectively restoring some of the benefits of static typing, compromise the irregular nature of SSD databases by allowing only mild forms of irregularity. Our investigation is motivated by the observation that, despite the inherent irregularity of the structure, many or indeed most SSD databases contain one or more subsets that present a high degree of regularity and could therefore be treated as typed values of a programming language. In this thesis we lay the formal foundations of a novel query methodology based on an extraction system that, given an SSD database and a type of a target language, returns: (i) a subset of the database that is semantically equivalent to a value of the given type; (ii) a measure that informs the user about the quality of that type with respect to the original database. The extracted subset can then be converted into a value of that type and injected into the language environment, where it can be computed over with all the benefits of static typing.

| File | Access | Description | Size | Format |
|---|---|---|---|---|
| prod_425373-doc_151740.pdf | open access | Extracting Type Values From Semistructured Databases | 20.85 MB | Adobe PDF |
| prod_425373-doc_151741.pdf | open access | Extracting Type Values From Semistructured Databases | 19.23 MB | Adobe PDF |
| prod_425373-doc_151742.pdf | open access | Extracting Type Values From Semistructured Databases | 29.9 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


