CNR Institutional Research Information System

Extracting Type Values From Semistructured Databases / Manghi, P. - (26/10/2001), pp. 1-156.

Extracting Type Values From Semistructured Databases

Manghi P

26/10/2001

Abstract

To date, investigations on Semi-Structured Data (SSD) have focused on query languages that operate directly on graph-based data by matching flexible path expressions against the graph-database topology. The problem with this approach is that it gives up, in principle, the benefits typically associated with typing information. In particular, storage and query optimisation techniques, representation of user knowledge of the data, and validation of computations cannot be based upon static typing information. On the other hand, the various attempts to reintroduce types for SSD, while effectively restoring some of the benefits of static typing, compromise the irregular nature of SSD databases by allowing only mild forms of irregularity. Our investigation is motivated by the observation that, despite the inherent irregularity of the structure, many or indeed most SSD databases contain one or more subsets that present a high degree of regularity and could therefore be treated as typed values of a programming language. In this thesis we lay the formal foundations of a novel query methodology based on an extraction system that, given an SSD database and a type of a target language, yields: (i) a subset of the database that is semantically equivalent to a value of the given type; (ii) a measure that informs the user about the quality of the chosen type with respect to the original database. The extracted subset can then be converted into a value of that type and injected into the language environment, where it can be computed over with all the benefits of static typing.

Year: 2001
Organizational units: Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Keywords: dynamic typing; semi-structured data
External supervisors: Giorgio Ghelli
Appears in types: 07.01 Doctoral thesis

Files in this product:

File	Description	Access	Size	Format
prod_425373-doc_151740.pdf	Extracting Type Values From Semistructured Databases	open access	20.85 MB	Adobe PDF
prod_425373-doc_151741.pdf	Extracting Type Values From Semistructured Databases	open access	19.23 MB	Adobe PDF
prod_425373-doc_151742.pdf	Extracting Type Values From Semistructured Databases	open access	29.9 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/410897
