CNR Institutional Research Information System

Supporting tabular data characterization in a large scale data infrastructure by lexical matching techniques

Candela, L.; Coro, G.; Pagano, P.

2013

Abstract

Digital Libraries continue to evolve towards research environments that support access to, and management of, multiform Information Objects spread across multiple data sources and organizational domains. This evolution has made it necessary to deal with Information Objects whose traits differ from those that characterized Digital Libraries in their early stages, and to revise the services that support their management. Tabular data are a class of Information Objects that must be managed efficiently because of their core role in many eScience scenarios. This paper discusses the tabular data characterization problem, i.e., the problem of identifying the reference dataset for each column of a tabular dataset. In particular, the paper presents an approach based on lexical matching techniques that supports users during the data curation phase by providing them with a ranked list of reference datasets suitable for a dataset column.
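
The idea of ranking reference datasets for a column can be illustrated with a minimal sketch. It is not the authors' method: the use of Python's `difflib.SequenceMatcher` as the lexical similarity measure, the averaging of best matches, the function names, and the sample reference datasets are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Lexical similarity in [0, 1] between two strings, ignoring case."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def score_column(column: list[str], reference: list[str]) -> float:
    """Average, over the column values, of the best similarity to any reference entry."""
    if not column or not reference:
        return 0.0
    best = [max(similarity(value, entry) for entry in reference) for value in column]
    return sum(best) / len(best)


def rank_references(column: list[str], references: dict[str, list[str]]) -> list[tuple[str, float]]:
    """Return (reference name, score) pairs sorted from best to worst match."""
    scored = [(name, score_column(column, entries)) for name, entries in references.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical reference datasets (illustrative values, not taken from the paper).
REFERENCES = {
    "species": ["Gadus morhua", "Thunnus albacares", "Salmo salar"],
    "countries": ["Italy", "Norway", "Japan"],
}

if __name__ == "__main__":
    # A column with inconsistent casing and one misspelled value.
    column = ["gadus morhua", "Thunnus albacres", "salmo salar"]
    for name, score in rank_references(column, REFERENCES):
        print(f"{name}: {score:.2f}")
```

A curator would then inspect the top of the ranked list and confirm or reject the suggested reference dataset; the averaging step is only one of several plausible ways to aggregate per-value scores.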

Year	2013
Organizational units	Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Language(s)	English
External supervisors and coordinators	M. Agosti, F. Esposito, S. Ferilli, N. Ferro
Series	COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE (PRINT)
Volume title	Digital Libraries and Archives. 8th Italian Research Conference. IRCDL 2012. Revised Selected Papers
First page	21
Last page	32
ISBN	978-3-642-35833-3
DOI	https://dx.doi.org/10.1007/978-3-642-35834-0_5
URL	http://link.springer.com/chapter/10.1007%2F978-3-642-35834-0_5#
Peer reviewed	Yes, but type not specified
Keywords	Data curation; Large-scale data infrastructures; Lexical similarity
Other information	Project type: EU_FP7; Data e-Infrastructure Initiative for Fisheries Management and Conservation of Marine Living Resources; Acronym: IMARINE; Grant agreement: 283644
Scopus ID	2-s2.0-84873876064
Number of authors	3
Type	02 Contribution in volume::02.01 Contribution in volume (chapter or essay)
MIUR login type	268
Full text	restricted
All authors	Candela, L; Coro, G; Pagano, P
Type	info:eu-repo/semantics/bookPart
Project identifier	
Project title	Data e-Infrastructure Initiative for Fisheries Management and Conservation of Marine Living Resources
Acronym	IMARINE
Funding	FP7
Contract no.	283644
Appears in types	02.01 Contribution in volume (chapter or essay)

Files in this item:

File	Size	Format
prod_277226-doc_78158.pdf (authorized users only). Description: Supporting Tabular Data Characterization in a Large Scale Data Infrastructure by Lexical Matching Techniques. Type: Publisher's version (PDF)	179.27 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/246161
