Documenting context-based quality assessment of controlled vocabularies

R Albertoni;M De Martino;A Quarati
2018

Abstract

Access to e-Government data is challenging due to the heterogeneity and complexity of the public information ecosystem. Controlled Vocabularies (CVs) are key to unlocking the potential of Open Government data, because they supply common terms for marking up metadata and data in a consistent and coherent way. However, quality information is needed to help public institutions decide whether to adopt an existing or newly created CV. The paper discusses how to evaluate and document CV quality, thus facilitating a comparison of different controlled vocabularies based on contextual information. The Analytic Hierarchy Process (AHP) is adopted to assess the overall quality and rank of a controlled vocabulary by integrating various quality dimensions according to the decision maker's needs. A set of e-Government controlled vocabularies that facilitate the semantic interoperability of e-Government data is selected as a testbed, and updated quality values are made available as Linked Data. Multi-step guidelines are also defined, promoting and complementing the adoption of W3C recommendations to provide machine-readable quality metadata. This fosters reliability and re-usability by providing consumers with information on the assessment process carried out and the outcomes achieved. We illustrate the application of these guidelines by focusing on provenance and quality documentation.
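
As a rough illustration of the AHP step mentioned in the abstract, the sketch below shows how pairwise judgments about quality dimensions might be turned into weights and then used to rank vocabularies. Everything concrete here is an assumption made for illustration and is not taken from the paper: the dimension names, the pairwise judgments, the per-vocabulary scores, and the function names `ahp_weights` and `rank_vocabularies`. The weights are computed with the row geometric-mean approximation rather than the principal-eigenvector computation of classical AHP.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights using the row geometric-mean method.

    pairwise[i][j] states how much more important dimension i is than
    dimension j; the matrix is expected to be reciprocal.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def rank_vocabularies(scores, weights):
    """Return (name, overall score) pairs sorted from best to worst."""
    totals = {
        name: sum(w * s for w, s in zip(weights, values))
        for name, values in scores.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical quality dimensions and decision-maker judgments (not from the paper).
dimensions = ["accuracy", "completeness", "availability"]
pairwise = [
    [1.0, 3.0, 5.0],      # accuracy compared with each dimension
    [1 / 3, 1.0, 2.0],    # completeness compared with each dimension
    [1 / 5, 1 / 2, 1.0],  # availability compared with each dimension
]
weights = ahp_weights(pairwise)

# Hypothetical normalized scores per vocabulary, in the order of `dimensions`.
scores = {
    "vocabulary_A": [0.9, 0.6, 0.7],
    "vocabulary_B": [0.7, 0.9, 0.9],
}
ranking = rank_vocabularies(scores, weights)

for name, overall in ranking:
    print(f"{name}: {overall:.3f}")
```

Changing the pairwise judgments changes the weights, so the same measured scores can produce a different ranking for a decision maker with different priorities, which is the context dependence the abstract refers to.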
Istituto di Matematica Applicata e Tecnologie Informatiche (IMATI)
Keywords: Controlled vocabularies; data quality; metadata; Analytic Hierarchy Process; context; workflow provenance

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/371667