Developing and Aligning a Detailed Controlled Vocabulary for Artwork

Bulla, L.; Frangipane, M. C.; Mancinelli, M. L.; Marinucci, L.; Mongiovi, M.; Porena, M.; Presutti, V.; Veninata, C.

doi:10.1007/978-3-031-15743-1_48

Controlled vocabularies have proved critical for data interoperability and accessibility. In the cultural heritage (CH) domain, descriptions of artworks are often given as free text, which makes filtering and searching burdensome (e.g. listing all artworks of a specific type). Despite being multilingual and quite detailed, the Getty’s Art & Architecture Thesaurus, a de facto standard for describing artworks, has low coverage for languages other than English and sometimes does not reach the degree of granularity required to describe specific niche artworks. We build upon the Italian Vocabulary of Artworks, developed by the Italian Ministry of Cultural Heritage (MIC), and a set of free-text descriptions from ArCO, the knowledge graph of the Italian CH, to propose an extension of the Vocabulary of Artworks and align it to the Getty’s thesaurus. Our framework relies on text matching and natural language processing tools to suggest candidate alignments between free text and terms and between cross-vocabulary terms, with a human in the loop for validation and refinement. We produce 1,166 new terms (31% more than the original vocabulary) and 1,330 links to the Getty’s thesaurus, with an estimated coverage of 21%.

Year: 2022
Organizational units: Istituto di Scienze e Tecnologie della Cognizione - ISTC
ISBN: 978-3-031-15743-1
Keywords: Controlled vocabularies; Cultural heritage; Semantic similarity; Natural Language Processing; String-matching
Appears in types: 04.01 Conference proceedings contribution

Files in this item:

File	Size	Format
2022_Marinucci_SWODCH.pdf (authorized users only; Type: Publisher's version (PDF); License: NOT PUBLIC - private/restricted access)	343.45 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/486494


CNR Institutional Research Information System
