Editing metadata to support the acquisition, content analysis, storage and retrieval of ancient documents

Debole F; Salerno E; Savino P; Tonazzini A
2011

Abstract

The preservation, accessibility, improved readability and content analysis of ancient documents are urgent needs today if we are to avoid losing much of our past memory. The damage these documents have suffered over time still causes progressive decay, and the fragility of rare and important historical documents prevents scholars and historians from accessing them directly. Furthermore, low contrast and degradations (e.g. back-to-front interference) make them difficult to read. Finally, features that could provide important historical and cultural information are often barely detectable in the original documents. Multispectral and multisensory imaging and digital image processing techniques can be used to enhance readability and reveal new information. Digital processing may require applying many different techniques to the original digital image, such as blind source separation (BSS), for effective restoration and content analysis. The results may then be passed to other processes for document interpretation, such as OCR for transcription. Processing of this kind could become standard practice in libraries and archives and thereby improve the access to and usability of cultural heritage archival material. A major challenge is therefore the creation of digital libraries in which, for each document, the plurality of its representations is appropriately managed. These representations include all the available image components and all the subsequent processed versions, along with the parameters used by the processing techniques. This rich description of acquisitions and image processing results should support archiving and retrieval based on the features of the images and on the processing they have undergone.
At the same time, traditional descriptive metadata should support content-based searches of the kind usually performed in a digital library, enabling document retrieval by attributes such as title, author and creation date. In this paper, we propose a metadata schema model that supports this combination of classical and new ways of describing a document and its analysis process, and we illustrate a Metadata Editor (ME) that supports the creation and editing of the metadata. The proposed schema extends existing metadata representations and describes the semantic content of the document as a whole and of the results obtained by processing it. It satisfies three broad requirements: i) it enables the representation and description of a cultural heritage object to support its retrieval and access; ii) it supports the description of the complete acquisition process and of all the processing performed on the digital representation of the object; and iii) it is fully compliant with existing standard metadata schemata, so that the interrelation of existing archives and the reuse of existing resources are guaranteed. The metadata schema is composed of interrelated entities describing the physical object (i.e. the actual physical work of art), its digital representation (i.e. how the object is represented in digital form), and its digital elaboration (i.e. all processing activities performed on the digital representation to improve its quality and readability). The schema also allows one to specify the author of the cultural object and the organizations responsible for its preservation. The Metadata Editor provides an easy way to create, search and edit metadata records, and it supports the use of controlled vocabularies for the different metadata elements.
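As a rough illustration of how the interrelated entities named in the abstract might fit together, the following minimal Python sketch models a physical object, its digital representations, and the elaborations applied to them. It is not the schema defined in the paper: every class name, field name, sample value and the helper `find_by_technique` is a hypothetical choice made only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


# All names below are illustrative assumptions, not the paper's actual schema.
@dataclass
class Agent:
    """An author of the object or an organization responsible for preserving it."""
    name: str
    role: str  # e.g. "author" or "preserving organization"


@dataclass
class DigitalElaboration:
    """One processing step applied to a digital representation."""
    technique: str  # e.g. "blind source separation"
    parameters: Dict[str, str] = field(default_factory=dict)


@dataclass
class DigitalRepresentation:
    """One acquisition of the object, e.g. a single spectral channel."""
    file_name: str
    channel: str
    elaborations: List[DigitalElaboration] = field(default_factory=list)


@dataclass
class PhysicalObject:
    """The physical document, with classical descriptive attributes."""
    title: str
    creation_date: str
    agents: List[Agent] = field(default_factory=list)
    representations: List[DigitalRepresentation] = field(default_factory=list)


def find_by_technique(objects: List[PhysicalObject], technique: str) -> List[str]:
    """Return titles of objects having a representation processed with `technique`."""
    return [
        obj.title
        for obj in objects
        if any(
            step.technique == technique
            for rep in obj.representations
            for step in rep.elaborations
        )
    ]


# Hypothetical record: one manuscript, one infrared acquisition, one BSS step.
example = PhysicalObject(
    title="Sample manuscript",
    creation_date="unknown",
    agents=[Agent(name="Sample archive", role="preserving organization")],
    representations=[
        DigitalRepresentation(
            file_name="sample_ir.tif",
            channel="infrared",
            elaborations=[
                DigitalElaboration(
                    technique="blind source separation",
                    parameters={"components": "3"},
                )
            ],
        )
    ],
)

matches = find_by_technique([example], "blind source separation")
```

Nesting the elaborations under the representation they were applied to keeps the processing history and its parameters attached to each acquisition, so the same record can be searched by classical attributes such as the title or by the processing it has undergone.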
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
978-88-905639-3-5
Ancient document preservation and accessibility
Metadata schema for multispectral images
Metadata editor tool
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/180929