An Infrastructural Solution for Digital Publication starting from Automatic Layout and Text Recognition: Insights from Italian Literary Journals

Mazzagufo, Laura (co-first author);
Sichera, Pietro (co-first author);
Cristofaro, Salvatore (co-last author);
Del Grosso, Angelo Mario (co-last author);
Spampinato, Daria (co-last author)
2025

Abstract

The creation, preservation, processing, publication, and querying of complex textual resources require a methodological framework that is both universally applicable and tailored to specific domains. Its broad design promotes widespread (re)usability, while domain-specific features ensure effective implementation. This work describes (the refinement of) an integrated environment for automatic text and layout recognition that relies on the tools ZoneRW, Kraken and eScriptorium. Previous experiments were undertaken within the framework of COVerLeSS, a historical-philological and linguistic research project on the literary magazines of late 19th-century Italian Verismo. Publishing the text transcriptions in TEI Publisher, together with the collections of related digital images, makes the digitization process reusable and interoperable.
Istituto di Scienze e Tecnologie della Cognizione - ISTC - Sede Secondaria Catania
Istituto per il Lessico Intellettuale Europeo e Storia delle Idee - ILIESI
Istituto di linguistica computazionale "Antonio Zampolli" - ILC
979-8-3315-4384-6
ATR, eScriptorium, ZoneRW, Kraken, TEI Publisher, Digital Humanities
Files in this record:
An-Infrastructural-Solution-Cist25.pdf
Access: authorized users only
Type: Publisher's version (PDF)
License: Not public - private/restricted access
Size: 6.36 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/559506