CHROMA model for H2IOSC

Sichera, Pietro (first author; Methodology)
Cristofaro, Salvatore (Methodology)
SPAMPINATO, DARIA (Methodology)
Mazzagufo, Laura (Methodology)
DEL GROSSO, ANGELO MARIO (Methodology)
2024

Abstract

Developing computational models and tools for philologically curated digital editions poses two challenges: defining functional specifications for the reference community, and ensuring sustainability and adherence to open science principles. Requirements analysis benefits from user stories that describe application scenarios, while process management and technology issues call for solutions that keep resources accessible and long-lived. The CHROMA model (http://chroma.cnr.it/) offers an integrated approach rooted in projects such as "Bellini Digital Correspondence" and "Pirandello Nazionale". It treats text as a complex multidimensional object and handles it through an editorial process that involves: creating digital surrogates of primary sources via the IIIF protocol; segmentation and text recognition in HTR/OCR environments, using the eScriptorium tool; separating the textual and paratextual planes; representing structural and semantic phenomena through XML/TEI, RDF, and Domain-Specific Languages; assisted encoding with software tools for realigning different versions of editions, e.g. Bertalign and NormaTEI; integrating linguistic and lexicographic analyses using NLP tools; extracting information and generating knowledge graphs by means of Semantic Web technologies and ontologies; and connecting to authority records via Linked Open Data (LOD). The edition is accessed through interactive visualization tools such as TEI Publisher or EVT. Reliance on long-term technological standards supports academic sustainability and synergy with the H2IOSC project. This workflow is part of H2IOSC infrastructure pilot projects such as the "Text Transcription Environment", and it will be included in the initiative's marketplace as a possible tool deployable with Docker. The model's conceptual choices and the proposed process were discussed in depth at the CLARIN Bazaar.
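As an illustration of the information-extraction step named in the abstract (from XML/TEI encoding to a knowledge graph linked to authority records), the following is a minimal sketch and not the CHROMA implementation. It assumes that persons are encoded as TEI `persName` elements carrying a `ref` attribute. The sample fragment, the `example.org` URIs, the choice of `dcterms:references` and `rdfs:label` as predicates, and the function name `tei_persons_to_ntriples` are all hypothetical and chosen only for the example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical TEI fragment: the text and the authority URIs are invented for illustration.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <p>Letter from <persName ref="https://example.org/authority/person/1">Vincenzo Bellini</persName>
      to <persName ref="https://example.org/authority/person/2">Francesco Florimo</persName>.</p>
    </body>
  </text>
</TEI>"""


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def tei_persons_to_ntriples(tei_xml, doc_uri):
    """Return N-Triples lines linking a document to the persons it mentions.

    Assumption: each person is a tei:persName with a @ref pointing to an
    authority record; elements without @ref are skipped.
    """
    root = ET.fromstring(tei_xml)
    triples = []
    for element in root.iter(f"{{{TEI_NS}}}persName"):
        ref = element.get("ref")
        if not ref:
            continue
        # Collapse internal whitespace so the label is a clean single line.
        label = " ".join("".join(element.itertext()).split())
        triples.append(
            f"<{doc_uri}> <http://purl.org/dc/terms/references> <{ref}> ."
        )
        triples.append(
            f'<{ref}> <http://www.w3.org/2000/01/rdf-schema#label> "{escape_literal(label)}" .'
        )
    return triples


if __name__ == "__main__":
    for line in tei_persons_to_ntriples(SAMPLE_TEI, "https://example.org/edition/letter/1"):
        print(line)
```

The sketch uses only the standard library to stay self-contained. A production pipeline would more likely rely on a dedicated RDF library and on the project's own ontology rather than the generic predicates assumed here.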
DC field Value Language
dc.authority.orgunit Istituto per il Lessico Intellettuale Europeo e Storia delle Idee - ILIESI en
dc.authority.orgunit Istituto di Scienze e Tecnologie della Cognizione - ISTC - Sede Secondaria Catania en
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.people Sichera, Pietro en
dc.authority.people Cristofaro, Salvatore en
dc.authority.people SPAMPINATO, DARIA en
dc.authority.people Mazzagufo, Laura en
dc.authority.people DEL GROSSO, ANGELO MARIO en
dc.collection.id.s 2e1a85b5-484d-45dd-a997-50e67e31babd *
dc.collection.name 04.05 Poster not published in conference proceedings *
dc.contributor.appartenenza Istituto di Scienze e Tecnologie della Cognizione - ISTC *
dc.contributor.appartenenza Istituto di Scienze e Tecnologie della Cognizione - ISTC - Sede Secondaria Catania *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza Istituto per il Lessico Intellettuale Europeo e Storia delle Idee - ILIESI *
dc.contributor.appartenenza.mi 917 *
dc.contributor.appartenenza.mi 918 *
dc.contributor.appartenenza.mi 986 *
dc.contributor.appartenenza.mi 989 *
dc.date.accessioned 2024/12/30 12:34:33 -
dc.date.available 2024/12/30 12:34:33 -
dc.date.firstsubmission 2024/12/28 20:35:14 *
dc.date.issued 2024 -
dc.date.submission 2024/12/28 20:35:14 *
dc.description.allpeople Sichera, Pietro; Cristofaro, Salvatore; Spampinato, Daria; Mazzagufo, Laura; DEL GROSSO, ANGELO MARIO -
dc.description.allpeopleoriginal Sichera, Pietro; Cristofaro, Salvatore; SPAMPINATO, DARIA; Mazzagufo, Laura; DEL GROSSO, ANGELO MARIO en
dc.description.fulltext open en
dc.description.numberofauthors 5 -
dc.identifier.doi 10.5281/zenodo.13913607 en
dc.identifier.source orcid *
dc.identifier.uri https://hdl.handle.net/20.500.14243/522482 -
dc.language.iso eng en
dc.relation.allauthors Vincent Vandeghinste and Thalassia Kontino en
dc.relation.conferencedate 15 – 17 October 2024 en
dc.relation.conferencename CLARIN Annual Conference 2024 en
dc.relation.conferenceplace Barcelona, Spain en
dc.relation.ispartofbook CLARIN Annual Conference Proceedings, 2024 en
dc.subject.keywords Workflow -
dc.subject.keywords eScriptorium -
dc.subject.keywords HTR -
dc.subject.keywords OCR -
dc.subject.keywords XML-TEI -
dc.subject.keywords NormaTEI -
dc.subject.keywords NLP -
dc.subject.keywordseng Digital humanities -
dc.subject.keywordseng ontology -
dc.subject.singlekeyword Workflow *
dc.subject.singlekeyword eScriptorium *
dc.subject.singlekeyword HTR *
dc.subject.singlekeyword OCR *
dc.subject.singlekeyword XML-TEI *
dc.subject.singlekeyword NormaTEI *
dc.subject.singlekeyword NLP *
dc.subject.singlekeyword Digital humanities *
dc.subject.singlekeyword ontology *
dc.title CHROMA model for H2IOSC en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Contributo in convegno::04.05 Poster non pubblicato in atti di convegno (04 Conference contribution::04.05 Poster not published in conference proceedings) it
dc.type.miur -2 -
Appears in types: 04.05 Poster/Abstract not published in conference proceedings
Files in this item:
File: CHROMA-model-for-H2IOSC-CLARIN-Conference-2024.pdf
Access: open access
Type: Publisher's version (PDF)
License: Creative Commons
Size: 1.84 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/522482