CNR Institutional Research Information System

CHROMA model for H2IOSC

Sichera, Pietro (first author, Methodology); Cristofaro, Salvatore (Methodology); SPAMPINATO, DARIA (Methodology); Mazzagufo, Laura (Methodology); DEL GROSSO, ANGELO MARIO (Methodology)

2024

Abstract

The development of computational models and tools for philologically curated digital editions poses two challenges: defining functional specifications for the reference community, and ensuring sustainability and adherence to open science principles. Requirements analysis benefits from user stories that describe application scenarios, while process management and technology issues call for solutions that keep resources accessible and long-lived. The CHROMA model (http://chroma.cnr.it/) offers an integrated approach rooted in projects such as "Bellini Digital Correspondence" and "Pirandello Nazionale". It treats text as a complex multidimensional object through an editorial process that involves: creating digital surrogates of primary sources via the IIIF protocol; segmentation and text recognition in an HTR/OCR environment, using the eScriptorium tool; separating the textual and paratextual planes; representing structural and semantic phenomena through XML/TEI, RDF, and Domain-Specific Languages; assisted encoding with software tools for realigning different versions of editions, e.g., Bertalign and NormaTEI; integrating linguistic and lexicographic analyses using NLP tools; extracting information and generating knowledge graphs by means of Semantic Web technologies and ontologies; and connecting to authority records via Linked Open Data (LOD). The edition is delivered to readers through interactive visualization tools such as TEI Publisher or EVT. Reliance on long-term technological standards supports academic sustainability and synergy with the H2IOSC project. This workflow is part of H2IOSC pilot projects such as the "Text Transcription Environment" and will be included in the initiative's marketplace as a possible tool packaged for Docker deployment. The model's conceptual choices and the proposed process were discussed in depth at the CLARIN Bazaar.
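As an illustration only, two of the steps named in the abstract (IIIF surrogates and XML/TEI representation) can be sketched in Python. This is a minimal sketch under stated assumptions, not the CHROMA implementation: the function names `iiif_image_url` and `tei_page`, the `example.org` server, the image identifier, and the transcribed lines are all hypothetical. It relies only on the published IIIF Image API URL pattern (`{identifier}/{region}/{size}/{rotation}/{quality}.{format}`) and on the TEI namespace.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

TEI_NS = "http://www.tei-c.org/ns/1.0"


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build a IIIF Image API request URL for one page image.

    The identifier is percent-encoded so that any slashes inside it
    are not read as path separators.
    """
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


def tei_page(facsimile_url, lines):
    """Wrap transcribed lines in a minimal TEI <body>.

    A <pb facs="..."> element links the page to its digital surrogate,
    and each transcribed line is preceded by a numbered <lb/>.
    """
    ET.register_namespace("", TEI_NS)
    body = ET.Element(f"{{{TEI_NS}}}body")
    ET.SubElement(body, f"{{{TEI_NS}}}pb", {"facs": facsimile_url})
    paragraph = ET.SubElement(body, f"{{{TEI_NS}}}p")
    for number, text in enumerate(lines, start=1):
        line_break = ET.SubElement(paragraph, f"{{{TEI_NS}}}lb", {"n": str(number)})
        line_break.tail = text  # the text follows the <lb/> milestone
    return ET.tostring(body, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical server, identifier, and transcription.
    url = iiif_image_url("https://example.org/iiif", "letter-01/page-1")
    print(url)
    print(tei_page(url, ["Caro amico,", "ti scrivo da Catania."]))
```

In a real workflow the transcribed lines would come from the HTR/OCR step rather than from a hard-coded list, and the TEI fragment would sit inside a complete document with a `teiHeader`.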

Full record (DC)

DC field	Value	Language
dc.authority.orgunit	Istituto per il Lessico Intellettuale Europeo e Storia delle Idee - ILIESI	en
dc.authority.orgunit	Istituto di Scienze e Tecnologie della Cognizione - ISTC - Sede Secondaria Catania	en
dc.authority.orgunit	Istituto di linguistica computazionale "Antonio Zampolli" - ILC	en
dc.authority.people	Sichera, Pietro	en
dc.authority.people	Cristofaro, Salvatore	en
dc.authority.people	SPAMPINATO, DARIA	en
dc.authority.people	Mazzagufo, Laura	en
dc.authority.people	DEL GROSSO, ANGELO MARIO	en
dc.collection.id.s	2e1a85b5-484d-45dd-a997-50e67e31babd	*
dc.collection.name	04.05 Poster non pubblicato in atti di convegno	*
dc.contributor.appartenenza	Istituto di Scienze e Tecnologie della Cognizione - ISTC	*
dc.contributor.appartenenza	Istituto di Scienze e Tecnologie della Cognizione - ISTC - Sede Secondaria Catania	*
dc.contributor.appartenenza	Istituto di linguistica computazionale "Antonio Zampolli" - ILC	*
dc.contributor.appartenenza	Istituto per il Lessico Intellettuale Europeo e Storia delle Idee - ILIESI	*
dc.contributor.appartenenza.mi	917	*
dc.contributor.appartenenza.mi	918	*
dc.contributor.appartenenza.mi	986	*
dc.contributor.appartenenza.mi	989	*
dc.date.accessioned	2024/12/30 12:34:33	-
dc.date.available	2024/12/30 12:34:33	-
dc.date.firstsubmission	2024/12/28 20:35:14	*
dc.date.issued	2024	-
dc.date.submission	2024/12/28 20:35:14	*
dc.description.abstracteng	The development of computational models and tools for philologically curated digital editions poses two challenges: defining functional specifications for the reference community, and ensuring sustainability and adherence to open science principles. Requirements analysis benefits from user stories that describe application scenarios, while process management and technology issues call for solutions that keep resources accessible and long-lived. The CHROMA model (http://chroma.cnr.it/) offers an integrated approach rooted in projects such as "Bellini Digital Correspondence" and "Pirandello Nazionale". It treats text as a complex multidimensional object through an editorial process that involves: creating digital surrogates of primary sources via the IIIF protocol; segmentation and text recognition in an HTR/OCR environment, using the eScriptorium tool; separating the textual and paratextual planes; representing structural and semantic phenomena through XML/TEI, RDF, and Domain-Specific Languages; assisted encoding with software tools for realigning different versions of editions, e.g., Bertalign and NormaTEI; integrating linguistic and lexicographic analyses using NLP tools; extracting information and generating knowledge graphs by means of Semantic Web technologies and ontologies; and connecting to authority records via Linked Open Data (LOD). The edition is delivered to readers through interactive visualization tools such as TEI Publisher or EVT. Reliance on long-term technological standards supports academic sustainability and synergy with the H2IOSC project. This workflow is part of H2IOSC pilot projects such as the "Text Transcription Environment" and will be included in the initiative's marketplace as a possible tool packaged for Docker deployment. The model's conceptual choices and the proposed process were discussed in depth at the CLARIN Bazaar.	-
dc.description.allpeople	Sichera, Pietro; Cristofaro, Salvatore; Spampinato, Daria; Mazzagufo, Laura; DEL GROSSO, ANGELO MARIO	-
dc.description.allpeopleoriginal	Sichera, Pietro; Cristofaro, Salvatore; SPAMPINATO, DARIA; Mazzagufo, Laura; DEL GROSSO, ANGELO MARIO	en
dc.description.fulltext	open	en
dc.description.numberofauthors	5	-
dc.identifier.doi	10.5281/zenodo.13913607	en
dc.identifier.source	orcid	*
dc.identifier.uri	https://hdl.handle.net/20.500.14243/522482	-
dc.language.iso	eng	en
dc.relation.allauthors	Vincent Vandeghinste and Thalassia Kontino	en
dc.relation.conferencedate	15 – 17 October 2024	en
dc.relation.conferencename	CLARIN Annual Conference 2024	en
dc.relation.conferenceplace	Barcelona, Spain	en
dc.relation.ispartofbook	CLARIN Annual Conference Proceedings, 2024	en
dc.subject.keywords	Workflow	-
dc.subject.keywords	eScriptorium	-
dc.subject.keywords	HTR	-
dc.subject.keywords	OCR	-
dc.subject.keywords	XML-TEI	-
dc.subject.keywords	NormaTEI	-
dc.subject.keywords	NLP	-
dc.subject.keywordseng	Digital humanities	-
dc.subject.keywordseng	ontology	-
dc.subject.singlekeyword	Workflow	*
dc.subject.singlekeyword	eScriptorium	*
dc.subject.singlekeyword	HTR	*
dc.subject.singlekeyword	OCR	*
dc.subject.singlekeyword	XML-TEI	*
dc.subject.singlekeyword	NormaTEI	*
dc.subject.singlekeyword	NLP	*
dc.subject.singlekeyword	Digital humanities	*
dc.subject.singlekeyword	ontology	*
dc.title	CHROMA model for H2IOSC	en
dc.type.driver	info:eu-repo/semantics/conferenceObject	-
dc.type.full	04 Contributo in convegno::04.05 Poster non pubblicato in atti di convegno	it
dc.type.miur	-2	-
iris.mediafilter.data	2025/04/13 03:00:22	*
iris.orcid.lastModifiedDate	2025/01/21 17:02:24	*
iris.orcid.lastModifiedMillisecond	1737475344732	*
iris.sitodocente.maxattempts	3	-
iris.unpaywall.metadataCallLastModified	27/01/2026 03:46:53	-
iris.unpaywall.metadataCallLastModifiedMillisecond	1769482013373	-
iris.unpaywall.metadataErrorDescription	0	-
iris.unpaywall.metadataErrorType	ERROR_NO_MATCH	-
iris.unpaywall.metadataStatus	ERROR	-
Appears in types:	04.05 Poster/Abstract not published in conference proceedings

Files in this item:

File	Size	Format
CHROMA-model-for-H2IOSC-CLARIN-Conference-2024.pdf (open access; type: publisher's version (PDF); license: Creative Commons)	1.84 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/522482
