
From Collection to Transcription: a Workflow for Managing Speech Data by the CLARIN Trainers' Network

Giulia Pedonese
2025

Abstract

This presentation shares the experience of members of the CLARIN Trainers’ Network in reusing, adapting and localising existing learning content on speech and oral data management to meet the learning needs of the CLARIN-IT research community within the Humanities and Cultural Heritage Italian Open Science Cloud (H2IOSC) project. After a brief introduction to the H2IOSC project and its training strategy, the presentation describes the transcription workshop and an accompanying workflow for managing speech and oral data, covering data collection, privacy considerations, and the transcription chain. The authors then show how they used the Skills4EOSC FAIR-by-Design methodology to convert the workshop into reusable training material, which other trainers can adapt to the needs of researchers in other communities working with oral history data.

Funding: H2IOSC project (Humanities and cultural Heritage Italian Open Science Cloud), funded by the European Union, NextGenerationEU, PNRR M4C2, project code IR0000029, CUP B63C22000730005.
DC field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.people Giulia Pedonese en
dc.authority.project Codice progetto IR0000029 en
dc.collection.id.s 33fc2b58-b895-438b-9d2a-2c5bc86a83a6 *
dc.collection.name 04.04 Unpublished presentation/communication (conference, event, webinar...) *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.contributor.area Not assigned *
dc.date.accessioned 2026/03/03 15:34:23 -
dc.date.available 2026/03/03 15:34:23 -
dc.date.firstsubmission 2025/12/30 17:01:32 *
dc.date.issued 2025 -
dc.date.submission 2025/12/30 17:01:32 *
dc.description.allpeople Pedonese, Giulia -
dc.description.allpeopleoriginal Giulia Pedonese en
dc.description.fulltext open en
dc.description.numberofauthors 1 -
dc.identifier.doi 10.5281/zenodo.17191109 en
dc.identifier.source orcid *
dc.identifier.uri https://hdl.handle.net/20.500.14243/561729 -
dc.language.iso eng en
dc.relation.conferencename IEEE International Conference on Cyber Humanities (IEEE-CH), Florence, 8-10 September 2025 en
dc.relation.projectAcronym H2IOSC en
dc.relation.projectAwardNumber CUP B63C22000730005 en
dc.relation.projectAwardTitle Humanities and cultural Heritage Italian Open Science Cloud en
dc.relation.projectFunderName Unione Europea en
dc.relation.projectFundingStream NextGenerationEU – PNRR M4C2 en
dc.subject.keywordseng transcription chain -
dc.subject.keywordseng Linguistics -
dc.subject.singlekeyword transcription chain *
dc.subject.singlekeyword Linguistics *
dc.title From Collection to Transcription: a Workflow for Managing Speech Data by the CLARIN Trainers' Network en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Conference contribution::04.04 Unpublished presentation/communication (conference, event, webinar...) it
dc.type.miur -2 -
Appears in types: 04.04 Unpublished presentation/communication (conference, event, webinar...)
Files in this item:
IEEE_From Collection to Transcription_GP (3).pptx

open access

Description: Presentation
Type: Other attached material
License: Creative Commons
Size: 5.2 MB
Format: Microsoft PowerPoint XML


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/561729