Punctuation Restoration in Spoken Italian Transcripts with Transformers
Miaschi A; Ravelli AA; Dell'Orletta F
2022
Abstract
In this paper, we present an evaluation of a Transformer-based punctuation restoration model for the Italian language. Experimenting with a BERT-base model, we perform several fine-tuning runs with different training data and training-set sizes and test them in in-domain and cross-domain scenarios. Moreover, we conduct an error analysis of the main weaknesses of the model with respect to specific punctuation marks. Finally, we test our system both quantitatively and qualitatively, through a typical task-oriented evaluation and a perception-based acceptability evaluation.

| DC field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | en |
| dc.authority.people | Miaschi A | en |
| dc.authority.people | Ravelli AA | en |
| dc.authority.people | Dell'Orletta F | en |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contributo in Atti di convegno | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/02/21 03:23:03 | - |
| dc.date.available | 2024/02/21 03:23:03 | - |
| dc.date.firstsubmission | 2024/12/20 10:11:03 | * |
| dc.date.issued | 2022 | - |
| dc.date.submission | 2024/12/20 17:07:54 | * |
| dc.description.affiliations | Department of Computer Science, Università di Pisa, Pisa; Istituto di Linguistica Computazionale "Antonio Zampolli" (ILC-CNR), ItaliaNLP Lab, Pisa | - |
| dc.description.allpeople | Miaschi, A; Ravelli, AA; Dell'Orletta, F | - |
| dc.description.allpeopleoriginal | Miaschi A.; Ravelli A.A.; Dell'Orletta F. | en |
| dc.description.fulltext | restricted | en |
| dc.description.numberofauthors | 3 | - |
| dc.identifier.doi | 10.1007/978-3-031-08421-8_17 | en |
| dc.identifier.scopus | 2-s2.0-85135083576 | en |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/443056 | - |
| dc.identifier.url | http://www.scopus.com/record/display.url?eid=2-s2.0-85135083576&origin=inward | en |
| dc.language.iso | eng | en |
| dc.miur.last.status.update | 2024-12-20T11:46:24Z | * |
| dc.relation.conferencedate | 1-3/12/2021 | en |
| dc.relation.conferencename | AIxIA 2021 - Advances in Artificial Intelligence | en |
| dc.relation.firstpage | 245 | en |
| dc.relation.ispartofbook | Proceedings of AIxIA 2021 - Advances in Artificial Intelligence | en |
| dc.relation.lastpage | 260 | en |
| dc.relation.numberofpages | 16 | en |
| dc.relation.volume | 13196 LNAI | en |
| dc.subject.keywords | nlp | - |
| dc.subject.keywords | transformer models | - |
| dc.subject.keywords | punctuation restoration | - |
| dc.subject.singlekeyword | nlp | * |
| dc.subject.singlekeyword | transformer models | * |
| dc.subject.singlekeyword | punctuation restoration | * |
| dc.title | Punctuation Restoration in Spoken Italian Transcripts with Transformers | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Contributo in convegno::04.01 Contributo in Atti di convegno | it |
| dc.type.miur | 273 | - |
| dc.ugov.descaux1 | 469732 | - |
| iris.mediafilter.data | 2025/04/15 04:18:53 | * |
| iris.orcid.lastModifiedDate | 2024/12/23 17:42:22 | * |
| iris.orcid.lastModifiedMillisecond | 1734972142483 | * |
| iris.scopus.extIssued | 2022 | - |
| iris.scopus.extTitle | Punctuation Restoration in Spoken Italian Transcripts with Transformers | - |
| iris.sitodocente.maxattempts | 1 | - |
| iris.unpaywall.doi | 10.1007/978-3-031-08421-8_17 | * |
| iris.unpaywall.isoa | false | * |
| iris.unpaywall.journalisindoaj | false | * |
| iris.unpaywall.metadataCallLastModified | 01/01/2026 02:45:51 | - |
| iris.unpaywall.metadataCallLastModifiedMillisecond | 1767231951172 | - |
| iris.unpaywall.oastatus | closed | * |
| scopus.authority.anceserie | LECTURE NOTES IN COMPUTER SCIENCE###0302-9743 | * |
| scopus.category | 2614 | * |
| scopus.category | 1700 | * |
| scopus.contributor.affiliation | ItaliaNLP Lab | - |
| scopus.contributor.affiliation | ItaliaNLP Lab | - |
| scopus.contributor.affiliation | ItaliaNLP Lab | - |
| scopus.contributor.afid | 60008941 | - |
| scopus.contributor.afid | 60008941 | - |
| scopus.contributor.afid | 60008941 | - |
| scopus.contributor.auid | 57211678681 | - |
| scopus.contributor.auid | 57192943134 | - |
| scopus.contributor.auid | 57540567000 | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.dptid | 114087935 | - |
| scopus.contributor.dptid | 114087935 | - |
| scopus.contributor.dptid | 114087935 | - |
| scopus.contributor.name | Alessio | - |
| scopus.contributor.name | Andrea Amelio | - |
| scopus.contributor.name | Felice | - |
| scopus.contributor.subaffiliation | Istituto di Linguistica Computazionale “Antonio Zampolli” (ILC–CNR); | - |
| scopus.contributor.subaffiliation | Istituto di Linguistica Computazionale “Antonio Zampolli” (ILC–CNR); | - |
| scopus.contributor.subaffiliation | Istituto di Linguistica Computazionale “Antonio Zampolli” (ILC–CNR); | - |
| scopus.contributor.surname | Miaschi | - |
| scopus.contributor.surname | Ravelli | - |
| scopus.contributor.surname | Dell’Orletta | - |
| scopus.date.issued | 2022 | * |
| scopus.description.allpeopleoriginal | Miaschi A.; Ravelli A.A.; Dell'Orletta F. | * |
| scopus.differences | scopus.authority.anceserie | * |
| scopus.differences | scopus.publisher.name | * |
| scopus.differences | scopus.subject.keywords | * |
| scopus.differences | scopus.relation.conferencedate | * |
| scopus.differences | scopus.relation.conferencename | * |
| scopus.differences | scopus.identifier.isbn | * |
| scopus.differences | scopus.relation.volume | * |
| scopus.document.type | cp | * |
| scopus.document.types | cp | * |
| scopus.identifier.doi | 10.1007/978-3-031-08421-8_17 | * |
| scopus.identifier.eissn | 1611-3349 | * |
| scopus.identifier.isbn | 9783031084201 | * |
| scopus.identifier.pui | 638586047 | * |
| scopus.identifier.scopus | 2-s2.0-85135083576 | * |
| scopus.journal.sourceid | 25674 | * |
| scopus.language.iso | eng | * |
| scopus.publisher.name | Springer Science and Business Media Deutschland GmbH | * |
| scopus.relation.conferencedate | 2021 | * |
| scopus.relation.conferencename | 20th International Conference of the Italian Association for Artificial Intelligence, AIxIA 2021 | * |
| scopus.relation.firstpage | 245 | * |
| scopus.relation.lastpage | 260 | * |
| scopus.relation.volume | 13196 | * |
| scopus.subject.keywords | Punctuation restoration; Speech transcription; Transformers; | * |
| scopus.title | Punctuation Restoration in Spoken Italian Transcripts with Transformers | * |
| scopus.titleeng | Punctuation Restoration in Spoken Italian Transcripts with Transformers | * |
| Appears in collections: | 04.01 Contributo in Atti di convegno (contribution in conference proceedings) | |

Files in this record:

| File | Type | License | Size | Format |
|---|---|---|---|---|
| 978-3-031-08421-8-2.pdf (authorized users only) | Publisher's version (PDF) | Not public: private/restricted access | 567.28 kB | Adobe PDF |

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


