Design and Annotation of the First Italian Corpus for Text Simplification

Brunato D; Dell'Orletta F; Venturi G; Montemagni S
2015

Abstract

In this paper, we present the design and construction of the first Italian corpus for automatic and semi-automatic text simplification. In line with current approaches, we propose a new annotation scheme specifically conceived to identify the types of change an original sentence undergoes when it is manually simplified. The scheme has been applied to two aligned Italian corpora, each containing original texts with their corresponding simplified versions; the corpora were selected as representative of two different manual simplification strategies and address different target reader populations. Each corpus was annotated with the operations foreseen in the annotation scheme, covering different levels of linguistic description. The annotation results were analysed with the final aim of capturing the peculiarities of, and differences between, the simplification strategies pursued in the two corpora.
DC Field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC -
dc.authority.people Brunato D it
dc.authority.people Dell'Orletta F it
dc.authority.people Venturi G it
dc.authority.people Montemagni S it
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contributo in Atti di convegno *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.date.accessioned 2024/02/21 08:05:14 -
dc.date.available 2024/02/21 08:05:14 -
dc.date.issued 2015 -
dc.description.affiliations Istituto di Linguistica Computazionale "Antonio Zampolli" (ILC-CNR) -
dc.description.allpeople Brunato D.; Dell'Orletta F.; Venturi G.; Montemagni S. -
dc.description.allpeopleoriginal Brunato D., Dell'Orletta F., Venturi G., Montemagni S. -
dc.description.fulltext none en
dc.description.numberofauthors 4 -
dc.identifier.isbn 978-1-941643-47-1 -
dc.identifier.uri https://hdl.handle.net/20.500.14243/296574 -
dc.identifier.url https://aclweb.org/anthology/W/W15/W15-1604.pdf -
dc.language.iso eng -
dc.relation.conferencedate 5 June 2015 -
dc.relation.conferencename Proceedings of LAW IX - The 9th Linguistic Annotation Workshop -
dc.relation.conferenceplace Denver, Colorado -
dc.relation.firstpage 31 -
dc.relation.lastpage 34 -
dc.subject.keywords Annotation Scheme -
dc.subject.keywords Automatic Text Simplification -
dc.subject.singlekeyword Annotation Scheme *
dc.subject.singlekeyword Automatic Text Simplification *
dc.title Design and Annotation of the First Italian Corpus for Text Simplification en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Contributo in convegno::04.01 Contributo in Atti di convegno it
dc.type.miur 273 -
dc.type.referee Yes, but type not specified -
dc.ugov.descaux1 332693 -
iris.orcid.lastModifiedDate 2024/03/02 03:49:45 *
iris.orcid.lastModifiedMillisecond 1709347785467 *
iris.sitodocente.maxattempts 1 -
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/296574