In this study we present a Natural Language Processing (NLP)-based stylometric approach for tracking the evolution of written language competence in Italian L1 learners. The approach relies on a wide set of linguistically motivated features that capture stylistic aspects of a text. These features were extracted from students' essays in CItA (Corpus Italiano di Apprendenti L1), the first longitudinal corpus of texts written by Italian L1 learners enrolled in the first and second year of lower secondary school. We address the problem of modeling written language development as a supervised classification task: predicting the chronological order of essays written by the same student at different time intervals. The promising results obtained in several classification scenarios allow us to conclude that it is possible to automatically model the highly relevant changes affecting written language over time, and to identify which features are most predictive of this process. In the last part of the article, we turn to the possible influence of background variables on language learning and present preliminary results of a pilot study that examines how the observed developmental patterns are affected by information about the student's school environment.
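The pairwise formulation described in the abstract (given two essays by the same student, predict which was written first) can be illustrated with a minimal sketch. This is not the authors' pipeline: the two toy features, the difference-vector representation of a pair, the perceptron classifier, and the example essays are all assumptions introduced here for illustration only.

```python
import re

def features(text):
    """Two toy stylometric features (hypothetical stand-ins for a real feature set)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"\w+", text.lower())
    avg_sentence_len = len(tokens) / max(len(sentences), 1)
    type_token_ratio = len(set(tokens)) / max(len(tokens), 1)
    return [avg_sentence_len, type_token_ratio]

def pair_vector(first, second):
    """Represent an ordered essay pair as the difference of the two feature vectors (an assumption)."""
    return [b - a for a, b in zip(features(first), features(second))]

def train_perceptron(vectors, labels, epochs=20):
    """Fit a linear classifier without bias; +1 = chronological order, -1 = reversed."""
    weights = [0.0] * len(vectors[0])
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            score = sum(w * xi for w, xi in zip(weights, x))
            if y * score <= 0:  # misclassified (or undecided): move weights toward y * x
                weights = [w + y * xi for w, xi in zip(weights, x)]
    return weights

def predict(weights, first, second):
    """Return +1 if `first` is predicted to precede `second`, otherwise -1."""
    score = sum(w * xi for w, xi in zip(weights, pair_vector(first, second)))
    return 1 if score > 0 else -1

# Invented (earlier, later) essay pairs, one per hypothetical student.
students = [
    ("I like dogs. I like cats. I like school.",
     "Although I like dogs, I prefer cats because they are quiet and independent."),
    ("We went out. We played. It was fun.",
     "When the rain stopped, we went outside and played football until the evening."),
]

# Each student contributes the pair in both orders, so the two classes are balanced.
vectors, labels = [], []
for early, late in students:
    vectors.append(pair_vector(early, late)); labels.append(1)
    vectors.append(pair_vector(late, early)); labels.append(-1)

weights = train_perceptron(vectors, labels)
print(predict(weights, students[0][0], students[0][1]))
```

Presenting every pair in both orders keeps the task symmetric, so the classifier has to rely on the feature differences rather than on a fixed position bias. The learned weights also give a rough indication of which feature drives the ordering decision, which loosely mirrors the abstract's interest in the most predictive features.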
A NLP-based stylometric approach for tracking the evolution of L1 written language competence
Alessio Miaschi; Dominique Brunato; Felice Dell'Orletta
2021
| DC field | Value | Language |
|---|---|---|
| dc.authority.ancejournal | JOURNAL OF WRITING RESEARCH | en |
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | en |
| dc.authority.people | Miaschi, Alessio | en |
| dc.authority.people | Brunato, Dominique | en |
| dc.authority.people | Dell'Orletta, Felice | en |
| dc.collection.id.s | b3f88f24-048a-4e43-8ab1-6697b90e068e | * |
| dc.collection.name | 01.01 Articolo in rivista | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/02/20 17:36:36 | - |
| dc.date.available | 2024/02/20 17:36:36 | - |
| dc.date.firstsubmission | 2025/01/24 11:02:41 | * |
| dc.date.issued | 2021 | - |
| dc.date.submission | 2025/01/29 09:52:41 | * |
| dc.description.abstracteng | In this study we present a Natural Language Processing (NLP)-based stylometric approach for tracking the evolution of written language competence in Italian L1 learners. The approach relies on a wide set of linguistically motivated features capturing stylistic aspects of a text, which were extracted from students' essays contained in CItA (Corpus Italiano di Apprendenti L1), the first longitudinal corpus of texts written by Italian L1 learners enrolled in the first and second year of lower secondary school. We address the problem of modeling written language development as a supervised classification task consisting in predicting the chronological order of essays written by the same student at different temporal spans. The promising results obtained in several classification scenarios allow us to conclude that it is possible to automatically model the highly relevant changes affecting written language evolution across time, as well as identifying which features are more predictive of this process. In the last part of the article, we focus the attention on the possible influence of background variables on language learning and we present preliminary results of a pilot study aiming at understanding how the observed developmental patterns are affected by information related to the school environment of the student. | - |
| dc.description.affiliations | Università di Pisa; Istituto di Linguistica Computazionale (ILC-CNR) | - |
| dc.description.allpeople | Miaschi, Alessio; Brunato, Dominique Pierina; Dell'Orletta, Felice | - |
| dc.description.allpeopleoriginal | Miaschi, Alessio and Brunato, Dominique and Dell'Orletta, Felice | en |
| dc.description.fulltext | open | en |
| dc.description.note | Query date: 2021-06-09 | en |
| dc.description.numberofauthors | 3 | - |
| dc.identifier.doi | 10.17239/jowr-2021.13.01.03 | en |
| dc.identifier.isi | WOS:000659987400003 | - |
| dc.identifier.scopus | 2-s2.0-85108566169 | en |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/402654 | - |
| dc.identifier.url | https://www.jowr.org/abstracts/vol13_1/Miaschi_et_al_2021_13_1_abstract.html | en |
| dc.language.iso | eng | en |
| dc.miur.last.status.update | 2025-01-24T14:40:15Z | * |
| dc.relation.firstpage | 71 | en |
| dc.relation.lastpage | 105 | en |
| dc.relation.medium | ELETTRONICO | en |
| dc.relation.numberofpages | 35 | en |
| dc.relation.volume | vol. 13 | en |
| dc.subject.keywordseng | stylometry | - |
| dc.subject.keywordseng | computational linguistics | - |
| dc.subject.keywordseng | language competence | - |
| dc.subject.singlekeyword | stylometry | * |
| dc.subject.singlekeyword | computational linguistics | * |
| dc.subject.singlekeyword | language competence | * |
| dc.title | A NLP-based stylometric approach for tracking the evolution of L1 written language competence | en |
| dc.type.circulation | Internazionale | en |
| dc.type.driver | info:eu-repo/semantics/article | - |
| dc.type.full | 01 Contributo su Rivista::01.01 Articolo in rivista | it |
| dc.type.impactfactor | si | en |
| dc.type.miur | 262 | - |
| dc.ugov.descaux1 | 454570 | - |
| iris.isi.extIssued | 2021 | - |
| iris.isi.extTitle | A NLP-based stylometric approach for tracking the evolution of L1 written language competence | - |
| iris.mediafilter.data | 2025/04/06 03:20:42 | * |
| iris.orcid.lastModifiedDate | 2025/02/25 07:11:25 | * |
| iris.orcid.lastModifiedMillisecond | 1740463885780 | * |
| iris.scopus.extIssued | 2021 | - |
| iris.scopus.extTitle | A NLP-based stylometric approach for tracking the evolution of L1 written language competence | - |
| iris.sitodocente.maxattempts | 1 | - |
| iris.unpaywall.bestoahost | publisher | * |
| iris.unpaywall.bestoaversion | publishedVersion | * |
| iris.unpaywall.doi | 10.17239/jowr-2021.13.01.03 | * |
| iris.unpaywall.hosttype | publisher | * |
| iris.unpaywall.isoa | true | * |
| iris.unpaywall.journalisindoaj | true | * |
| iris.unpaywall.landingpage | https://doi.org/10.17239/jowr-2021.13.01.03 | * |
| iris.unpaywall.license | cc-by-nc-nd | * |
| iris.unpaywall.metadataCallLastModified | 26/04/2026 07:32:12 | - |
| iris.unpaywall.metadataCallLastModifiedMillisecond | 1777181532735 | - |
| iris.unpaywall.oastatus | gold | * |
| isi.authority.ancejournal | JOURNAL OF WRITING RESEARCH###2030-1006 | * |
| isi.authority.sdg | Goal 4: Quality education###12084 | * |
| isi.category | HA | * |
| isi.contributor.affiliation | University of Pisa | - |
| isi.contributor.affiliation | Ist Linguist Computaz A Zampolli ILC CNR | - |
| isi.contributor.affiliation | Ist Linguist Computaz A Zampolli ILC CNR | - |
| isi.contributor.country | Italy | - |
| isi.contributor.country | Italy | - |
| isi.contributor.country | Italy | - |
| isi.contributor.name | Alessio | - |
| isi.contributor.name | Dominique | - |
| isi.contributor.name | Felice | - |
| isi.contributor.researcherId | GCD-5321-2022 | - |
| isi.contributor.researcherId | MCK-5206-2025 | - |
| isi.contributor.researcherId | AAX-1864-2020 | - |
| isi.contributor.subaffiliation | Dipartimento Informat | - |
| isi.contributor.subaffiliation | ItaliaNLP Lab | - |
| isi.contributor.subaffiliation | ItaliaNLP Lab | - |
| isi.contributor.surname | Miaschi | - |
| isi.contributor.surname | Brunato | - |
| isi.contributor.surname | Dell'Orletta | - |
| isi.date.issued | 2021 | * |
| isi.description.allpeopleoriginal | Miaschi, A; Brunato, D; Dell'Orletta, F; | * |
| isi.document.sourcetype | WOS.ESCI | * |
| isi.document.type | Article | * |
| isi.document.types | Article | * |
| isi.identifier.doi | 10.17239/jowr-2021.13.01.03 | * |
| isi.identifier.eissn | 2294-3307 | * |
| isi.identifier.isi | WOS:000659987400003 | * |
| isi.journal.journaltitle | JOURNAL OF WRITING RESEARCH | * |
| isi.journal.journaltitleabbrev | J WRIT RES | * |
| isi.language.original | English | * |
| isi.publisher.place | CAMPUS GROENENBORGER, 171 GROENENBORGERLAAN, ANTWERP, 2020, BELGIUM | * |
| isi.relation.firstpage | 71 | * |
| isi.relation.issue | 1 | * |
| isi.relation.lastpage | 105 | * |
| isi.relation.volume | 13 | * |
| isi.title | A NLP-based stylometric approach for tracking the evolution of L1 written language competence | * |
| scopus.authority.ancejournal | JOURNAL OF WRITING RESEARCH###2030-1006 | * |
| scopus.category | 1203 | * |
| scopus.category | 3304 | * |
| scopus.category | 3310 | * |
| scopus.category | 1208 | * |
| scopus.contributor.affiliation | ItaliaNLP Lab | - |
| scopus.contributor.affiliation | Universita di Pisa | - |
| scopus.contributor.affiliation | Universita di Pisa | - |
| scopus.contributor.afid | 60008941 | - |
| scopus.contributor.afid | 60028868 | - |
| scopus.contributor.afid | 60028868 | - |
| scopus.contributor.auid | 57211678681 | - |
| scopus.contributor.auid | 55237740200 | - |
| scopus.contributor.auid | 57540567000 | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | - | |
| scopus.contributor.country | - | |
| scopus.contributor.dptid | 114087935 | - |
| scopus.contributor.dptid | - | |
| scopus.contributor.dptid | - | |
| scopus.contributor.name | Alessio | - |
| scopus.contributor.name | Dominique | - |
| scopus.contributor.name | Felice | - |
| scopus.contributor.subaffiliation | Istituto di Linguistica Computazionale “A. Zampolli” (ILC-CNR); | - |
| scopus.contributor.subaffiliation | Dipartimento di Informatica; | - |
| scopus.contributor.subaffiliation | Dipartimento di Informatica; | - |
| scopus.contributor.surname | Miaschi | - |
| scopus.contributor.surname | Brunato | - |
| scopus.contributor.surname | Dell'Orletta | - |
| scopus.date.issued | 2021 | * |
| scopus.description.allpeopleoriginal | Miaschi A.; Brunato D.; Dell'Orletta F. | * |
| scopus.differences | scopus.subject.keywords | * |
| scopus.differences | scopus.description.allpeopleoriginal | * |
| scopus.differences | scopus.relation.issue | * |
| scopus.differences | scopus.relation.volume | * |
| scopus.document.type | ar | * |
| scopus.document.types | ar | * |
| scopus.identifier.doi | 10.17239/JOWR-2021.13.01.03 | * |
| scopus.identifier.eissn | 2294-3307 | * |
| scopus.identifier.pui | 2012939152 | * |
| scopus.identifier.scopus | 2-s2.0-85108566169 | * |
| scopus.journal.sourceid | 21100217021 | * |
| scopus.language.iso | eng | * |
| scopus.publisher.name | University of Antwerp | * |
| scopus.relation.firstpage | 71 | * |
| scopus.relation.issue | 1 | * |
| scopus.relation.lastpage | 105 | * |
| scopus.relation.volume | 13 | * |
| scopus.subject.keywords | Diachronic Evolution of Written Language Competence; Italian Learner Corpus; Learners' errors; Machine Learning; Natural Language Processing; Stylometry; | * |
| scopus.title | A NLP-based stylometric approach for tracking the evolution of L1 written language competence | * |
| scopus.titleeng | A NLP-based stylometric approach for tracking the evolution of L1 written language competence | * |
| Appears in types: | 01.01 Articolo in rivista | |

| File | Size | Format | |
|---|---|---|---|
| JoWR_2021_vol13_nr1_Miaschi_et_al (7).pdf (open access; type: Publisher's version (PDF); license: Creative Commons) | 364.16 kB | Adobe PDF | View/Open |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


