That Looks Hard: Characterizing Linguistic Complexity in Humans and Language Models
Sarti G; Brunato D; Dell'Orletta F
2021
Abstract
This paper investigates the relationship between two complementary perspectives in the human assessment of sentence complexity and how they are modeled in a neural language model (NLM). The first perspective takes into account multiple online behavioral metrics obtained from eye-tracking recordings. The second one concerns the offline perception of complexity measured by explicit human judgments. Using a broad spectrum of linguistic features modeling lexical, morpho-syntactic, and syntactic properties of sentences, we perform a comprehensive analysis of linguistic phenomena associated with the two complexity viewpoints and report similarities and differences. We then show the effectiveness of linguistic features when explicitly leveraged by a regression model for predicting sentence complexity and compare its results with the ones obtained by a fine-tuned neural language model. We finally probe the NLM's linguistic competence before and after fine-tuning, highlighting how linguistic information encoded in representations changes when the model learns to predict complexity.

| DC Field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | - |
| dc.authority.people | Sarti G | it |
| dc.authority.people | Brunato D | it |
| dc.authority.people | Dell'Orletta F | it |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contribution in conference proceedings | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/02/20 13:12:51 | - |
| dc.date.available | 2024/02/20 13:12:51 | - |
| dc.date.issued | 2021 | - |
| dc.description.abstracteng | This paper investigates the relationship between two complementary perspectives in the human assessment of sentence complexity and how they are modeled in a neural language model (NLM). The first perspective takes into account multiple online behavioral metrics obtained from eye-tracking recordings. The second one concerns the offline perception of complexity measured by explicit human judgments. Using a broad spectrum of linguistic features modeling lexical, morpho-syntactic, and syntactic properties of sentences, we perform a comprehensive analysis of linguistic phenomena associated with the two complexity viewpoints and report similarities and differences. We then show the effectiveness of linguistic features when explicitly leveraged by a regression model for predicting sentence complexity and compare its results with the ones obtained by a fine-tuned neural language model. We finally probe the NLM's linguistic competence before and after fine-tuning, highlighting how linguistic information encoded in representations changes when the model learns to predict complexity. | - |
| dc.description.affiliations | University of Trieste, International School for Advanced Studies (SISSA), Trieste, Istituto di Linguistica Computazionale "Antonio Zampolli" (ILC-CNR), Pisa | - |
| dc.description.allpeople | Sarti G; Brunato D; Dell'Orletta F | - |
| dc.description.allpeopleoriginal | Sarti G, Brunato D, Dell'Orletta F | - |
| dc.description.fulltext | none | en |
| dc.description.numberofauthors | 2 | - |
| dc.identifier.doi | 10.18653/v1/2021.cmcl-1.5 | - |
| dc.identifier.isbn | 978-1-954085-35-0 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/440173 | - |
| dc.identifier.url | https://aclanthology.org/2021.cmcl-1.5 | - |
| dc.language.iso | eng | - |
| dc.relation.conferencedate | 10/06/2021 | - |
| dc.relation.conferencename | Proceedings of Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2021) | - |
| dc.relation.firstpage | 48 | - |
| dc.relation.lastpage | 60 | - |
| dc.subject.keywords | linguistic complexity | - |
| dc.subject.keywords | eyetracking | - |
| dc.subject.keywords | human evaluation | - |
| dc.subject.singlekeyword | linguistic complexity | * |
| dc.subject.singlekeyword | eyetracking | * |
| dc.subject.singlekeyword | human evaluation | * |
| dc.title | That Looks Hard: Characterizing Linguistic Complexity in Humans and Language Models | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Conference contribution::04.01 Contribution in conference proceedings | it |
| dc.type.miur | 273 | - |
| dc.type.referee | Yes, but type not specified | - |
| dc.ugov.descaux1 | 464972 | - |
| iris.orcid.lastModifiedDate | 2024/03/02 04:09:20 | * |
| iris.orcid.lastModifiedMillisecond | 1709348960784 | * |
| iris.scopus.extIssued | 2021 | - |
| iris.scopus.extTitle | That Looks Hard: Characterizing Linguistic Complexity in Humans and Language Models | - |
| iris.sitodocente.maxattempts | 1 | - |
| iris.unpaywall.bestoaversion | publishedVersion | * |
| iris.unpaywall.doi | 10.18653/v1/2021.cmcl-1.5 | * |
| iris.unpaywall.isoa | true | * |
| iris.unpaywall.landingpage | https://doi.org/10.18653/v1/2021.cmcl-1.5 | * |
| iris.unpaywall.license | cc-by | * |
| iris.unpaywall.metadataCallLastModified | 21/12/2025 05:36:41 | - |
| iris.unpaywall.metadataCallLastModifiedMillisecond | 1766291801559 | - |
| iris.unpaywall.oastatus | gold | * |
| iris.unpaywall.pdfurl | https://aclanthology.org/2021.cmcl-1.5.pdf | * |
| Appears in types: | 04.01 Contribution in conference proceedings | |


