CNR Institutional Research Information System

That Looks Hard: Characterizing Linguistic Complexity in Humans and Language Models

Sarti G; Brunato D; Dell'Orletta F

2021

Abstract

This paper investigates the relationship between two complementary perspectives in the human assessment of sentence complexity and how they are modeled in a neural language model (NLM). The first perspective takes into account multiple online behavioral metrics obtained from eye-tracking recordings. The second one concerns the offline perception of complexity measured by explicit human judgments. Using a broad spectrum of linguistic features modeling lexical, morpho-syntactic, and syntactic properties of sentences, we perform a comprehensive analysis of linguistic phenomena associated with the two complexity viewpoints and report similarities and differences. We then show the effectiveness of linguistic features when explicitly leveraged by a regression model for predicting sentence complexity and compare its results with the ones obtained by a fine-tuned neural language model. We finally probe the NLM's linguistic competence before and after fine-tuning, highlighting how linguistic information encoded in representations changes when the model learns to predict complexity.

Full record (DC)

DC field	Value	Language
dc.authority.orgunit	Istituto di linguistica computazionale "Antonio Zampolli" - ILC	-
dc.authority.people	Sarti G	it
dc.authority.people	Brunato D	it
dc.authority.people	Dell'Orletta F	it
dc.collection.id.s	71c7200a-7c5f-4e83-8d57-d3d2ba88f40d	*
dc.collection.name	04.01 Contributo in Atti di convegno	*
dc.contributor.appartenenza	Istituto di linguistica computazionale "Antonio Zampolli" - ILC	*
dc.contributor.appartenenza.mi	918	*
dc.date.accessioned	2024/02/20 13:12:51	-
dc.date.available	2024/02/20 13:12:51	-
dc.date.issued	2021	-
dc.description.abstracteng	This paper investigates the relationship between two complementary perspectives in the human assessment of sentence complexity and how they are modeled in a neural language model (NLM). The first perspective takes into account multiple online behavioral metrics obtained from eye-tracking recordings. The second one concerns the offline perception of complexity measured by explicit human judgments. Using a broad spectrum of linguistic features modeling lexical, morpho-syntactic, and syntactic properties of sentences, we perform a comprehensive analysis of linguistic phenomena associated with the two complexity viewpoints and report similarities and differences. We then show the effectiveness of linguistic features when explicitly leveraged by a regression model for predicting sentence complexity and compare its results with the ones obtained by a fine-tuned neural language model. We finally probe the NLM's linguistic competence before and after fine-tuning, highlighting how linguistic information encoded in representations changes when the model learns to predict complexity.	-
dc.description.affiliations	University of Trieste, International School for Advanced Studies (SISSA), Trieste, Istituto di Linguistica Computazionale "Antonio Zampolli" (ILC-CNR), Pisa	-
dc.description.allpeople	Sarti G; Brunato D; Dell'Orletta F	-
dc.description.allpeopleoriginal	Sarti G, Brunato D, Dell'Orletta F	-
dc.description.fulltext	none	en
dc.description.numberofauthors	3	-
dc.identifier.doi	10.18653/v1/2021.cmcl-1.5	-
dc.identifier.isbn	978-1-954085-35-0	-
dc.identifier.uri	https://hdl.handle.net/20.500.14243/440173	-
dc.identifier.url	https://aclanthology.org/2021.cmcl-1.5	-
dc.language.iso	eng	-
dc.relation.conferencedate	10/06/2021	-
dc.relation.conferencename	Proceedings of Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2021)	-
dc.relation.firstpage	48	-
dc.relation.lastpage	60	-
dc.subject.keywords	linguistic complexity	-
dc.subject.keywords	eyetracking	-
dc.subject.keywords	human evaluation	-
dc.subject.singlekeyword	linguistic complexity	*
dc.subject.singlekeyword	eyetracking	*
dc.subject.singlekeyword	human evaluation	*
dc.title	That Looks Hard: Characterizing Linguistic Complexity in Humans and Language Models	en
dc.type.driver	info:eu-repo/semantics/conferenceObject	-
dc.type.full	04 Contributo in convegno::04.01 Contributo in Atti di convegno	it
dc.type.miur	273	-
dc.type.referee	Yes, but type not specified	-
dc.ugov.descaux1	464972	-
iris.orcid.lastModifiedDate	2024/03/02 04:09:20	*
iris.orcid.lastModifiedMillisecond	1709348960784	*
iris.scopus.extIssued	2021	-
iris.scopus.extTitle	That Looks Hard: Characterizing Linguistic Complexity in Humans and Language Models	-
iris.sitodocente.maxattempts	1	-
iris.unpaywall.bestoaversion	publishedVersion	*
iris.unpaywall.doi	10.18653/v1/2021.cmcl-1.5	*
iris.unpaywall.isoa	true	*
iris.unpaywall.landingpage	https://doi.org/10.18653/v1/2021.cmcl-1.5	*
iris.unpaywall.license	cc-by	*
iris.unpaywall.metadataCallLastModified	21/12/2025 05:36:41	-
iris.unpaywall.metadataCallLastModifiedMillisecond	1766291801559	-
iris.unpaywall.oastatus	gold	*
iris.unpaywall.pdfurl	https://aclanthology.org/2021.cmcl-1.5.pdf	*
Appears in types:	04.01 Contributo in Atti di convegno (contribution in conference proceedings)

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/440173
