The Role of Eye-Tracking Data in Encoder-Based Models: an In-depth Linguistic Analysis

Lucia Domenichelli;Luca Dini;Dominique Brunato;Felice Dell'Orletta
2025

Abstract

This paper contributes to ongoing research on making neural language models more interpretable to humans by incorporating physiological data. Specifically, we use eye-tracking data collected during reading to explore how such information can guide model behavior. We train a multilingual encoder model to predict eye-tracking features from the Multilingual Eye-tracking Corpus (MECO) and analyze the resulting shifts in the model's attention patterns, focusing on how attention redistributes after fine-tuning across linguistically informed categories such as part of speech, word position, word length, and distance from the syntactic head. We also test how this attention shift impacts the representation of the words concerned in the embedding space. The study covers both Italian and English, enabling a cross-linguistic perspective on attention and representation shifts in multilingual encoders grounded in human reading behavior.
DC Field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.people Lucia Domenichelli en
dc.authority.people Luca Dini en
dc.authority.people Dominique Brunato en
dc.authority.people Felice Dell'Orletta en
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contribution in conference proceedings *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.contributor.area Non assegn *
dc.contributor.area Non assegn *
dc.contributor.area Non assegn *
dc.date.accessioned 2026/03/03 15:18:26 -
dc.date.available 2026/03/03 15:18:26 -
dc.date.firstsubmission 2026/03/02 18:37:29 *
dc.date.issued 2025 -
dc.date.submission 2026/03/02 18:37:29 *
dc.description.allpeople Domenichelli, Lucia; Dini, Luca; Brunato, Dominique; Dell'Orletta, Felice -
dc.description.allpeopleoriginal Lucia Domenichelli, Luca Dini, Dominique Brunato, Felice Dell'Orletta en
dc.description.fulltext open en
dc.description.numberofauthors 4 -
dc.identifier.source manual *
dc.identifier.uri https://hdl.handle.net/20.500.14243/570463 -
dc.language.iso eng en
dc.relation.ispartofbook Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025), 24-26 September 2025, Cagliari, Italy. en
dc.subject.keywordseng Eye-tracking, Neural Attention, Multilingual models, Embedding space, Interpretability -
dc.title The Role of Eye-Tracking Data in Encoder-Based Models: an In-depth Linguistic Analysis en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Conference contribution::04.01 Contribution in conference proceedings it
dc.type.miur 273 -
Appears in types: 04.01 Contribution in conference proceedings
Files in this item:
File: 40_main_long.pdf
Access: open access
License: Creative Commons
Size: 913.53 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/570463