The Role of Eye-Tracking Data in Encoder-Based Models: an In-depth Linguistic Analysis
Lucia Domenichelli;Luca Dini;Dominique Brunato;Felice Dell'Orletta
2025
Abstract
This paper contributes to ongoing research aimed at making neural language models more interpretable to humans by incorporating physiological data. Specifically, we leverage eye-tracking data collected during reading to explore how such information can guide model behavior. We train a multilingual encoder model to predict eye-tracking features from the Multilingual Eye-tracking Corpus (MECO) and analyze the resulting shifts in the model's attention patterns, focusing on how attention redistributes after fine-tuning across linguistically informed categories such as part of speech, word position, word length, and distance from the syntactic head. Moreover, we test how this attention shift affects the representation of the words involved in the embedding space. The study covers both Italian and English, enabling a cross-linguistic perspective on attention and representation shifts in multilingual encoders grounded in human reading behavior.

| DC Field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | en |
| dc.authority.people | Lucia Domenichelli | en |
| dc.authority.people | Luca Dini | en |
| dc.authority.people | Dominique Brunato | en |
| dc.authority.people | Felice Dell'Orletta | en |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contributo in Atti di convegno | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.contributor.area | Non assegn | * |
| dc.contributor.area | Non assegn | * |
| dc.contributor.area | Non assegn | * |
| dc.date.accessioned | 2026/03/03 15:18:26 | - |
| dc.date.available | 2026/03/03 15:18:26 | - |
| dc.date.firstsubmission | 2026/03/02 18:37:29 | * |
| dc.date.issued | 2025 | - |
| dc.date.submission | 2026/03/02 18:37:29 | * |
| dc.description.allpeople | Domenichelli, Lucia; Dini, Luca; Brunato, Dominique; Dell'Orletta, Felice | - |
| dc.description.allpeopleoriginal | Lucia Domenichelli, Luca Dini, Dominique Brunato, Felice Dell'Orletta | en |
| dc.description.fulltext | open | en |
| dc.description.numberofauthors | 4 | - |
| dc.identifier.source | manual | * |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/570463 | - |
| dc.language.iso | eng | en |
| dc.relation.ispartofbook | Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025), 24-26 September 2025, Cagliari, Italy. | en |
| dc.subject.keywordseng | Eye-tracking, Neural Attention, Multilingual models, Embedding space, Interpretability | - |
| dc.subject.singlekeyword | Eye-tracking | * |
| dc.subject.singlekeyword | Neural Attention | * |
| dc.subject.singlekeyword | Multilingual models | * |
| dc.subject.singlekeyword | Embedding space | * |
| dc.subject.singlekeyword | Interpretability | * |
| dc.title | The Role of Eye-Tracking Data in Encoder-Based Models: an In-depth Linguistic Analysis | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Contributo in convegno::04.01 Contributo in Atti di convegno | it |
| dc.type.miur | 273 | - |
| iris.mediafilter.data | 2026/03/04 02:52:21 | * |
| iris.orcid.lastModifiedDate | 2026/03/03 15:18:26 | * |
| iris.orcid.lastModifiedMillisecond | 1772547506148 | * |
| iris.sitodocente.maxattempts | 1 | - |

Appears in types: 04.01 Contributo in Atti di convegno (contribution in conference proceedings)

| File | Size | Format |
|---|---|---|
| 40_main_long.pdf (open access; license: Creative Commons) | 913.53 kB | Adobe PDF |


