The Role of Eye-Tracking Data in Encoder-Based Models: an In-depth Linguistic Analysis

Lucia Domenichelli;Luca Dini;Dominique Brunato;Felice Dell'Orletta
2025

Abstract

This paper falls within ongoing research aimed at enhancing the human interpretability of neural language models by incorporating physiological data. Specifically, we leverage eye-tracking data collected during reading to explore how such information can guide model behavior. We train a multilingual encoder model to predict eye-tracking features from the Multilingual Eye-tracking Corpus (MECO) and analyze the resulting shifts in model attention patterns, focusing on how attention redistributes after fine-tuning across linguistically informed categories such as part of speech, word position, word length, and distance from the syntactic head. Moreover, we test how this attention shift impacts the representation of the affected words in the embedding space. The study covers both Italian and English, enabling a cross-linguistic perspective on attention and representation shifts in multilingual encoders grounded in human reading behavior.
Istituto di linguistica computazionale "Antonio Zampolli" - ILC
Eye-tracking, Neural Attention, Multilingual models, Embedding space, Interpretability
Files in this record:
40_main_long.pdf (open access; license: Creative Commons; 913.53 kB; Adobe PDF)

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/570463