Guessing the age of acquisition of Italian lemmas through linear regression

Irene Russo
2020

Abstract

The age of acquisition (AoA) of a word is a psycholinguistic variable concerning the age at which a word is typically learned. It correlates with other psycholinguistic variables such as familiarity, concreteness, and imageability. Existing datasets for multiple languages also include linguistic variables such as the length and the frequency of lemmas in different corpora. There are substantial sets of normative values for English, but for other languages, such as Italian, the coverage is scarce. In this paper, a set of regression experiments investigates whether it is possible to guess the age of acquisition of Italian lemmas that have not been previously rated by humans. An intrinsic evaluation is proposed, correlating the estimated AoA of Italian lemmas with the AoA of English lemmas. An extrinsic evaluation, using AoA values as features for the classification of literary excerpts labeled by age appropriateness, shows how essential lexical coverage is for this task.
Istituto di linguistica computazionale "Antonio Zampolli" - ILC
lexical complexity, computational psycholinguistics
Files in this record:
2020.cmcl-1.5(2).pdf
Access: open access
Type: Publisher's version (PDF)
License: Creative Commons
Size: 173.3 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/505921