In this study, we used a mixed-effects logistic regression model in combination with generalized additive logistic modeling to predict lexical differences in Tuscan dialects with respect to standard Italian. We used lexical information for 170 concepts in 213 locations in Tuscany. Although geographical position is an important predictor with locations distant from Florence having lexical forms more likely to differ from standard Italian, several other factors emerged as significant. The model predicts that lexical variants used by older speakers and in smaller as well as poorer communities are more likely to differ from standard Italian. The impact of the demographic variables, however, varied from concept to concept. For a majority of concepts, smaller and poorer communities have lexical forms different from standard Italian. For a smaller minority of concepts, however, larger and richer communities have lexical forms different from standard Italian. Similarly, the effect of speaker age and the average community age also varied per concept. While not significant as a fixed effect, the concept frequency showed significant geographical variation. These results clearly identify important factors involved in dialect variation at the lexical level. In addition, this study illustrates the usefulness of mixed-effects regression techniques together with generalized additive modeling for analyzing lexical dialect data.

Definizione di un modello computazionale della variazione dialettale basato sull'integrazione di fattori socio-demografici e geografici

Simonetta Montemagni;
2011

Abstract

In this study, we used a mixed-effects logistic regression model in combination with generalized additive logistic modeling to predict lexical differences in Tuscan dialects with respect to standard Italian. We used lexical information for 170 concepts in 213 locations in Tuscany. Although geographical position is an important predictor with locations distant from Florence having lexical forms more likely to differ from standard Italian, several other factors emerged as significant. The model predicts that lexical variants used by older speakers and in smaller as well as poorer communities are more likely to differ from standard Italian. The impact of the demographic variables, however, varied from concept to concept. For a majority of concepts, smaller and poorer communities have lexical forms different from standard Italian. For a smaller minority of concepts, however, larger and richer communities have lexical forms different from standard Italian. Similarly, the effect of speaker age and the average community age also varied per concept. While not significant as a fixed effect, the concept frequency showed significant geographical variation. These results clearly identify important factors involved in dialect variation at the lexical level. In addition, this study illustrates the usefulness of mixed-effects regression techniques together with generalized additive modeling for analyzing lexical dialect data.
Campo DC Valore Lingua
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC -
dc.authority.people Simonetta Montemagni it
dc.authority.people Martijn Wieling it
dc.collection.id.s 95773a9f-8d06-4466-a951-5d4e15d70690 *
dc.collection.name 08.04 Rapporto tecnico *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.date.accessioned 2024/02/21 03:02:23 -
dc.date.available 2024/02/21 03:02:23 -
dc.date.issued 2011 -
dc.description.abstracteng In this study, we used a mixed-effects logistic regression model in combination with generalized additive logistic modeling to predict lexical differences in Tuscan dialects with respect to standard Italian. We used lexical information for 170 concepts in 213 locations in Tuscany. Although geographical position is an important predictor with locations distant from Florence having lexical forms more likely to differ from standard Italian, several other factors emerged as significant. The model predicts that lexical variants used by older speakers and in smaller as well as poorer communities are more likely to differ from standard Italian. The impact of the demographic variables, however, varied from concept to concept. For a majority of concepts, smaller and poorer communities have lexical forms different from standard Italian. For a smaller minority of concepts, however, larger and richer communities have lexical forms different from standard Italian. Similarly, the effect of speaker age and the average community age also varied per concept. While not significant as a fixed effect, the concept frequency showed significant geographical variation. These results clearly identify important factors involved in dialect variation at the lexical level. In addition, this study illustrates the usefulness of mixed-effects regression techniques together with generalized additive modeling for analyzing lexical dialect data. -
dc.description.affiliations CNR-ILC, Pisa; Department of Humanities Computing, University of Groningen -
dc.description.allpeople Montemagni, Simonetta; Wieling, Martijn -
dc.description.allpeopleoriginal Simonetta Montemagni, Martijn Wieling, -
dc.description.fulltext none en
dc.description.note ID_PUMA: cnr.ilc/2011-TR-008 -
dc.description.numberofauthors 2 -
dc.identifier.uri https://hdl.handle.net/20.500.14243/177115 -
dc.language.iso eng -
dc.subject.keywords Dialettologia toscana -
dc.subject.keywords Dialettometria -
dc.subject.keywords variazione lessicale -
dc.subject.singlekeyword Dialettologia toscana *
dc.subject.singlekeyword Dialettometria *
dc.subject.singlekeyword variazione lessicale *
dc.title Definizione di un modello computazionale della variazione dialettale basato sull'integrazione di fattori socio-demografici e geografici en
dc.type.driver info:eu-repo/semantics/other -
dc.type.full 08 Report e Working Paper::08.04 Rapporto tecnico it
dc.type.miur -2.0 -
dc.ugov.descaux1 206506 -
iris.orcid.lastModifiedDate 2024/04/04 11:28:23 *
iris.orcid.lastModifiedMillisecond 1712222903599 *
iris.sitodocente.maxattempts 1 -
Appare nelle tipologie: 08.04 Rapporto tecnico
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/177115
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact