The Italian Roots in Australian Soil (IRIAS) multilingual speech corpus. Speech variation in two generations of Italo-Australians

Galatà, Vincenzo; Avesani, Cinzia
2022

Abstract

We present and describe the Italian Roots in Australian Soil (IRIAS) speech corpus. Following a sociophonetic approach, our aim is to extend and complement the frequently investigated macro-structures of lexical, syntactic and morphological interactions among immigrants' languages and common sociolinguistic investigations about immigrants' language attitudes. We first discuss and motivate the creation of the IRIAS corpus. We then focus on the specific methodological issues we addressed in compiling a corpus of natural spontaneous speech collected in Veneto or Calabrese dialects, Italian and English from first- and second-generation Italo-Australian speakers originating from two specific regions in Italy (Veneto and Calabria). A detailed description of the IRIAS corpus follows, including its design, collection procedure and processing. The latter focuses on novel manual and automatic solutions we implemented to overcome the challenging dearth of existing resources. These solutions help advance work on spontaneous speech data. We conclude by providing some insights on what has been achieved thus far as well as the analyses currently being carried out on subsets of the IRIAS corpus.
Istituto di Scienze e Tecnologie della Cognizione - ISTC - Sede Secondaria Padova
Annotation
Automatic transcription
Forced alignment
Italo-Australian community
Language change
Multilingual speech resource
Sociophonetics
Speech corpus compilation
Files in this item:
File: 2021_The_Italian_Roots_In_Australian_Soil_IRIAS_multili.pdf
Access: authorized users only
Description: Galatà, V., Avesani, C., Best, C.T. et al. The Italian Roots in Australian Soil (IRIAS) multilingual speech corpus. Speech variation in two generations of Italo-Australians. Lang Resources & Evaluation 56, 37–78 (2022). https://doi.org/10.1007/s10579-021-09539-3
Type: Publisher's version (PDF)
License: Not public (private/restricted access)
Size: 840.88 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/397826