The Italian Roots in Australian Soil (IRIAS) multilingual speech corpus. Speech variation in two generations of Italo-Australians
Galatà, Vincenzo; Avesani, Cinzia
2022
Abstract
We present and describe the Italian Roots in Australian Soil (IRIAS) speech corpus. Following a sociophonetic approach, our aim is to extend and complement the frequently investigated macro-structures of lexical, syntactic and morphological interactions among immigrants' languages and common sociolinguistic investigations about immigrants' language attitudes. We first discuss and motivate the creation of the IRIAS corpus. We then focus on the specific methodological issues we addressed in compiling a corpus of natural spontaneous speech collected in Veneto or Calabrese dialects, Italian and English from first- and second-generation Italo-Australian speakers originating from two specific regions in Italy (Veneto and Calabria). A detailed description of the IRIAS corpus follows, including its design, collection procedure and processing. The latter focuses on novel manual and automatic solutions we implemented to overcome the challenging dearth of existing resources. These solutions help advance work on spontaneous speech data. We conclude by providing some insights on what has been achieved thus far as well as the analyses currently being carried out on subsets of the IRIAS corpus.
File: 2021_The_Italian_Roots_In_Australian_Soil_IRIAS_multili.pdf
Access: authorized users only
Description: Galatà, V., Avesani, C., Best, C.T. et al. The Italian Roots in Australian Soil (IRIAS) multilingual speech corpus. Speech variation in two generations of Italo-Australians. Lang Resources & Evaluation 56, 37–78 (2022). https://doi.org/10.1007/s10579-021-09539-3
Type: Publisher's version (PDF)
License: Not public (private/restricted access)
Size: 840.88 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.