CNR Institutional Research Information System


The Italian Roots in Australian Soil (IRIAS) multilingual speech corpus. Speech variation in two generations of Italo-Australians

Galatà, Vincenzo; Avesani, Cinzia; Best, Catherine T.; Di Biase, Bruno; Vayra, Mario

2022

Abstract

We present and describe the Italian Roots in Australian Soil (IRIAS) speech corpus. Following a sociophonetic approach, our aim is to extend and complement the frequently investigated macro-structures of lexical, syntactic and morphological interactions among immigrants' languages and common sociolinguistic investigations about immigrants' language attitudes. We first discuss and motivate the creation of the IRIAS corpus. We then focus on the specific methodological issues we addressed in compiling a corpus of natural spontaneous speech collected in Veneto or Calabrese dialects, Italian and English from first- and second-generation Italo-Australian speakers originating from two specific regions in Italy (Veneto and Calabria). A detailed description of the IRIAS corpus follows, including its design, collection procedure and processing. The latter focuses on novel manual and automatic solutions we implemented to overcome the challenging dearth of existing resources. These solutions help advance work on spontaneous speech data. We conclude by providing some insights on what has been achieved thus far as well as the analyses currently being carried out on subsets of the IRIAS corpus.


Year: 2022

Organizational units: Istituto di Scienze e Tecnologie della Cognizione - ISTC - Sede Secondaria Padova

Keywords:
Annotation
Automatic transcription
Forced alignment
Italo-Australian community
Language change
Multilingual speech resource
Sociophonetics
Speech corpus compilation

Appears in types: 01.01 Journal article

Files in this item:

File	Size	Format
2021_The_Italian_Roots_In_Australian_Soil_IRIAS_multili.pdf (authorized users only)	840.88 kB	Adobe PDF

Description: Galatà, V., Avesani, C., Best, C.T. et al. The Italian Roots in Australian Soil (IRIAS) multilingual speech corpus. Speech variation in two generations of Italo-Australians. Lang Resources & Evaluation 56, 37–78 (2022). https://doi.org/10.1007/s10579-021-09539-3
Type: Publisher's version (PDF)
License: Not public - private/restricted access

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/397826
