CNR Institutional Research Information System

Transformation of sound by statistical techniques is a promising method for a new range of digital audio effects. In this paper a data driven voice transformation algorithm is used to alter the timbre of a neutral (non-emotional) voice in order to reproduce a particular emotional vocal timbre. Perceptually based Mel-Cepstral analysis and Mel Log Spectral Approximation digital filter are used to represent the speech timbre and to synthesize speech with modified spectral envelope. The transformation function adopts a GMM (Gaussian Mixture Model) based parametrization in order convert the spectral envelopes. Experiments with the first and second order derivatives of the mel-cepstral coefficients have been undertaken to prove the benefit of including dynamic information in the model. The proposed algorithm has been evaluated by means of objective measures in the neutral-to-happy and neutral-to-sad tasks.

Statistical Spectral Envelope Transformation applied to Emotional Speech

Fabio Tesser;Enrico Zovato;Piero Cosi

2010

Abstract

Transformation of sound by statistical techniques is a promising method for a new range of digital audio effects. In this paper a data driven voice transformation algorithm is used to alter the timbre of a neutral (non-emotional) voice in order to reproduce a particular emotional vocal timbre. Perceptually based Mel-Cepstral analysis and Mel Log Spectral Approximation digital filter are used to represent the speech timbre and to synthesize speech with modified spectral envelope. The transformation function adopts a GMM (Gaussian Mixture Model) based parametrization in order convert the spectral envelopes. Experiments with the first and second order derivatives of the mel-cepstral coefficients have been undertaken to prove the benefit of including dynamic information in the model. The proposed algorithm has been evaluated by means of objective measures in the neutral-to-happy and neutral-to-sad tasks.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2010
			
	Strutture organizzative
	
				Istituto di Scienze e Tecnologie della Cognizione - ISTC
			
	Codice ISBN
	
				978-3-200-01940-9
			
	Parole chiave
	
				Voice Conversion
Spectral Envelopes
Mel-Cepstral Analysis
GMM
			
	Appare nelle tipologie:
	
				02.01 Contributo in volume (Capitolo o Saggio)

File in questo prodotto:

Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/130357

Citazioni

ND

ND

ND

social impact