Two Vocoder Techniques for Neutral to Emotional Timbre Conversion

Tesser F; Cosi P
2010

Abstract

In this paper, we describe the application of two vocoder techniques to a spectral envelope transformation experiment. We processed speech recorded in a neutral, standard reading style in order to reproduce the spectral shapes of two emotional speaking styles: happy and sad. This was achieved by means of conversion functions that operate in the frequency domain and are trained on aligned source-target pairs of spectral features. The first vocoder is based on the source-filter model of speech production and exploits the Mel Log Spectral Approximation filter, while the second is the Phase vocoder. Objective distance measures were calculated to evaluate how effectively the conversion framework predicts the target spectral envelopes. Subjective listening tests also contributed to the evaluation.
Istituto di Scienze e Tecnologie della Cognizione - ISTC
English
Yoshinori Sagisaka and Keiichi Tokuda (Eds.)
Proceedings of 7th Speech Synthesis Workshop (SSW)
SSW7 (7th ISCA Workshop on Speech Synthesis)
130
135
384
http://www.researchgate.net/publication/228722990_Two_Vocoder_Techniques_for_Neutral_to_Emotional_Timbre_Conversion/file/79e4150a363d77778a.pdf
Yes, but type not specified
22-24 September 2010
ATR, Kyoto, Japan
Emotional Speech
Spectral Transformation
Phase Vocoder
MLSA filter
GMM
Proceedings of SSW7 (7th ISCA Workshop on Speech Synthesis) http://isw3.naist.jp/~tomoki/ssw7/www/doc/ssw7_proceedings_rev.pdf
4
none
Tesser, F; Zovato, E; Nicolao, M; Cosi, P
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/130358