Statistical Spectral Envelope Transformation applied to Emotional Speech

Fabio Tesser;Piero Cosi
2010

Abstract

Transformation of sound by statistical techniques is a promising method for a new range of digital audio effects. In this paper, a data-driven voice transformation algorithm is used to alter the timbre of a neutral (non-emotional) voice in order to reproduce a particular emotional vocal timbre. Perceptually based Mel-Cepstral analysis and a Mel Log Spectral Approximation (MLSA) digital filter are used to represent the speech timbre and to synthesize speech with a modified spectral envelope. The transformation function adopts a GMM (Gaussian Mixture Model) based parametrization in order to convert the spectral envelopes. Experiments with the first and second order derivatives of the mel-cepstral coefficients have been undertaken to assess the benefit of including dynamic information in the model. The proposed algorithm has been evaluated by means of objective measures in the neutral-to-happy and neutral-to-sad tasks.
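
As an illustration of what a GMM-based conversion function can look like, the sketch below implements the widely used joint-density GMM regression, in which a converted frame is the posterior-weighted sum of per-component linear regressions. This record does not give the paper's exact formulation, so the diagonal-covariance simplification, the function names (`log_gauss_diag`, `convert_frame`) and all numeric values are assumptions made for illustration only.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log-density of a Gaussian with diagonal covariance, evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def convert_frame(x, weights, mu_x, mu_y, var_xx, cov_yx):
    """Map one source frame x (e.g. neutral mel-cepstrum) to a target frame.

    Per-component parameters (all hypothetical, assumed already trained):
      weights[m] : mixture weight
      mu_x[m]    : source mean vector
      mu_y[m]    : target mean vector
      var_xx[m]  : diagonal source variances
      cov_yx[m]  : diagonal target/source cross-covariances
    """
    # Posterior probability of each component given the source frame,
    # computed in the log domain for numerical stability.
    log_p = [
        math.log(w) + log_gauss_diag(x, mx, vx)
        for w, mx, vx in zip(weights, mu_x, var_xx)
    ]
    peak = max(log_p)
    unnorm = [math.exp(lp - peak) for lp in log_p]
    total = sum(unnorm)
    post = [u / total for u in unnorm]

    # Posterior-weighted sum of per-component linear regressions.
    y = [0.0] * len(mu_y[0])
    for p, mx, my, vx, cyx in zip(post, mu_x, mu_y, var_xx, cov_yx):
        for d in range(len(y)):
            y[d] += p * (my[d] + cyx[d] / vx[d] * (x[d] - mx[d]))
    return y


# Toy two-component model over 2-D features (illustrative numbers only).
weights = [0.5, 0.5]
mu_x = [[-1.0, 0.0], [1.0, 0.0]]
mu_y = [[-2.0, 0.5], [2.0, -0.5]]
var_xx = [[1.0, 1.0], [1.0, 1.0]]
cov_yx = [[0.3, 0.1], [0.3, 0.1]]
print(convert_frame([0.2, -0.1], weights, mu_x, mu_y, var_xx, cov_yx))
```

In a setup that includes dynamic information, the frame vectors would typically be extended with first and second order derivatives before training and conversion; that step is omitted here.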
Istituto di Scienze e Tecnologie della Cognizione - ISTC
English
Hannes Pomberger, Franz Zotter and Alois Sontacchi
Proceedings of DAFx-10 13th International Conference on Digital Audio Effects
Pages 479-482 (4 pages)
978-3-200-01940-9
http://www.scopus.com/record/display.url?eid=2-s2.0-84872703286&origin=inward
Helmut Schmidt University - University of the Federal Armed Forces
Hamburg
GERMANY
Voice Conversion
Spectral Envelopes
Mel-Cepstral Analysis
GMM
Udo Zölzer, Helmut Schmidt University - University of the Federal Armed Forces, Hamburg, Germany. DOI: 10.1002/9781119991298.fmatter. Copyright © 2011 John Wiley & Sons, Ltd. Published online: 10 MAR 2011. Published in print: 11 MAR 2011. Print ISBN: 9780470665992. Online ISBN: 9781119991298.
3
02 Contribution in Volume::02.01 Contribution in volume (Chapter or Essay)
268
none
Tesser, Fabio; Zovato, Enrico; Cosi, Piero
info:eu-repo/semantics/bookPart
   Adaptive Strategies for Sustainable Long-Term Social Interaction
   ALIZ-E
   FP7
   248116

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/130357