
Auditory Modelling for Speech Analysis and Recognition

Cosi P
1992

Abstract

Cochlear transformations of speech signals result in an auditory neural firing pattern significantly different from the spectrogram, a popular time-frequency-energy representation of speech. Phonetic features may correspond in a rather straightforward manner to the neural discharge pattern with which speech is coded by the auditory nerve. For these reasons, even an ear model that is just an approximation of physical reality appears to be a suitable system for identifying those aspects of the speech signal that are relevant for recognition. A recently developed joint Synchrony/Mean-Rate (S/M-R) Auditory Speech Processing (ASP) scheme [8] was successfully applied in speech recognition tasks, where promising results were obtained for speech segmentation and labelling [9]. Moreover, results reported elsewhere in the literature show that a combination of the same ASP scheme with multi-layer artificial neural networks produced an effective generalisation amongst speakers in classifying vowels both for English [1] and Italian [2]. The joint S/M-R ASP scheme will be very briefly described and its application to the problem of speech segmentation and labelling, both for clean and noisy speech, will be introduced and analysed.
Istituto di Scienze e Tecnologie della Cognizione - ISTC
0-471-93537-9
Auditory Modelling
DSP
Visual Representation
Speech Signal
Files in this record:
No files are associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/200198