Connected Digit Recognition Experiments with the OGI Toolkit's Neural Network and HMM-Based Recognizers

Cosi P;
1998

Abstract

This paper describes a series of experiments that compare different approaches to training a speaker-independent continuous-speech digit recognizer using the CSLU Toolkit. Comparisons are made between the Hidden Markov Model (HMM) and Neural Network (NN) approaches. In addition, a description of the CSLU Toolkit research environment is given. The CSLU Toolkit is a research and development software environment that provides a powerful and flexible tool for creating and using spoken language systems for telephone and PC applications. In particular, the CSLU-HMM, the CSLU-NN, and the CSLU-FBNN development environments, with which our experiments were implemented, will be described in detail and recognition results will be compared. Our speech corpus is OGI 30K-Numbers, which is a collection of spontaneous ordinal and cardinal numbers, continuous digit strings and isolated digit strings. The utterances were recorded by having a large number of people recite their ZIP code, street address, or other numeric information over the telephone. This corpus represents a very noisy and difficult recognition task. Our best results (98% word recognition, 92% sentence recognition), obtained with the FBNN architecture, suggest the effectiveness of the CSLU Toolkit in building real-life speech recognition systems.
Istituto di Scienze e Tecnologie della Cognizione - ISTC
English
Proceedings of 4th IEEE Workshop on Interactive Voice Technology for Telecommunications Applications
Interactive Voice Technology for Telecommunications Applications, 1998. 1998 IEEE 4th Workshop IVTTA-ETWR '98
135
140
http://www.pd.istc.cnr.it/Papers/PieroCosi/cp-IVTTA98.pdf
Yes, but type not specified
29-30 September, 1998
Turin, Italy
Connected Digit Recognition
OGI Toolkit
Neural Network
HMM
Cosi P., Hosom J.P., Shalkwyk J., Sutton S., Cole R.A., "Connected Digit Recognition Experiments with the OGI Toolkit's Neural Network and HMM-Based Recognizers", Proceedings of Interactive Voice Technology for Telecommunications Applications, 1998. 1998 IEEE 4th Workshop IVTTA-ETWR '98, Turin, Italy, 29-30 September, 1998, pp. 135-140. http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=5886 http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=727708
5
none
Cosi, P; Hosom, Jp; Shalkwyk, J; Sutton, S; Cole, Ra
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/14420