High Performance "General Purpose" Phonetic Recognition for Italian

Cosi P;
2000

Abstract

This paper describes the development of a speaker-independent, "general purpose" phonetic recognizer for Italian, developed and implemented with the CSLU Toolkit. The recognizer uses a frame-based hybrid HMM/ANN architecture trained on context-dependent categories to account for coarticulatory variation. It recognizes 38 phonemes (not including silence or closures) and distinguishes stressed from unstressed vowels as well as open from closed vowels. Training and testing used the APASCI corpus, which contains nearly 2500 sentences read by 100 speakers; the sentences were designed to maximize the number of phonemes occurring in different contexts. At the time of writing, the system achieves a phoneme-level accuracy of 82.90% on the development set and 80.53% on the test set. This is much higher than the accuracy reported on a similar English-language corpus (where state-of-the-art performance is slightly better than 70%), and it is the best performance obtained so far on the APASCI corpus.
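The abstract reports phoneme-level accuracy but does not say how it was scored. As a minimal sketch, assuming the common convention in which accuracy is 100 * (N - S - D - I) / N over an edit-distance alignment of the recognized phoneme string against the reference (N reference phonemes, S substitutions, D deletions, I insertions), the figure could be computed as follows. The function name and the phoneme labels are hypothetical and are not taken from the paper or from the CSLU Toolkit.

```python
def phoneme_accuracy(reference, hypothesis):
    """Return (accuracy_percent, total_errors) for two phoneme sequences.

    Assumed convention (not stated in the source):
    accuracy = 100 * (N - S - D - I) / N, where S + D + I is the
    Levenshtein distance between reference and hypothesis.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must contain at least one phoneme")

    # dist[i][j] = minimum edits turning reference[:i] into hypothesis[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete all i reference phonemes
    for j in range(1, m + 1):
        dist[0][j] = j  # insert all j hypothesis phonemes

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution_cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                      # deletion
                dist[i][j - 1] + 1,                      # insertion
                dist[i - 1][j - 1] + substitution_cost,  # match or substitution
            )

    errors = dist[n][m]
    return 100.0 * (n - errors) / n, errors


# Hypothetical example: one substitution in a four-phoneme reference.
accuracy, errors = phoneme_accuracy(["k", "a", "z", "a"], ["k", "a", "s", "a"])
print(accuracy, errors)
```

Because insertions are counted as errors, accuracy under this convention can be lower than the percentage of correctly recognized phonemes and can even become negative for very noisy output.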
Istituto di Scienze e Tecnologie della Cognizione - ISTC
English
Baozong Yuan, Taiyi Huang, Xiaofang Tang
ICSLP 2000 - 6th International Conference on Spoken Language Processing
527
530
1092
7-80150-114-4
http://www2.pd.istc.cnr.it/Papers/PieroCosi/cp-ICSLP2000-02.pdf
China Military Friendship Publish
Beijing
PEOPLE'S REPUBLIC OF CHINA
Yes, but type not specified
16-20 October, 2000
Beijing, China
High Performance
"General Purpose"
Phonetic Recognition
Italian
Cosi P., Hosom J.P., "High Performance 'General Purpose' Phonetic Recognition for Italian", Proceedings of ICSLP-2000, International Conference on Spoken Language Processing, Beijing, China, 16-20 October 2000, Vol. II, pp. 527-530. http://www2.pd.istc.cnr.it/Papers/PieroCosi/cp-ICSLP2000-02.pdf
1
none
Cosi P.; Hosom J.P.
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/18529