Hidden Markov Model Techniques 1

ICSLP'98 Proceedings
Nonreciprocal Data Sharing in Estimating HMM Parameters

Authors:

Xiaoqiang Luo, CLSP, The Johns Hopkins University (USA)
Frederick Jelinek, CLSP, The Johns Hopkins University (USA)

Page (NA) Paper number 365

Abstract:

Parameter tying is often used in large vocabulary continuous speech recognition (LVCSR) systems to balance model resolution and generalizability. However, one consequence of tying is that the differences among tied constructs are ignored. Parameter tying can alternatively be viewed as reciprocal data sharing, in that a tied construct uses the data associated with all others in its tied class. To capture the fine differences among tied constructs, we propose to use nonreciprocal data sharing (NRDS) when estimating HMM parameters. In particular, when estimating the Gaussian parameters of an HMM state, contributions from other acoustically similar HMM states are weighted, thus allowing different statistics to govern different states. The data-sharing weights are optimized using cross-validation. It can be shown that the cross-validation objective function is a sum of rational functions and can be efficiently optimized by the growth transform. Our results on Switchboard show that NRDS reduces the word error rate (WER) significantly compared with a state-of-the-art baseline system using HMM state-tying.
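The core idea, weighting the sufficient statistics of acoustically similar states when estimating a target state's Gaussian, can be sketched as follows. This is an illustrative one-dimensional toy, not the paper's implementation: the data structures, the convention that the target's own data gets weight 1.0, and the function name are all assumptions, and the cross-validation/growth-transform optimization of the weights is omitted.

```python
import numpy as np

def shared_gaussian_stats(states, weights, target):
    """Estimate a Gaussian mean/variance for `target` using nonreciprocally
    weighted statistics from other states (a toy sketch of NRDS-style sharing).

    states  : dict mapping state name -> 1-D np.array of observation frames
    weights : dict mapping donor state name -> sharing weight (assumed >= 0);
              missing donors contribute nothing
    target  : name of the state being estimated; its own data is weighted 1.0
              (an illustrative convention, not from the paper)
    """
    def w(name):
        return 1.0 if name == target else weights.get(name, 0.0)

    # Weighted first-order statistics -> mean
    denom = sum(w(name) * len(frames) for name, frames in states.items())
    mean = sum(w(name) * frames.sum() for name, frames in states.items()) / denom

    # Weighted second-order statistics around the shared mean -> variance
    var = sum(w(name) * ((frames - mean) ** 2).sum()
              for name, frames in states.items()) / denom
    return mean, var
```

With all donor weights set to zero this reduces to the ordinary per-state (untied) estimate; with a donor weight of 1.0 it reduces to conventional tying with that donor, so the weights interpolate between the two regimes.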

SL980365.PDF (From Author) SL980365.PDF (Rasterized)



Data-Driven Extensions to HMM Statistical Dependencies

Authors:

Jeff A. Bilmes, ICSI/U.C. Berkeley (USA)

Page (NA) Paper number 894

Abstract:

In this paper, a new technique is introduced that relaxes the HMM conditional independence assumption in a principled way. Without increasing the number of states, the modeling power of an HMM is increased by including only those additional probabilistic dependencies (on the surrounding observation context) that are believed to be both relevant and discriminative. Conditional mutual information is used to determine both relevance and discriminability. Extended Gaussian-mixture HMMs and new EM update equations are introduced. On an isolated-word speech database, results show an average 34% word error improvement over an HMM with the same number of states, and a 15% improvement over an HMM with a comparable number of parameters.
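The relevance criterion rests on conditional mutual information, I(X; Z | Q): how much a context observation Z tells us about the current observation X once the state Q is known. A minimal plug-in estimator over discrete samples can be sketched as below; this is only the generic information-theoretic quantity, computed from counts on toy symbols, not the paper's estimator (which also has a discriminative variant conditioning on class labels), and the function name is an assumption.

```python
from collections import Counter
from math import log2

def conditional_mutual_information(triples):
    """Plug-in estimate of I(X; Z | Q) in bits from (x, z, q) samples.

    I(X; Z | Q) = sum_{x,z,q} p(x,z,q) * log2( p(x,z|q) / (p(x|q) p(z|q)) ),
    which with empirical counts reduces to the expression used below.
    """
    n = len(triples)
    c_xzq = Counter(triples)                          # joint counts
    c_xq = Counter((x, q) for x, _, q in triples)     # marginal over z
    c_zq = Counter((z, q) for _, z, q in triples)     # marginal over x
    c_q = Counter(q for _, _, q in triples)           # state counts

    mi = 0.0
    for (x, z, q), c in c_xzq.items():
        # p(x,z,q) * log2( c(x,z,q) c(q) / (c(x,q) c(z,q)) )
        mi += (c / n) * log2((c * c_q[q]) / (c_xq[(x, q)] * c_zq[(z, q)]))
    return mi
```

Context positions scoring near zero add nothing beyond the state and can be skipped; high-scoring positions are the candidates worth adding as extra dependencies.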

SL980894.PDF (From Author) SL980894.PDF (Rasterized)



Use of High-Level Linguistic Constraints for Constructing Feature-Based Phonological Model in Speech Recognition

Authors:

Jiping Sun, Department of Electrical and Computer Engineering, University of Waterloo (Canada)
Li Deng, Department of Electrical and Computer Engineering, University of Waterloo (Canada)

Page (NA) Paper number 43

Abstract:

Modeling phonological units of speech is a critical issue in speech recognition. In this paper, we report our recent development of an overlapping feature-based phonological model which captures long-span contextual dependencies. We extend our earlier work by incorporating high-level linguistic constraints into the automatic construction of the feature-overlapping patterns. The main linguistic information explored includes morphemes, syllables, syllable constituent categories, and word stress markers. We describe a consistent computational framework developed for constructing the feature-based model, and discuss the use of the model as the HMM state topology for speech recognizers.
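The notion of feature overlapping can be illustrated with a small toy: phones are decomposed into articulatory-feature tiers, and a feature value is allowed to spread into a neighboring phone, producing the asynchronous, long-span patterns the model exploits. Everything here is a simplifying assumption for illustration: the tier names, the one-phone anticipatory spread window, and the function name are not from the paper, which constructs its overlap patterns from linguistic constraints rather than a fixed rule.

```python
def overlap_features(phones, feature_table, spread_tiers):
    """Build per-tier feature streams for a phone sequence, letting features
    on `spread_tiers` spread one phone to the left (anticipatory overlap).

    phones        : list of phone symbols, e.g. ["k", "u"]
    feature_table : dict phone -> {tier: value}, e.g. {"u": {"round": "+"}}
    spread_tiers  : set of tiers permitted to overlap onto a neighbor
    """
    tiers = sorted({t for feats in feature_table.values() for t in feats})
    # One stream per tier, aligned with the phone sequence; None = unspecified
    streams = {t: [feature_table[p].get(t) for p in phones] for t in tiers}

    for t in spread_tiers:
        s = streams[t]
        for i in range(len(s) - 1):
            # A specified feature on the next phone bleeds into an
            # unspecified slot on the current one (toy overlap rule).
            if s[i] is None and s[i + 1] is not None:
                s[i] = s[i + 1]
    return streams
```

For example, in a sequence like /k u/, lip rounding from the vowel typically begins during the consonant; the toy rule above reproduces that by copying the "+round" value leftward onto the /k/ slot.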

SL980043.PDF (From Author) SL980043.PDF (Rasterized)
