Authors:
Xiaoqiang Luo, CLSP, The Johns Hopkins University (USA)
Frederick Jelinek, CLSP, The Johns Hopkins University (USA)
Page (NA) Paper number 365
Abstract:
Parameter tying is often used in large vocabulary continuous speech
recognition (LVCSR) systems to balance model resolution and generalizability.
However, one consequence of tying is that the differences among tied
constructs are ignored. Parameter tying can be alternatively viewed
as reciprocal data sharing in that a tied construct uses data associated
with all others in its tied-class. To capture the fine differences among
tied constructs, we propose to use nonreciprocal data sharing (NRDS)
when estimating HMM parameters. In particular, when estimating Gaussian
parameters for an HMM state, contributions from other acoustically similar
HMM states will be weighted, thus allowing different statistics to
govern different states. Data sharing weights are optimized using cross-validation.
It can be shown that the objective function for cross-validation is
a sum of rational functions and can be efficiently optimized by the
growth-transform. Our results on Switchboard show that NRDS reduces
the word error rate (WER) significantly compared with a state-of-the-art
baseline system using HMM state-tying.
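Illustration (not from the paper): the abstract does not give the estimation formulas, so the following is only a minimal sketch of what weighted, nonreciprocal data sharing could look like for a diagonal-covariance Gaussian. It assumes each state contributes sufficient statistics (count, sum, sum of squares) and that the target state scales each donor's statistics by a sharing weight. The function name, the form of the statistics, and the example weights are all hypothetical; the cross-validation step that optimizes the weights is not shown.

```python
import numpy as np


def weighted_gaussian_estimate(stats, weights):
    """Estimate a diagonal Gaussian from weighted per-state statistics.

    stats   : list of (count, sum_x, sum_xx) tuples, one per donor state
              (hypothetical layout, assumed for this sketch).
    weights : sharing weights the target state assigns to each donor.
    """
    # Accumulate weighted zeroth-, first-, and second-order statistics.
    n = sum(w * c for w, (c, _, _) in zip(weights, stats))
    s1 = sum(w * sx for w, (_, sx, _) in zip(weights, stats))
    s2 = sum(w * sxx for w, (_, _, sxx) in zip(weights, stats))
    mean = s1 / n
    var = s2 / n - mean ** 2
    return mean, var


# Two toy 1-D states: state A saw the frames {0, 2}, state B saw {4}.
stats = [
    (2.0, np.array([2.0]), np.array([4.0])),   # state A
    (1.0, np.array([4.0]), np.array([16.0])),  # state B
]

# Nonreciprocal sharing: A borrows from B with weight 0.5,
# while B borrows from A with weight 0.2 (weights are made up).
mean_a, var_a = weighted_gaussian_estimate(stats, [1.0, 0.5])
mean_b, var_b = weighted_gaussian_estimate(stats, [0.2, 1.0])
```

With equal weights for every state in a tied class, this reduces to ordinary tying, where all tied states receive the same estimate. Unequal, asymmetric weights let each state keep its own statistics.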
Authors:
Jeff A. Bilmes, ICSI/U.C. Berkeley (USA)
Page (NA) Paper number 894
Abstract:
In this paper, a new technique is introduced that relaxes the HMM conditional
independence assumption in a principled way. Without increasing the
number of states, the modeling power of an HMM is increased by including
only those additional probabilistic dependencies (to the surrounding
observation context) that are believed to be both relevant and discriminative.
Conditional mutual information is used to determine both relevance
and discriminability. Extended Gaussian-mixture HMMs and new EM update
equations are introduced. On an isolated-word speech database, results
show an average 34% word error improvement over an HMM with the same
number of states, and a 15% improvement over an HMM with a comparable
number of parameters.
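Illustration (not from the paper): the selection criterion named in the abstract, conditional mutual information, can be sketched for the discrete case as follows. This assumes a joint probability table over a candidate context variable X, an observation variable Z, and the hidden state Q; the paper itself deals with continuous speech observations, so the discrete table and the function name are assumptions made only for this example.

```python
import numpy as np


def conditional_mutual_information(p_xzq):
    """Return I(X; Z | Q) in nats for a joint table indexed as [x, z, q]."""
    p = np.asarray(p_xzq, dtype=float)
    p = p / p.sum()  # normalize in case raw counts were passed

    # Marginals, kept broadcastable against the full [x, z, q] table.
    p_q = p.sum(axis=(0, 1), keepdims=True)
    p_xq = p.sum(axis=1, keepdims=True)
    p_zq = p.sum(axis=0, keepdims=True)

    # I(X;Z|Q) = sum p(x,z,q) * log( p(x,z,q) p(q) / (p(x,q) p(z,q)) )
    num = p * p_q
    den = p_xq * p_zq
    mask = p > 0  # zero-probability cells contribute nothing
    return float(np.sum(p[mask] * np.log(num[mask] / den[mask])))


# X and Z identical and uniform under a single state: I(X;Z|Q) = log 2.
dependent = np.zeros((2, 2, 1))
dependent[0, 0, 0] = 0.5
dependent[1, 1, 0] = 0.5
cmi_dependent = conditional_mutual_information(dependent)

# X and Z independent given Q: I(X;Z|Q) = 0.
p_q = np.array([0.4, 0.6])
p_x_given_q = np.array([[0.2, 0.8], [0.7, 0.3]])  # indexed [q, x]
p_z_given_q = np.array([[0.5, 0.5], [0.1, 0.9]])  # indexed [q, z]
independent = np.einsum("q,qx,qz->xzq", p_q, p_x_given_q, p_z_given_q)
cmi_independent = conditional_mutual_information(independent)
```

A dependency whose conditional mutual information is near zero adds parameters without adding information once the state is known, which is one reading of why the abstract uses this quantity to decide which dependencies to include.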
Authors:
Jiping Sun, Department of Electrical and Computer Engineering, University of Waterloo (Canada)
Li Deng, Department of Electrical and Computer Engineering, University of Waterloo (Canada)
Page (NA) Paper number 43
Abstract:
Modeling phonological units of speech is a critical issue in speech
recognition. In this paper, we report our recent development of an
overlapping feature-based phonological model which gives long-span
contextual dependency. We extend our earlier work by incorporating
high-level linguistic constraints in automatic construction of the
feature overlapping patterns. The main linguistic information explored
includes morpheme, syllable, syllable constituent categories and word
stress markers. We describe a consistent computational framework developed
for the construction of the feature-based model, and discuss the use of
the model as the HMM state topology for speech recognizers.