Authors:
Hajime Tsukada, ATR Interpreting Telecommunications Research Laboratories (Japan)
Hirofumi Yamamoto, ATR Interpreting Telecommunications Research Laboratories (Japan)
Toshiyuki Takezawa, ATR Interpreting Telecommunications Research Laboratories (Japan)
Yoshinori Sagisaka, ATR Interpreting Telecommunications Research Laboratories (Japan)
Paper number 485
Abstract:
We propose a novel recognition method for generating an accurate grammatical
word-graph that allows grammatical deviations. Our method uses both an
n-gram and a grammar-based statistical language model and aligns utterances
with the grammar by adding deviation information during the search
process. Our experiments confirm that the word-graph obtained by the
proposed method is superior to the one obtained using only the n-gram
at the same word-graph density. In addition, our recognition method
can search an enormous hypothesis space more efficiently than the
conventional word-graph-based search method.
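To make the combination concrete, here is a minimal Python sketch of the kind of
scoring such a search could use: a partial hypothesis is rated by both an n-gram
and a grammar-based statistical model, and hypothesized grammatical deviations
incur a penalty. The model interfaces, weights, and penalty value are
illustrative assumptions, not the authors' formulation.

```python
# Hedged sketch: combined scoring of a partial hypothesis during word-graph
# expansion.  `ngram_lm` and `grammar_lm` are assumed objects with the
# interfaces used below; weights and penalty are illustrative values.

def hypothesis_score(words, ngram_lm, grammar_lm,
                     ngram_weight=0.5, grammar_weight=0.5,
                     deviation_penalty=-5.0):
    """Combined log score of a partial hypothesis in the word-graph search."""
    ngram_logprob = ngram_lm.logprob(words)                  # assumed interface
    grammar_logprob, n_deviations = grammar_lm.align(words)  # assumed interface
    return (ngram_weight * ngram_logprob
            + grammar_weight * grammar_logprob
            + n_deviations * deviation_penalty)
```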
Authors:
Norimichi Yodo, Graduate School of Information Science, Nara Institute of Science and Technology (Japan)
Kiyohiro Shikano, Graduate School of Information Science, Nara Institute of Science and Technology (Japan)
Satoshi Nakamura, Graduate School of Information Science, Nara Institute of Science and Technology (Japan)
Paper number 716
Abstract:
In this paper we propose an algorithm for reducing the size of back-off
N-gram models while affecting their performance less than the traditional
cutoff method. The algorithm is based on Maximum Likelihood (ML)
estimation and realizes an N-gram language model with a given number
of N-gram probability parameters that minimizes the training-set perplexity.
To confirm the effectiveness of our algorithm, we apply it to trigram
and bigram models and carry out experiments in terms of perplexity and
word error rate in a dictation system.
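As a rough illustration of the pruning idea (not the paper's exact ML
algorithm), one can rank the explicitly stored N-gram probabilities by the
training-set log-likelihood that would be lost if each were replaced by its
back-off estimate, and keep only a target number of parameters. The sketch
below assumes precomputed counts and probabilities and ignores the
renormalization of back-off weights.

```python
import math

def prune_to_size(counts, prob, backoff_prob, target_size):
    """counts: dict mapping an n-gram tuple to its training-set count.
    prob / backoff_prob: explicit and backed-off probability per n-gram.
    Returns the set of n-grams whose explicit probabilities are kept.
    Simplification: back-off weights are not recomputed after pruning."""
    loss = {}
    for ngram, c in counts.items():
        # training-set log-likelihood lost if this explicit entry is dropped
        loss[ngram] = c * (math.log(prob[ngram]) - math.log(backoff_prob[ngram]))
    kept = sorted(counts, key=lambda g: loss[g], reverse=True)[:target_size]
    return set(kept)
```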
Authors:
Ulla Uebler, Bavarian Research Center for Knowledge Based Systems (Germany)
Heinrich Niemann, Bavarian Research Center for Knowledge Based Systems (Germany)
Paper number 338
Abstract:
It is well known that good language models improve the performance of speech
recognition. One requirement for the estimation of language models
is a sufficient amount of text from the application domain. If not
all words of the domain occur in the training texts for the language model,
a way must be found to model these words adequately. In this paper
we report on a new approach to building word classes for language modeling
in the bilingual (German, Italian) SpeeData project. The main idea
is to classify words according to their morphological properties.
To this end, we decompose words into their morphological units and put
the words with the same prefix or suffix into the same class. Since
morphological decomposition is error prone for unknown word stems,
we also decomposed words by counting beginnings and endings of different
lengths and used these subunits like prefixes and suffixes. The advantage
of this approach is that it can be carried out automatically. We achieved
a reduction in error rate from 9.83% to 5.77% for morphological decomposition
and to 5.99% for automatic decomposition, which can be performed without
any morphological knowledge.
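A minimal sketch of the knowledge-free variant follows, assuming words are
simply grouped by their final k characters and each group forms one word class;
the toy vocabulary and the choice k = 3 are illustrative assumptions.

```python
from collections import defaultdict

def classes_by_ending(vocabulary, k=3):
    """Group words into classes by their last k characters (no morphology)."""
    classes = defaultdict(list)
    for word in vocabulary:
        ending = word[-k:] if len(word) > k else word
        classes[ending].append(word)
    return classes

# Toy example: German and Italian word forms sharing endings fall together.
vocab = ["gehen", "stehen", "sehen", "casa", "rosa", "strada"]
for ending, members in classes_by_ending(vocab).items():
    print(ending, members)
```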
Authors:
Imed Zitouni, LORIA / INRIA-Lorraine (France)
Kamel Smaïli, LORIA / INRIA-Lorraine (France)
Jean-Paul Haton, LORIA / INRIA-Lorraine (France)
Sabine Deligne, ATR-ITL (Japan)
Frédéric Bimbot, IRISA-CNRS/INRIA (France)
Paper number 498
Abstract:
In this paper, we introduce the concept of the Multiclass for language
modeling and compare it to the Polyclass model. The originality
of the Multiclass is its capability to parse a string of class tags
into variable-length independent sequences. A few experimental tests
were carried out on an automatically labeled class corpus extracted from
the French "Le Monde" word corpus. This corpus contains 43
million words. In our experiments, Multiclass models outperform first-order
Polyclass models but are slightly outperformed by second-order Polyclass models.
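The sketch below illustrates the segmentation idea behind such a model,
assuming a multigram-style Viterbi search over variable-length tag sequences;
the sequence-probability table and maximum sequence length are hypothetical
inputs, not the authors' trained model.

```python
import math

def best_segmentation(tags, seq_prob, max_len=3):
    """Return the highest-probability parse of a tag string into independent
    variable-length sequences.  Assumes seq_prob contains at least every
    single tag, so some segmentation always exists."""
    n = len(tags)
    best = [(-math.inf, None)] * (n + 1)   # (log prob, backpointer)
    best[0] = (0.0, None)
    for i in range(1, n + 1):
        for length in range(1, min(max_len, i) + 1):
            seq = tuple(tags[i - length:i])
            if seq in seq_prob and best[i - length][0] > -math.inf:
                score = best[i - length][0] + math.log(seq_prob[seq])
                if score > best[i][0]:
                    best[i] = (score, i - length)
    # backtrace the best segmentation
    segments, i = [], n
    while i > 0:
        j = best[i][1]
        segments.append(tuple(tags[j:i]))
        i = j
    return list(reversed(segments)), best[n][0]
```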
Authors:
Dietrich Klakow, Philips Research Laboratories (Germany)
Paper number 522
Abstract:
Combining different language models is an important task, and linear
interpolation is the established method for doing so. We present a new method
called log-linear interpolation (LLI), which combines the simplicity
of linear interpolation with essential parts of maximum-entropy models
by linearly interpolating the scores of the different models. The first
series of experiments focuses on adaptation: unigram, bigram and trigram
models trained on NAB are combined with unigram and bigram models trained
on a small domain-specific corpus. LLI compares favorably with linear
interpolation. The second series combines bigram and distance-bigram
models; here, relative improvements are larger (~20% in perplexity),
and this task seems to be the ideal application of LLI. To further scrutinize
the method, frequent pairs of words are first joined into phrases and
then bigram and distance-bigram models are combined by LLI. This experiment
yields perplexities just 2.5% above the original trigram perplexity.
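For reference, a minimal sketch of log-linear interpolation: the component
probabilities are raised to weights lambda_i, multiplied, and renormalized over
the vocabulary. The component models, weights, and vocabulary passed in are
illustrative assumptions; the models are assumed to return nonzero
probabilities.

```python
import math

def lli_prob(word, history, models, lambdas, vocabulary):
    """Log-linear interpolation: P(w|h) proportional to
    prod_i P_i(w|h)**lambda_i, renormalized over the vocabulary.
    models: list of functions mapping (word, history) -> probability."""
    def unnorm(w):
        return math.exp(sum(lam * math.log(m(w, history))
                            for m, lam in zip(models, lambdas)))
    z = sum(unnorm(w) for w in vocabulary)   # normalization term Z(h)
    return unnorm(word) / z
```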
Authors:
Philip Clarkson, Cambridge University Engineering Department (U.K.)
Tony Robinson, Cambridge University Engineering Department (U.K.)
Paper number 962
Abstract:
Adaptive language models have consistently been shown to lead to a
significant reduction in language model perplexity compared to the
equivalent static trigram model on many data sets. When these language
models have been applied to speech recognition, however, they have
seldom resulted in a corresponding reduction in word error rate. This
paper will investigate some of the possible reasons for this apparent
discrepancy, and will explore the circumstances under which adaptive
language models can be useful. We will concentrate on cache-based and
mixture-based models and their use on the Broadcast News task.
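As a hedged illustration of the cache-based case, the sketch below interpolates
a static trigram probability with a distribution estimated from the most recent
words; the interpolation weight and cache handling are assumptions, not values
from the paper.

```python
from collections import Counter

def cache_adapted_prob(word, history, trigram_prob, recent_words,
                       cache_weight=0.1):
    """Linear interpolation of a static trigram model with a unigram cache
    built from recently observed words.  trigram_prob is an assumed function
    mapping (word, history) -> probability."""
    cache = Counter(recent_words)
    total = sum(cache.values())
    p_cache = cache[word] / total if total else 0.0
    return cache_weight * p_cache + (1.0 - cache_weight) * trigram_prob(word, history)
```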