Authors:
Wolfgang Reichl, Bell-Labs, Lucent Technologies (USA)
Bob Carpenter, Bell-Labs, Lucent Technologies (USA)
Jennifer Chu-Carroll, Bell-Labs, Lucent Technologies (USA)
Wu Chou, Bell-Labs, Lucent Technologies (USA)
Page (NA) Paper number 588
Abstract:
In this paper we discuss the role of language modeling in a novel natural
language dialogue system designed to automatically route incoming customer
calls. We arrive at two significant conclusions. First, standard word
error rate measures do not reflect application-specific requirements;
highly reliable content extraction is possible with relatively high
word error rates. Second, blending human-human data with human-machine
data did not improve language modeling performance.
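
To illustrate the first conclusion, the following minimal sketch computes the standard word error rate (word-level edit distance divided by reference length) and contrasts it with a simple content check. The example utterances and the `content_words` set are hypothetical and are not taken from the paper; the paper's actual routing and extraction method is not described in the abstract.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical utterance and recognizer output (illustration only).
reference = "i would like to check my account balance please"
hypothesis = "i like check the account balance"
wer = word_error_rate(reference, hypothesis)  # 4 errors over 9 reference words

# Hypothetical content words that a call router might rely on.
content_words = {"account", "balance"}
content_recovered = content_words <= set(hypothesis.split())
```

In this toy case the word error rate is high while every assumed content word survives recognition, which is the kind of mismatch between the two measures that the abstract points to.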
Authors:
John Gillett, Carnegie Mellon University (USA)
Wayne Ward, Carnegie Mellon University (USA)
Page (NA) Paper number 872
Abstract:
We propose a class trigram language model in which each class is specified
by a probabilistic context-free grammar. We show how to estimate the
parameters of the model, and how to smooth these estimates. We present
experimental perplexity and speech recognition results.
Authors:
Bernd Souvignier, Philips Research Laboratories (Germany)
Andreas Kellner, Philips Research Laboratories (Germany)
Page (NA) Paper number 961
Abstract:
The robust estimation of language models for new applications of spoken
dialogue systems often suffers from a lack of available training material.
An alternative to training is to adapt initial language models to
a new task by exploiting material from recognition. We investigate
different methods for online-adaptation of language models. Apart
from supervised and unsupervised adaptation, we look at two refined
approaches: the first allows multiple hypotheses from N-best lists
for adaptation and the second uses confidence measures to reject unreliably
recognized sentences. We apply adaptation both to the language model
used in the recognizer to focus the beam search and to the stochastic
language understanding grammar. It turns out that the understanding
grammar can be improved quite significantly using N-best lists or confidence
measures, whereas unsupervised adaptation may even result in a deterioration
of the system. The language model used in the recognizer is also improved
very satisfactorily.
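
The confidence-based rejection idea can be sketched as follows. This is a minimal illustration under stated assumptions: it uses a unigram model and linear interpolation, neither of which is confirmed by the abstract, and the function name `adapt_unigram`, the `threshold`, and the `weight` are hypothetical.

```python
from collections import Counter


def adapt_unigram(base_probs, recognized, threshold=0.7, weight=0.3):
    """Adapt a unigram model with recognized sentences above a confidence threshold.

    base_probs: dict mapping word -> probability in the initial model.
    recognized: list of (sentence, confidence) pairs from the recognizer.
    threshold, weight: hypothetical values chosen for illustration only.
    """
    counts = Counter()
    for sentence, confidence in recognized:
        # Reject unreliably recognized sentences instead of adapting on them.
        if confidence >= threshold:
            counts.update(sentence.split())
    total = sum(counts.values())
    if total == 0:
        return dict(base_probs)  # nothing reliable to adapt on
    vocabulary = set(base_probs) | set(counts)
    # Linear interpolation of the initial model with the adaptation estimate.
    return {
        word: (1 - weight) * base_probs.get(word, 0.0) + weight * counts[word] / total
        for word in vocabulary
    }


base = {"a": 0.5, "b": 0.5}
adapted = adapt_unigram(base, [("a a", 0.9), ("b b b", 0.2)])
```

Setting `threshold` to 0 makes the sketch behave like plain unsupervised adaptation, since every recognized sentence is then accepted.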
Authors:
Giuseppe Riccardi, AT&T-Labs Research (USA)
Alexandros Potamianos, AT&T-Labs Research (USA)
Shrikanth Narayanan, AT&T-Labs Research (USA)
Page (NA) Paper number 1052
Abstract:
In a human-machine interaction (dialog) the statistical language variations
are large among different stages of the dialog and across different
speakers. Moreover, spoken dialog systems require extensive training
data for training stochastic language models. In this paper we address
the problem of open-vocabulary language models that allow the user
any possible response at each stage of the dialog. We propose a novel
off-line adaptation of stochastic language models that is effective for both
their generalization (open-vocabulary) and selective (dialog context) properties.
We outline the integration of the finite state dialog and the language
model adaptation algorithm. The performance of the speech recognition
and understanding language models is evaluated with the Carmen Sandiego
multimodal computer game. The new language models give an overall
understanding error rate reduction of 44% over the baseline system.
Authors:
Brigitte Bigi, LIA CERI-IUP University of Avignon (France)
Renato De Mori, LIA CERI-IUP University of Avignon (France)
Marc El-Beze, LIA CERI-IUP University of Avignon (France)
Thierry Spriet, LIA CERI-IUP University of Avignon (France)
Page (NA) Paper number 77
Abstract:
The use of cache memories and symmetric Kullback-Leibler distances
is proposed for topic classification and topic-shift detection. Experiments
with a large corpus of articles from the French newspaper "Le Monde"
show tangible advantages when different models are combined with a
suitable strategy. Experimental results show that different strategies
for topic-shift detection have to be used depending on whether high
recall or high precision is sought. Furthermore, methods based on
topic-independent distributions provide complementary candidates with
respect to the use of topic-dependent distributions, leading to an increase
in recall with a minor loss in precision.
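
A minimal sketch of topic classification with a symmetric Kullback-Leibler distance follows. It assumes add-one smoothed unigram distributions and a cache that is simply a list of recent words; the toy topics, the function names, and the smoothing constant `alpha` are hypothetical, and the abstract does not specify the paper's actual models or combination strategy.

```python
import math
from collections import Counter


def smoothed_distribution(words, vocabulary, alpha=1.0):
    """Additively smoothed unigram distribution over a fixed vocabulary."""
    counts = Counter(words)
    total = len(words) + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / total for w in vocabulary}


def symmetric_kl(p, q):
    """KL(p||q) + KL(q||p), which simplifies to sum((p - q) * log(p / q))."""
    return sum((p[w] - q[w]) * math.log(p[w] / q[w]) for w in p)


# Toy topic training data and cache contents (illustration only).
topics = {
    "sports": "match goal team goal".split(),
    "finance": "bank market stock bank".split(),
}
cache = "team goal match".split()

vocabulary = set(cache).union(*topics.values())
cache_dist = smoothed_distribution(cache, vocabulary)
distances = {
    name: symmetric_kl(cache_dist, smoothed_distribution(words, vocabulary))
    for name, words in topics.items()
}
best_topic = min(distances, key=distances.get)
```

Smoothing keeps every probability positive, so the logarithm is always defined. A topic shift could then be flagged when the closest topic changes as the cache is updated, although the decision rule used in the paper is not given in the abstract.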
Authors:
Lori Levin, Carnegie Mellon University (USA)
Ann Thymé-Gobbel, Natural Speech Technologies (USA)
Alon Lavie, Carnegie Mellon University (USA)
Klaus Ries, Carnegie Mellon University (USA)
Klaus Zechner, Carnegie Mellon University (USA)
Page (NA) Paper number 1000
Abstract:
This paper describes a 3-level discourse coding scheme that
we have devised for manual tagging of the CallHome Spanish (CHS) and
CallFriend Spanish (CFS) databases used in the CLARITY project. The
goal of CLARITY is to explore the use of discourse structure in understanding
conversational speech. The project combines empirical methods for
dialogue processing with state-of-the-art LVCSR (using the JANUS recognizer).
The three levels of the coding scheme are (1) a speech act level consisting
of a tag set extended from DAMSL and Switchboard; (2) a dialogue game
level defined by initiative and intention; and (3) an activity level
defined within topic units. The manually tagged dialogues are used
to train automatic classifiers. We present preliminary results for
automatic speech act classification and topic boundary identification,
as well as inter-coder speech act confusion matrices.
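
An inter-coder confusion matrix of the kind mentioned above can be built as in the following sketch. The tag names and the two annotation sequences are hypothetical and are not drawn from the CLARITY tag set or data.

```python
from collections import Counter


def confusion_matrix(tags_a, tags_b):
    """Count how often coder A's tag (row) co-occurs with coder B's tag (column)."""
    if len(tags_a) != len(tags_b):
        raise ValueError("both coders must tag the same number of units")
    labels = sorted(set(tags_a) | set(tags_b))
    pair_counts = Counter(zip(tags_a, tags_b))
    matrix = [[pair_counts[(a, b)] for b in labels] for a in labels]
    return labels, matrix


def observed_agreement(matrix):
    """Fraction of units on which both coders chose the same tag."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


# Hypothetical speech act tags from two coders for the same four utterances.
coder_a = ["statement", "question", "statement", "backchannel"]
coder_b = ["statement", "question", "backchannel", "backchannel"]
labels, matrix = confusion_matrix(coder_a, coder_b)
agreement = observed_agreement(matrix)
```

Off-diagonal cells show which tag pairs the coders confuse with each other, while the diagonal gives the raw agreement.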