Spoken Language Systems



Development And Evaluation Of The ATOS Spontaneous Speech Conversational System

Authors:

Jorge Alvarez, TID (Spain)
Daniel Tapias, TID (Spain)
Carlos Crespo, TID (Spain)
Ismael Cortazar, TID (Spain)
Fernando Martinez, TID (Spain)

Volume 2, Page 1139

Abstract:

In this paper we report our recent development work on Spanish spontaneous speech conversational systems. We describe the Automatic Telephone Operator Service (ATOS) and present the improvements introduced to deal with spontaneous speech: (a) a task-independent dialogue manager that can be adapted to a new semantic domain by changing a configuration file, and that also generates a prediction of the user's expected utterance to constrain the language model used by the speech recognizer; (b) a language modeling strategy that allows the statistical language model to be adapted to a new task with just a few hundred sentences. This strategy reduces the word error rate by 27%. We also report the results, the conclusions, and the speech database collected in the evaluation of the ATOS system, which was tested by 30 real users.
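As a reading aid for the word error rate figure quoted above, the following is a minimal sketch of how word error rate and a relative reduction are conventionally computed. The abstract does not say whether the 27% is relative or absolute, so the relative reading is an assumption here; the function names, the example sentences, and the numbers are illustrative and are not taken from the paper.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer, new_wer):
    """Fraction of the baseline error removed by the new system."""
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative values only: one deleted word out of four reference words.
print(word_error_rate("quiero llamar a madrid", "quiero llamar madrid"))
# Hypothetical baseline and adapted error rates, chosen to give a 27% relative reduction.
print(relative_reduction(0.20, 0.146))
```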

ic971139.pdf




A Spoken Language System For Automated Call Routing

Authors:

Giuseppe Riccardi, AT&T Labs-Research (U.S.A.)
Allen Gorin, AT&T Labs-Research (U.S.A.)
Andrej Ljolje, AT&T Labs-Research (U.S.A.)
Michael Riley, AT&T Labs-Research (U.S.A.)

Volume 2, Page 1143

Abstract:

We are interested in the problem of understanding fluently spoken language. In particular, we consider people's responses to the open-ended prompt "How May I help you?". We then further restrict the problem to classifying and automatically routing such a call, based on the meaning of the user's response. Thus, we aim at extracting a relatively small number of semantic actions from the utterances of a very large set of users who are not trained on the system's capabilities and limitations. In this paper, we describe the main components of our speech understanding system: the large vocabulary recognizer and the language understanding module that performs the call-type classification. In particular, we propose automatic algorithms for selecting phrases from a training corpus in order to enhance the predictive power of the standard word n-gram. The phrase language models are integrated into stochastic finite state machines, which outperform the standard word n-gram language model. From the speech recognizer output we recognize and exploit automatically acquired salient phrase fragments to make a call-type classification. This system is evaluated on a database of 10K fluently spoken utterances collected from interactions between users and human agents.
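The idea of routing a call from salient phrase fragments can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes that a fragment counts as salient when it occurs often enough and almost always with one call type, and the thresholds, the training sentences, and the call-type labels below are all hypothetical.

```python
from collections import Counter, defaultdict


def ngrams(words, max_n=2):
    """Yield all word n-grams of length 1..max_n as space-joined strings."""
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def learn_salient_fragments(labelled, min_count=2, min_posterior=0.9):
    """Keep fragments whose most likely call type has a high empirical posterior."""
    counts = defaultdict(Counter)
    for text, call_type in labelled:
        for frag in set(ngrams(text.split())):
            counts[frag][call_type] += 1
    salient = {}
    for frag, by_type in counts.items():
        total = sum(by_type.values())
        call_type, hits = by_type.most_common(1)[0]
        if total >= min_count and hits / total >= min_posterior:
            salient[frag] = (call_type, hits / total)
    return salient


def classify(text, salient, default="other"):
    """Sum the posteriors of the salient fragments found and pick the best call type."""
    scores = Counter()
    for frag in ngrams(text.split()):
        if frag in salient:
            call_type, posterior = salient[frag]
            scores[call_type] += posterior
    return scores.most_common(1)[0][0] if scores else default


# Hypothetical training transcriptions with hypothetical call-type labels.
TRAIN = [
    ("i want to make a collect call", "collect"),
    ("collect call please", "collect"),
    ("i have a question about my bill", "billing"),
    ("there is a charge on my bill", "billing"),
]

salient = learn_salient_fragments(TRAIN)
print(classify("i would like to place a collect call", salient))
print(classify("what is this charge on my bill", salient))
```

Utterances with no salient fragment fall back to a default class, which in a deployed service would typically mean handing the call to a human agent.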

ic971143.pdf




Dialogos: A Robust System for Human-Machine Spoken Dialogue on the Telephone

Authors:

Dario Albesano, CSELT (Italy)
Paolo Baggia, CSELT (Italy)
Morena Danieli, CSELT (Italy)
Roberto Gemello, CSELT (Italy)
Elisabetta Gerbino, CSELT (Italy)
Claudio Rullent, CSELT (Italy)

Volume 2, Page 1147

Abstract:

This paper presents Dialogos, a real-time system for human-machine spoken dialogue on the telephone in task-oriented domains. The system has been tested in a large trial with inexperienced users, and it has proved robust enough to allow spontaneous interactions both for users who obtain good recognition performance and for those who obtain lower scores. The robust behavior of the system has been achieved by combining the use of specific language models during the recognition phase of analysis, tolerance toward spontaneous speech phenomena, the activity of a robust parser, and the use of pragmatic-based dialogue knowledge. This integration of the different modules makes it possible to deal with partial or total breakdowns at the different levels of analysis. We report the field trial data of the system and the evaluation results of the overall system and of the submodules.

ic971147.pdf




Surfin' the World Wide Web with Japanese

Authors:

Kazuhiro Kondo, Texas Instruments Inc. (U.S.A.)
Charles T. Hemphill, Texas Instruments Inc. (U.S.A.)

Volume 2, Page 1151

Abstract:

Previously, we developed Speech-Aware Multimedia (SAM), which controls a WWW browser using English speech. We recently extended its capability to use Japanese speech to browse Japanese pages, and developed a prototype using speaker-independent, continuous speech recognition with Japanese context-dependent phonetic models. Some challenges not seen in English include: segmentation of Japanese text into word units for optional silence insertion, Japanese text-to-phone conversion, and accommodation of English link names embedded in Japanese pages. To address the first two, we modified a public-domain dictionary look-up tool to perform segmentation and to accommodate the heuristics required for improved text-to-phone conversion accuracy. Preliminary tests show that the conversion result contains the correct phone sequence over 97% of the time, and the prototype correctly understands the input speech 91.5% of the time.
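To make the segmentation and text-to-phone steps concrete, here is a minimal sketch based on greedy longest-match dictionary look-up. The abstract does not describe the tool or its heuristics, so this is only one common way such a step can be done; the tiny lexicon, the phone notation, and the `(sil)` marker are hypothetical.

```python
# Hypothetical lexicon mapping surface forms to phone strings (illustrative notation).
LEXICON = {
    "東京": "t o: k y o:",
    "東": "h i g a sh i",
    "の": "n o",
    "天気": "t e N k i",
    "を": "o",
    "見る": "m i r u",
}


def segment(text, lexicon):
    """Greedy longest-match segmentation; unknown characters become one-character units."""
    max_len = max(len(word) for word in lexicon)
    words, i = [], 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + n] in lexicon:
                words.append(text[i:i + n])
                i += n
                break
        else:
            words.append(text[i])
            i += 1
    return words


def to_phones(words, lexicon, optional_silence="(sil)"):
    """Look up each word and place an optional-silence marker between word units."""
    return f" {optional_silence} ".join(lexicon.get(w, "<unk>") for w in words)


words = segment("東京の天気を見る", LEXICON)
print(words)
print(to_phones(words, LEXICON))
```

Longest match prefers "東京" over the shorter entry "東", which is the kind of ambiguity that a real system would need further heuristics to resolve reliably.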

ic971151.pdf




Internet Chinese Information Retrieval Using Unconstrained Mandarin Speech Queries Based on a Client-Server Architecture and a PAT-tree-based Language Model

Authors:

Lee-Feng Chien, IIS, Academia Sinica (Taiwan)
Ming-Chiuan Chen, IIS, Academia Sinica (Taiwan)
Hsin-Min Wang, IIS, Academia Sinica (Taiwan)
Lin-Shan Lee, IIS, Academia Sinica (Taiwan)
Sung-Chien Lin, Dept. CSIE, National Taiwan University (Taiwan)
Jenn-Chau Hong, Dept. CSIE, National Taiwan University (Taiwan)
Jia-Lin Shen, Dept. EE, National Taiwan University (Taiwan)

Volume 2, Page 1155

Abstract:

In order to pursue high-performance Chinese information access on the Internet, this paper presents an approach that integrates efficient speech recognition and information retrieval techniques. A working system based on the proposed approach for speech retrieval of real-time Chinese net news services has been implemented and tested. Very exciting performance has been achieved.

ic971155.pdf




Combining Key-Phrase Detection and Subword-based Verification for Flexible Speech Understanding

Authors:

Tatsuya Kawahara, Kyoto University (Japan)
Chin Hui Lee, Bell Labs (U.S.A.)
Biing-Hwang Juang, Bell Labs (U.S.A.)

Volume 2, Page 1159

Abstract:

A flexible speech understanding framework combining key-phrase detection and verification is presented. Detection of semantically tagged key-phrases leads directly to robust understanding. In order to select reliable detections and eliminate false alarms, an utterance verification technique is incorporated. A phrase verifier combines subword-based likelihood ratios of correct models and anti-subword alternate models. A confidence measure that focuses on mismatched subwords is proposed and shown to be the most effective. The combined strategy drastically improves the semantic accuracy for out-of-grammar utterances while maintaining the performance for in-grammar samples. We also found that utterance verification applied after grammar-based decoding is not as effective as the proposed detection and verification strategy.
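The difference between averaging subword likelihood ratios and focusing on mismatched subwords can be sketched as follows. The abstract does not give the confidence formula, so the soft-minimum form used here is an assumption chosen for illustration, and the log-likelihood-ratio values, the `alpha` parameter, and the threshold are hypothetical.

```python
import math


def mean_llr(llrs):
    """Plain average of per-subword log-likelihood ratios (model vs. anti-model)."""
    return sum(llrs) / len(llrs)


def mismatch_focused_confidence(llrs, alpha=1.0):
    """Soft minimum of the ratios: badly matched subwords dominate the score."""
    return -math.log(sum(math.exp(-alpha * x) for x in llrs) / len(llrs)) / alpha


def accept(llrs, threshold=0.0):
    """Accept a detected key-phrase when its confidence reaches the threshold."""
    return mismatch_focused_confidence(llrs) >= threshold


# Hypothetical per-subword log-likelihood ratios for two detected phrases.
well_matched = [2.0, 1.5, 2.5]
one_bad_subword = [3.0, 3.0, -3.0]

print(mean_llr(one_bad_subword), accept(one_bad_subword))
print(mean_llr(well_matched), accept(well_matched))
```

With these values the plain average of the second phrase is still positive, while the soft minimum is pulled below zero by the single badly matched subword, so the phrase is rejected as a likely false alarm.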

ic971159.pdf




Controlling Limited-Domain Applications by Probabilistic Semantic Decoding of Natural Speech

Authors:

Holger Stahl, TUM (Germany)
Johannes Müller, TUM (Germany)
Manfred Lang, TUM (Germany)

Volume 2, Page 1163

Abstract:

The paper describes a speech understanding system that allows online control of arbitrary running applications that have a well-defined command interface. A sequential combination of a signal preprocessor, a stochastic-driven one-stage semantic decoder, and a rule-based intention decoder is proposed. Following this principle and using the respective algorithms, speech understanding front-ends for the domains 'graphic editor' and 'service robot' were successfully realized.

ic971163.pdf
