Authors:
Masahiro Araki, Kyoto University (Japan)
Shuji Doshita, Kyoto University (Japan)
Page (NA) Paper number 729
Abstract:
In this paper, we propose a robust processing model of spoken dialogue.
Our dialogue model is a cognitive process model which (1) integrates
stepwise processing from utterance understanding to response generation,
(2) specifies the interactions between the processing of each step
and a two-level dialogue management mechanism, and (3) identifies the
possible errors caused by speech recognition errors and specifies methods
for recovering from them. We also examined the validity of this model
using a new evaluation paradigm: system-to-system dialogue with linguistic
noise. This evaluation shows that the proposed cognitive process model
is robust in situations with relatively low recognition error rates.
Authors:
Tom Brøndsted, Center for PersonKommunikation (Denmark)
Bo Nygaard Bai, Center for PersonKommunikation (Denmark)
Jesper Østergaard Olsen, Center for PersonKommunikation (Denmark)
Page (NA) Paper number 811
Abstract:
The paper describes the platform for building spoken language systems
being designed and implemented within the EU-language engineering project
REWARD. The platform collects and streamlines a set of software tools
such that they together constitute the basic modules needed to enable
dialogue developers to establish new dialogue applications with only
minimal knowledge outside their own field of experience and within
a minimum amount of time. The system differs from other platforms
in that non-expert users have been strongly involved in the design phase.
Authors:
Matthew Bull, Human Communication Research Centre, University of Edinburgh (U.K.)
Matthew Aylett, Human Communication Research Centre, University of Edinburgh (U.K.)
Page (NA) Paper number 790
Abstract:
This paper presents a context-based analysis of the intervals between
different speakers' utterances in a corpus of task-oriented dialogue
(the Human Communication Research Centre's Map Task Corpus). In the
analysis, we assessed the relationship between inter-speaker intervals
and various contextual factors, such as the effects of eye contact,
the presence of conversational game boundaries, the category of move
in an utterance, and the degree of experience with the task in hand.
The results of the analysis indicated that the main factors which
gave rise to significant differences in inter-speaker intervals were
those which related to decision-making and planning - the greater the
amount of planning, the greater the inter-speaker interval. Differences
between speakers were also found to be significant, although this effect
did not necessarily interact with all other effects. These results
provide unique and useful data for improving the effectiveness of dialogue
systems.
Authors:
Sarah Davies, Human Communication Research Centre, University of Edinburgh (U.K.)
Massimo Poesio, Human Communication Research Centre, University of Edinburgh (U.K.)
Page (NA) Paper number 813
Abstract:
In this paper we report on the development of a spoken dialogue system
for computer aided language learning (CALL), and explore some of the
issues involved in the incorporation of a corrective feedback module.
We initially developed a small prototype system, and tested it for
usability with visiting students of English as a foreign language.
In the light of the positive results we obtained for this, we began
to develop a more advanced system, with the aim of investigating how
spoken dialogue systems might best be tailored to help language learning.
The issue we focussed on was the kind of feedback on errors which might
be most useful to the learner. We show the types of feedback we have
considered, and highlight some of the problems associated with providing
different types of feedback.
Authors:
Laurence Devillers, LIMSI/CNRS (France)
Helene Bonneau-Maynard, LIMSI/CNRS (France)
Page (NA) Paper number 378
Abstract:
In this paper, we describe the evaluation of the dialog management
and response generation strategies being developed for retrieval of
tourist information, selected as a common domain for the ARC-AUPELF-B2-action.
Comparing and evaluating different strategies is a difficult task,
which often remains unexplored, because in most cases evaluation approaches
require a unified database structure and efficient integration of data
from several disparate sources and forms. To avoid this problem, we
implemented two dialog strategy versions within the same general platform.
We investigate qualitative and quantitative criteria for evaluation
of these dialog control strategies: in particular, by testing the efficiency
of our system with and without automatic mechanisms for guiding the
user via suggestive prompts. An evaluation phase with 32 naive and
experienced subjects was carried out to assess the utility of guiding
the user. The experiments show that user guidance is appropriate for
novices and appreciated by all users.
Authors:
Sadaoki Furui, Tokyo Institute of Technology (Japan)
Koh'ichiro Yamaguchi, Tokyo Institute of Technology (Japan)
Page (NA) Paper number 36
Abstract:
This paper introduces a paradigm for designing multimodal dialogue
systems. The task of an example system is to retrieve particular
information about different shops in the Tokyo Metropolitan area, such
as their names, addresses, and phone numbers. The system accepts speech
and screen touching as input, and presents retrieved information on
a screen display. The speech recognition part is modeled by the FSN
(finite state network) consisting of keywords and fillers, both of
which are implemented by the DAWG (directed acyclic word-graph) structure.
The number of keywords is 306, consisting of district names and business
names. The fillers accept roughly 100,000 non-keywords/phrases occurring
in spontaneous speech. A variety of dialogue strategies are designed
and evaluated based on an objective cost function having a set of actions
and states as parameters. Expected dialogue cost is calculated for
each strategy, and the best strategy is selected according to the keyword
recognition accuracy.
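The strategy selection described above can be sketched as an expected-cost comparison. The cost model, action costs, and probabilities below are hypothetical illustrations, not the paper's actual cost function:

```python
# Hypothetical expected-cost model for choosing a dialogue strategy.
# Costs and probabilities are illustrative, not from the paper.

def expected_cost(p_correct, confirm_cost, repair_cost, base_cost):
    """Expected cost of one exchange: the base action plus any
    confirmation, with a repair penalty paid whenever the keyword
    is misrecognized."""
    return base_cost + confirm_cost + (1.0 - p_correct) * repair_cost

def best_strategy(strategies, p_correct):
    """Select the strategy with the lowest expected dialogue cost at
    a given keyword recognition accuracy."""
    return min(strategies,
               key=lambda s: expected_cost(p_correct, s["confirm"],
                                           s["repair"], s["base"]))

strategies = [
    {"name": "always_confirm", "base": 1.0, "confirm": 1.0, "repair": 1.0},
    {"name": "never_confirm",  "base": 1.0, "confirm": 0.0, "repair": 4.0},
]

# With accurate recognition, skipping confirmation is cheaper; with
# poor recognition, confirming every keyword becomes the better choice.
print(best_strategy(strategies, 0.95)["name"])  # never_confirm
print(best_strategy(strategies, 0.60)["name"])  # always_confirm
```

The same comparison extends to any number of candidate strategies once their per-action costs are estimated.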
Authors:
Dinghua Guan, Institute of Acoustics, Chinese Academy of Sciences (China)
Min Chu, Institute of Acoustics, Chinese Academy of Sciences (China)
Quan Zhang, Institute of Acoustics, Chinese Academy of Sciences (China)
Jian Liu, Institute of Acoustics, Chinese Academy of Sciences (China)
Xiangdong Zhang, Institute of Acoustics, Chinese Academy of Sciences (China)
Page (NA) Paper number 245
Abstract:
This paper gives a brief introduction to the five-year research
project "Man-Computer Dialogue System in Chinese", which was supported
by the Chinese Academy of Sciences. The project is carried out in two
steps. In the first step, research was undertaken separately by several
research groups on core areas such as speech recognition, speech
synthesis, language understanding, and the dialogue organizing module.
In the second step, all techniques are assembled to form a demo dialogue
system for travel information inquiry. The current state of all the
above core areas and some evaluation results are discussed in the first
part of this paper, and the framework of the travel information inquiry
system is presented in the second part.
Authors:
Kate S. Hone, ICL Institute of Information Technology, University of Nottingham (U.K.)
David Golightly, ICL Institute of Information Technology, University of Nottingham (U.K.)
Page (NA) Paper number 519
Abstract:
An experiment was conducted to investigate the effects of vocabulary
constraints and syntax on human interactions with a speech interactive
system. Three dialogue styles for a telephone banking application,
all using constrained vocabularies, were compared: yes/no, menu and
query prompts. These styles differ both in the degree of vocabulary
constraint, and in how that constraint is communicated to the user.
It was found that, although it involved more dialogue steps, the yes/no
interaction style was the most effective in terms of both task completion
rates and performance time. The query strategy was least preferred
by users.
Authors:
Tatsuya Iwase, University of Tokyo (Japan)
Nigel Ward, University of Tokyo (Japan)
Page (NA) Paper number 224
Abstract:
To make human-computer dialog as `natural' as human-human dialog requires
paying attention to the timing of utterances. This is done with reference
to responses from the listener, in particular back-channel feedback,
questions and mumbles. On the basis of corpus analysis, we built a
direction-giving dialog system which adjusts the pace of the dialog
using only prosodic information; no word recognition is used. We devised
a method to evaluate how naturally a dialog system using only prosodic
information can talk to a human. To evaluate the naturalness of the
dialogs produced by our system, we ran three experiments with 10 subjects
each. The system accomplished natural dialog, and most subjects were
not aware that it was a computer. This fact, that
reasonably good performance was obtained by paying attention to prosodic
information alone, indicates the utility of using prosody in producing
appropriate timing in dialog. This confirms a commonly held belief.
Authors:
Annika Flycht-Eriksson, Department of Computer and Information Science, Linköping University (Sweden)
Arne Jönsson, Department of Computer and Information Science, Linköping University (Sweden)
Page (NA) Paper number 479
Abstract:
Spatial reasoning plays an important role in many spoken dialogue systems.
One application area where it is especially important is timetable
information for local bus traffic. Users of such systems often request
information based on vague spatial descriptions and a usable system
must be able to handle this. We have extended a dialogue system with
abilities to transform vague spatial expressions into a form that can
be used to access the information base. In our approach we use the
power of a Geographical Information System (GIS) for spatial reasoning.
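A minimal sketch of how a vague spatial expression such as "near the hospital" might be resolved into concrete stops via a GIS-style distance query; the stop names, coordinates, and radius below are invented for illustration:

```python
import math

# Hypothetical data: bus stops and landmarks as 2-D coordinates (km).
STOPS = {"Central Station": (0.0, 0.0),
         "Hospital North": (0.3, 0.4),
         "Airport": (5.0, 1.0)}
LANDMARKS = {"hospital": (0.0, 0.5)}

def stops_near(landmark, radius=1.0):
    """Resolve a vague expression like 'near the hospital' into the
    stops lying within a chosen radius of the landmark."""
    lx, ly = LANDMARKS[landmark]
    return sorted(name for name, (x, y) in STOPS.items()
                  if math.hypot(x - lx, y - ly) <= radius)

print(stops_near("hospital"))  # ['Central Station', 'Hospital North']
```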
Authors:
Candace A. Kamm, AT&T Labs - Research (USA)
Diane J. Litman, AT&T Labs - Research (USA)
Marilyn A. Walker, AT&T Labs - Research (USA)
Page (NA) Paper number 883
Abstract:
One challenge for current spoken dialogue systems is how to make the
limitations of the system (vocabulary, grammar, and application domain)
apparent to users. This study explored the use of a 4-minute tutorial
session to acquaint novice users with features of a spoken dialogue
system for accessing email. On three scenario-based tasks, novice users
who had the tutorial had task completion times and user satisfaction
ratings that were comparable to those of expert users. Novices who
did not experience the tutorial had significantly longer task completion
times on the initial task, but similar completion times to the tutorial
group on the final task. User satisfaction ratings of the no-tutorial
group were consistently lower than the ratings of the other two groups.
Evaluation using the PARADISE framework indicated that perceived task
completion, mean recognition score, and number of help requests were
significant predictors of user satisfaction with the system.
Authors:
Takeshi Kawabata, NTT Basic Research Laboratories (Japan)
Page (NA) Paper number 143
Abstract:
This paper proposes a new dialogue management architecture for human-machine
speech communication systems. In our daily speech communication, incremental,
non-deterministic and quick-response behaviors are required for effortless
information interchange. Emergent computational architectures, proposed
in the robot control domain, are promising for enabling such features.
The dialogue manager (ECL-DIALOG) consists of multiple "phrase pattern"
detectors as input sensors. The CFG-driven phrase detectors search
for phrase patterns in user utterances and generate numerous emergent
slot-filling signals. The system integrates them according to their
"phrase pattern" priorities and updates the current task-completion
context. When a slot value is updated, the system generates an appropriate
response. For example, when the system finds a new slot value in the
user's utterance, it generates a chiming-in utterance, "yeah". When
the context slot is replaced by a different value, "Tuesday", that has
a lower priority, the system asks for confirmation: "On Tuesday?".
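The priority-based slot update behaviour described above can be sketched roughly as follows; this is an illustrative reconstruction, not the ECL-DIALOG implementation, and the slot names and priorities are invented:

```python
# Illustrative priority-based slot update; slot names, priorities, and
# responses are invented examples.

def update_slot(context, slot, value, priority):
    """Fill a slot from an emergent phrase-detector signal and return
    the system's reaction: acknowledge a new or higher-priority value,
    but ask for confirmation before a lower-priority signal overwrites
    an already-filled slot."""
    current = context.get(slot)
    if current is None:
        context[slot] = (value, priority)
        return "yeah"                    # back-channel for a new value
    cur_value, cur_priority = current
    if value != cur_value and priority < cur_priority:
        return "On %s?" % value          # tentative replacement: confirm first
    context[slot] = (value, priority)
    return "yeah"

context = {}
print(update_slot(context, "day", "Monday", priority=2))   # yeah
print(update_slot(context, "day", "Tuesday", priority=1))  # On Tuesday?
```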
Authors:
Tadahiko Kumamoto, Communications Research Laboratory, MPT of Japan (Japan)
Akira Ito, Yamagata University (Japan)
Page (NA) Paper number 493
Abstract:
Many researchers have been developing natural language dialogue systems
as a human-friendly man-machine interface. However, the human factors
in man-machine dialogue, in particular how people talk with a dialogue
system, are not yet well understood. We analyzed, at the utterance
and dialogue levels, the 141 dialogues which our dialogue system had
in DiaLeague '97, the second dialogue contest in which a dialogue system
engaged in a dialogue with a human in order to solve a specific problem.
For the analyses at the utterance level,
we investigated the users' speaking styles, the richness of the users'
utterances in a variety of surface patterns, and the influence of the
system's utterance pattern on the users' utterance. For the analyses
at the dialogue level, we investigated the instances of confusion observed
in the 141 dialogues and we also show how the users behaved when the
confusion occurred.
Authors:
Michael F. McTear, University of Ulster (U.K.)
Page (NA) Paper number 545
Abstract:
The development of a spoken dialogue system is a complex process involving
the integration of several component technologies. Various toolkits
and authoring environments have been produced that provide assistance
with this process. This paper reports on several projects involving
CSLU's RAD (Rapid Application Developer) and critically evaluates the
applicability of state transition diagrams for modelling different
types of spoken dialogue. State transition methods have been recommended
for dialogues that involve well-structured tasks that can be mapped
directly onto a dialogue structure. However, other significant factors
to be considered include the structure of the information to be transacted
and the need for verification of the user's input as determined by
the system's level of recognition accuracy. Examples of different types
of dialogue are presented together with recommendations concerning
the advantages and disadvantages of state transition based dialogue
control.
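A state transition dialogue of the well-structured kind discussed above can be sketched as a small graph; the states, prompts, and input categories below are invented examples, not taken from RAD:

```python
# Invented example of state-transition dialogue control: each state
# maps to a prompt and a table of transitions keyed by the category
# of the recognized user input.

DIALOGUE = {
    "ask_from": ("Where are you travelling from?", {"city": "ask_to"}),
    "ask_to":   ("Where are you travelling to?",   {"city": "confirm"}),
    "confirm":  ("Shall I book that journey?",     {"yes": "done",
                                                    "no": "ask_from"}),
}

def run(inputs, state="ask_from"):
    """Walk the dialogue graph, consuming one classified input per
    turn; unrecognized inputs re-enter the same state (re-prompt)."""
    prompts = []
    for user_input in inputs:
        prompt, transitions = DIALOGUE[state]
        prompts.append(prompt)
        state = transitions.get(user_input, state)
        if state == "done":
            break
    return prompts, state

prompts, final_state = run(["city", "city", "yes"])
print(final_state)  # done
```

This directness is what makes the approach attractive for well-structured tasks, and what becomes limiting when verification needs depend on recognition accuracy.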
Authors:
Michio Okada, ATR Media Integration & Communications Research Laboratories (Japan)
Noriko Suzuki, ATR Media Integration & Communications Research Laboratories (Japan)
Jacques Terken, IPO, Center for Research on User-System Interaction, Eindhoven University of Technology (The Netherlands)
Page (NA) Paper number 801
Abstract:
In this paper, we present a general framework and architecture for
maintaining dialogue coordination in spoken dialogue systems, in which
intended behaviors and goals are incrementally performed during the
course of maintaining dialogue coordination. The dialogue structure
emerges as a result of the interaction between the user and the dialogue
system. The key feature of this design is the use of multiple
situated-agents for coordinating communicative acts that are realized
as a hierarchy of autonomous behaviors by using a subsumption architecture.
In this architecture it should be noted that the lower-level behaviors
act autonomously for maintaining the dialogue coordination and are
linked to the specifications from higher-level behaviors for dialogue
management. In order to make the behavior of the system social, in
general, the maintaining of dialogue coordination takes priority over
the realization of intended goals of the system as a dialogue participant.
We introduce an under-specification strategy for controlling the preference
of the concurrent behaviors. This is in contrast to the classical,
top-down approach to dialogue coordination.
Authors:
Xavier Pouteau, IPO, Center for Research on User-System Interaction (The Netherlands)
Luis Arévalo, Robert Bosch GmbH, Corporate R&D (Germany)
Page (NA) Paper number 368
Abstract:
In this paper, we report the significant results of a fully-implemented
voice operated dialogue system, and particularly its main component:
the Dialogue Manager (DM). Like other interfaces, spoken interfaces
require careful design, implying a good analysis of the users' needs
throughout the dialogue. The VODIS project has led to the design and
development of a spoken interface for the control of car equipment.
Given the workload caused by the task of driving the vehicle, spoken
communication provides a potentially safe and efficient mode of operating
the car equipment. We present the main characteristics of the task
model specified during the design stage, and show how its specific
features related to spoken communication allowed us to implement a
robust dialogue.
Authors:
Daniel Willett, Gerhard-Mercator-University, Duisburg (Germany)
Arno Römer, Gerhard-Mercator-University, Duisburg (Germany)
Jörg Rottland, Gerhard-Mercator-University, Duisburg (Germany)
Gerhard Rigoll, Gerhard-Mercator-University, Duisburg (Germany)
Page (NA) Paper number 524
Abstract:
In this paper, we present the basic design principles and architecture
of a dialogue system for scheduling appointments. This mixed-initiative
dialogue system integrates an automatic speaker-independent speech
recognition engine for continuously spoken German, a speech synthesizer
and a scheduler database application to build up a scheduler that is
purely driven by natural continuous speech and thus does not need
any visual display device. With these properties, it is a prototype
for a speech-driven palm-size computer application and could be integrated
into miniature computers that come with no display device at all.
Authors:
Chung-Hsien Wu, National Cheng Kung University (Taiwan)
Gwo-Lang Yan, National Cheng Kung University (Taiwan)
Chien-Liang Lin, National Cheng Kung University (Taiwan)
Page (NA) Paper number 219
Abstract:
In a spoken dialogue system, the intention is the most important component
for speech understanding. In this paper, we propose a corpus-based
hidden Markov model (HMM) to model the intention of a sentence. Each
intention is represented by a sequence of word segment categories determined
by a task-specific lexicon and a corpus. In the training procedure,
five intention HMMs are defined, each representing one intention.
In the intention identification process, the phrase sequence is fed
to each intention HMM. Given a speech utterance, the Viterbi algorithm
is used to find the most likely intention sequence. The
intention HMM considers not only the phrase frequency but also the
syntactic and semantic structure in a phrase sequence. In order to
evaluate the proposed method, a spoken dialogue model for air travel
information service is investigated. The experiments were carried out
using a test database from 25 speakers (15 male and 10 female). There
are 120 dialogues, which contain 725 sentences in the test database.
The experimental results show that the correct response rate reaches
about 80.3% using the intention HMMs.
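The intention identification step can be illustrated with a toy version of the per-intention HMM scoring; the states, categories, and probabilities below are invented, and the real system's lexicon and training differ:

```python
import math

def viterbi_log_prob(obs, states, start, trans, emit):
    """Log-probability of the best state path for one observation
    sequence under one HMM (standard Viterbi recursion)."""
    v = {s: math.log(start[s]) + math.log(emit[s][obs[0]]) for s in states}
    for o in obs[1:]:
        v = {s: max(v[r] + math.log(trans[r][s]) for r in states)
                + math.log(emit[s][o]) for s in states}
    return max(v.values())

def identify_intention(obs, intention_hmms):
    """Score the word-segment-category sequence under each intention
    HMM and return the highest-scoring intention."""
    return max(intention_hmms,
               key=lambda name: viterbi_log_prob(obs, *intention_hmms[name]))

# Two toy intentions over the categories "city" and "date".
states = ("s1", "s2")
start = {"s1": 0.9, "s2": 0.1}
trans = {"s1": {"s1": 0.6, "s2": 0.4}, "s2": {"s1": 0.4, "s2": 0.6}}
hmms = {
    "ask_flight": (states, start, trans,
                   {"s1": {"city": 0.7, "date": 0.3},
                    "s2": {"city": 0.2, "date": 0.8}}),
    "ask_fare":   (states, start, trans,
                   {"s1": {"city": 0.3, "date": 0.7},
                    "s2": {"city": 0.8, "date": 0.2}}),
}

print(identify_intention(["city", "city", "date"], hmms))  # ask_flight
```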
Authors:
Peter Wyard, BT Labs (U.K.)
Gavin Churcher, BT Labs (U.K.)
Page (NA) Paper number 556
Abstract:
This paper describes a Wizard of Oz (WOZ) system that allows the realistic
simulation of a multimodal spoken language system. A Wizard protocol
has been drawn up which means that the WOZ system will simulate the
limitations of an automatic system rather than allow the user to engage
in the full range of human-human dialogue. In support of this protocol
is a sophisticated Wizard response panel and underlying response generation
functionality. This enables the Wizard to respond to complex multimodal
inputs in near real-time. The chosen application is a 3D retail service,
in which users can select furnishings from a database according to
colour, pattern, fabric type, etc., transfer furnishings to objects
in a virtual showroom, ask about prices and matching of fabrics, etc.
The system includes a "virtual assistant", i.e. a synthetic persona
which speaks the verbal system output. Users make their input by a
combination of fluent speech and touchscreen input. The paper describes
a formal trial carried out with the WOZ system, and discusses the results.
Authors:
Yen-Ju Yang, Dept. of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan (Taiwan)
Lin-Shan Lee, Dept. of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan (Taiwan)
Page (NA) Paper number 528
Abstract:
This paper presents a syllable-based Chinese spoken dialogue system
for telephone directory services primarily trained with a corpus. It
integrates automatic phrase extraction, robust phrase spotting, statistics-based
semantic parsing by phrase-concept joint language model as well as
concept-based dialogue model, and intention identification by probabilistic
finite state network to form a speech intention estimator. By applying
the proposed techniques, the concept sequence conveyed in the user's
spoken sentence, i.e. the speaker's intention, can be identified with
the maximum a-posteriori (MAP) probability based on intra- and
inter-sentence considerations. This approach is convenient to train
from a given corpus and flexible to port to different dialogue tasks.
Incorporating a mixed-initiative, goal-oriented dialogue manager, we
have successfully developed a dialogue system for telephone directory
service. Very promising results have been obtained in on-line tests.
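The MAP selection of a concept sequence, combining an intra-sentence phrase-concept score with an inter-sentence concept-transition score, can be sketched as follows; all concepts, phrases, and probabilities are invented for illustration:

```python
import itertools
import math

# Toy phrase-given-concept emissions and concept-transition probabilities.
CONCEPTS = ("NAME", "DEPT")
EMIT = {"NAME": {"chang": 0.8, "computer science": 0.2},
        "DEPT": {"chang": 0.1, "computer science": 0.9}}
TRANS = {"<s>":  {"NAME": 0.5, "DEPT": 0.5},
         "NAME": {"NAME": 0.2, "DEPT": 0.8},
         "DEPT": {"NAME": 0.7, "DEPT": 0.3}}

def sequence_score(seq, phrases, emit, trans, prev="<s>"):
    """Joint log-score: inter-sentence concept transitions combined
    with intra-sentence phrase-given-concept emissions."""
    score = 0.0
    for concept, phrase in zip(seq, phrases):
        score += math.log(trans[prev][concept]) + math.log(emit[concept][phrase])
        prev = concept
    return score

def map_concept_sequence(phrases, concepts=CONCEPTS, emit=EMIT, trans=TRANS):
    """Exhaustively score every concept labelling and return the MAP one."""
    return max(itertools.product(concepts, repeat=len(phrases)),
               key=lambda seq: sequence_score(seq, phrases, emit, trans))

print(map_concept_sequence(["chang", "computer science"]))  # ('NAME', 'DEPT')
```

A real system would replace the exhaustive search with dynamic programming, but the argmax being computed is the same.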
Authors:
Hiroyuki Yano, Communications Research Laboratory (Japan)
Akira Ito, Faculty of Engineering, Yamagata University (Japan)
Page (NA) Paper number 719
Abstract:
Analysis was made of disagreement expressions in dialogues recorded
in a cooperative task experiment. A disagreement expression is defined
as the latter of two consecutive utterances when it shows disagreement
with the former. Subjects used two types of disagreement expressions:
disagreement with the partner's utterance and disagreement with their
own. These were classified
into three subtypes according to part of speech: conjunction, interjection,
and content word. The role of disagreement expressions in cooperative
tasks was examined. It was found that subjects used disagreement expressions
suited to the occasion in order to maintain good relations with their partners.
It was concluded that using expressions that disagree with one's own
previous utterance is an effective strategy for expressing an opinion
for which one lacks adequate evidence and for eliciting utterances
from one's partner.