Spoken Language Understanding Systems 2

1: ICSLP'98 Proceedings

Automatic Ambiguity Detection

Authors:

Richard Sproat, Bell Laboratories, Lucent Technologies (USA)
Jan P.H. van Santen, Bell Laboratories, Lucent Technologies (USA)

Paper number 41

Abstract:

Most work on sense disambiguation presumes that one knows beforehand (e.g. from a thesaurus) a set of polysemous terms. But published lists invariably give only partial coverage. For example, the English word tan has several obvious senses, but one may overlook the abbreviation for tangent. In this paper, we present an algorithm for identifying interesting polysemous terms and measuring their degree of polysemy, given an unlabeled corpus. The algorithm involves: (i) collecting all terms within a k-term window of the target term; (ii) computing the inter-term distances of the contextual terms, and reducing the multi-dimensional distance space to two dimensions using standard methods; (iii) converting the two-dimensional representation into radial coordinates and using isotonic/antitonic regression to compute the degree to which the distribution deviates from a single-peak model. The amount of deviation is the proposed polysemy index.
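
Step (iii) can be illustrated with a minimal sketch. It assumes, purely for illustration, that the radial representation has already been binned into a histogram of counts over angle. The binning, the function names `pava` and `unimodal_deviation`, and the squared-error measure are assumptions and are not taken from the paper. The sketch also ignores the circular wrap-around of angles.

```python
from math import inf

def pava(values):
    """Least-squares non-decreasing (isotonic) fit via pool-adjacent-violators."""
    blocks = []  # each block holds [sum, count] of pooled values
    for v in values:
        blocks.append([float(v), 1])
        # Merge blocks while the previous block mean exceeds the current one.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit

def squared_error(values, fit):
    return sum((a - b) ** 2 for a, b in zip(values, fit))

def unimodal_deviation(hist):
    """Smallest squared error of any rise-then-fall (single-peak) fit.

    For every split point, the left part gets an isotonic fit and the
    right part an antitonic fit (isotonic fit of the reversed values).
    A result of 0 means the histogram is already single-peaked.
    """
    best = inf
    for peak in range(len(hist) + 1):
        left, right = hist[:peak], hist[peak:]
        rising = pava(left)
        falling = pava(right[::-1])[::-1]
        best = min(best, squared_error(left, rising) + squared_error(right, falling))
    return best

# Hypothetical angle histograms: one single-peaked, one with several peaks.
single_peak = [1, 3, 5, 3, 1]
multi_peak = [1, 5, 1, 5, 1]
print(unimodal_deviation(single_peak), unimodal_deviation(multi_peak))
```

Under these assumptions, a larger returned value plays the role of a larger polysemy index: the distribution is further from a single-peak shape.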

SL980041.PDF (From Author) SL980041.PDF (Rasterized)


Empowering Knowledge Based Speech Understanding through Statistics

Authors:

Julia Fischer, Chair for Pattern Recognition, University of Erlangen-Nuremberg (Germany)
Juergen Haas, Chair for Pattern Recognition, University of Erlangen-Nuremberg (Germany)
Elmar Nöth, Chair for Pattern Recognition, University of Erlangen-Nuremberg (Germany)
Heinrich Niemann, Chair for Pattern Recognition, University of Erlangen-Nuremberg (Germany)
Frank Deinzer, Chair for Pattern Recognition, University of Erlangen-Nuremberg (Germany)

Paper number 369

Abstract:

In this paper we present an innovative approach to speech understanding which is based on a fine-grained knowledge representation automatically compiled from a semantic network and on iterative optimization. Besides allowing an efficient exploitation of parallelism, any-time capability is provided, since after each iteration step a (sub-)optimal solution is always available. We apply this approach to a real-world task: a dialog system able to answer queries about the German train timetable. In order to speed up the search for the best interpretation of an utterance, we make use of statistical methods, e.g. neural networks, n-grams, and classification trees, which are trained on application-relevant utterances collected over the public telephone network. At the moment, the real-time factor for interpreting the user's initial utterance is 0.7.
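
The any-time property described above can be sketched with a toy iterative optimizer. This is not the system's algorithm: the integer search space, the neighbourhood, the scoring function, and the name `anytime_optimize` are all hypothetical. The point is only that a best-so-far solution is available after every iteration, so the caller can stop whenever time runs out.

```python
def anytime_optimize(initial, neighbors, score, max_iters):
    """Greedy iterative improvement that yields the best solution so far
    after every iteration, so a caller may stop at any time."""
    best, best_score = initial, score(initial)
    for _ in range(max_iters):
        candidate = max(neighbors(best), key=score)
        candidate_score = score(candidate)
        if candidate_score <= best_score:
            # No neighbour improves the solution: report it and stop.
            yield best, best_score
            return
        best, best_score = candidate, candidate_score
        yield best, best_score

# Toy problem (hypothetical): find the integer closest to 7.
def toy_score(x):
    return -(x - 7) ** 2

def toy_neighbors(x):
    return [x - 1, x + 1]

results = list(anytime_optimize(0, toy_neighbors, toy_score, max_iters=10))
early_answer = results[2]    # what a caller would get after three iterations
final_answer = results[-1]   # what a caller would get after convergence
print(early_answer, final_answer)
```

Stopping early returns a usable but sub-optimal answer, while running to convergence returns the optimum of this toy problem.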

SL980369.PDF (From Author) SL980369.PDF (Rasterized)


Concept-Driven Speech Understanding Incorporated with a Statistic Language Model

Authors:

Akito Nagai, Information Technology R&D Center, MITSUBISHI Electric Corporation (Japan)
Yasushi Ishikawa, Information Technology R&D Center, MITSUBISHI Electric Corporation (Japan)

Paper number 1023

Abstract:

We have proposed a method of concept-driven semantic interpretation based on general semantic knowledge of conceptual dependency. In our approach, a concept is a unit of semantic interpretation and an utterance is regarded as a sequence of concepts that convey an intention. However, a considerable number of accepted results were not syntactically meaningful. This is because the order in which linguistic features occurred in the sequence of concepts was not taken into account in constructing the whole meaning from the concepts: only semantic constraints were used to attain linguistic robustness. Therefore, we introduce a statistical language model which calculates the plausibility of a sequence of concepts from the point of view of the order in which shallow linguistic features occur. Experimental results of speech understanding for 1000-word-vocabulary spontaneous speech show that the proposed method significantly improves the system performance.
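
As a rough illustration of scoring the order of a concept sequence, the sketch below uses a bigram model with add-one smoothing over concept labels. This is a stand-in and not the authors' model: the abstract refers to shallow linguistic features, whereas the labels, the training sequences, and the function names here are hypothetical.

```python
from collections import Counter
import math

START, END = "<s>", "</s>"

def train_bigram(sequences):
    """Count label bigrams over training sequences padded with boundary marks."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for seq in sequences:
        padded = [START] + list(seq) + [END]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, len(vocab)

def log_plausibility(seq, model):
    """Log-probability of a label sequence under the add-one smoothed bigram model."""
    context_counts, bigram_counts, vocab_size = model
    padded = [START] + list(seq) + [END]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        prob = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + vocab_size)
        total += math.log(prob)
    return total

# Hypothetical training data: concept label sequences of accepted utterances.
training = [
    ["DATE", "ORIGIN", "DESTINATION"],
    ["ORIGIN", "DESTINATION", "TIME"],
    ["DATE", "ORIGIN", "DESTINATION", "TIME"],
]
model = train_bigram(training)
seen_order = log_plausibility(["ORIGIN", "DESTINATION"], model)
unseen_order = log_plausibility(["DESTINATION", "ORIGIN"], model)
print(seen_order > unseen_order)
```

A hypothesis whose concepts appear in an order seen during training receives a higher score than the same concepts in an unseen order, which is the kind of ordering constraint that semantic knowledge alone does not supply.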

SL981023.PDF (From Author) SL981023.PDF (Rasterized)


On The Limitations of Stochastic Conceptual Finite-State Language Models For Speech Understanding

Authors:

José Colás, Grupo de Tecnología del Habla - Departamento de Ingeniería Electrónica - E. T. S. I. Telecomunicación - Universidad Politécnica de Madrid (Spain)
Javier Ferreiros, Grupo de Tecnología del Habla - Departamento de Ingeniería Electrónica - E. T. S. I. Telecomunicación - Universidad Politécnica de Madrid (Spain)
Juan Manuel Montero, Grupo de Tecnología del Habla - Departamento de Ingeniería Electrónica - E. T. S. I. Telecomunicación - Universidad Politécnica de Madrid (Spain)
Julio Pastor, Grupo de Tecnología del Habla - Departamento de Ingeniería Electrónica - E. T. S. I. Telecomunicación - Universidad Politécnica de Madrid (Spain)
Ascensión Gallardo, Grupo de Tecnología del Habla - Departamento de Ingeniería Electrónica - E. T. S. I. Telecomunicación - Universidad Politécnica de Madrid (Spain)
José Manuel Pardo, Grupo de Tecnología del Habla - Departamento de Ingeniería Electrónica - E. T. S. I. Telecomunicación - Universidad Politécnica de Madrid (Spain)

Paper number 1095

Abstract:

In limited-domain tasks (e.g. airline reservation, database retrieval, etc.), many robust understanding systems, designed for both speech and text input, have been implemented [1][3][5][6][10] based on the Stochastic Conceptual Finite-State paradigm (Semantic Network) or the CHRONUS paradigm [11] (Conceptual Hidden Representation of Natural Unconstrained Speech), which establishes relations between conceptual entities through a probabilistic graph-like structure. Using this kind of grammar to model semantic information has limitations, which we analysed while implementing a flexible architecture for a robust information retrieval system based on the same paradigm [1]. We have tried to solve some of them by integrating a set of conceptual probabilistic and non-probabilistic grammars, which allow a certain complexity in the functionality of the application, such as applying non-SQL functions to the results of SQL queries in order to retrieve information not explicitly included in the database, and translating certain natural spoken sentences (that would produce difficult embedded queries and therefore more natural queries) without so many restrictions on the relative position of inter-concept relationships.
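
The idea of applying a non-SQL function to the results of an SQL query can be sketched as follows. The `flights` table, its columns, the sample rows, and the function `flight_durations` are hypothetical and are not taken from the paper. The example answers a question about flight duration, a value that is not stored in the database but can be derived from the query result.

```python
import sqlite3
from datetime import datetime

def flight_durations(conn, origin, destination):
    """Run an SQL query, then derive durations (in minutes) outside SQL."""
    rows = conn.execute(
        "SELECT flight, departs, arrives FROM flights "
        "WHERE origin = ? AND destination = ? ORDER BY flight",
        (origin, destination),
    ).fetchall()
    fmt = "%H:%M"
    durations = {}
    for flight, departs, arrives in rows:
        delta = datetime.strptime(arrives, fmt) - datetime.strptime(departs, fmt)
        durations[flight] = int(delta.total_seconds() // 60)
    return durations

# Hypothetical in-memory database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights "
    "(flight TEXT, origin TEXT, destination TEXT, departs TEXT, arrives TEXT)"
)
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?, ?, ?)",
    [
        ("X1", "MAD", "BCN", "08:00", "09:15"),
        ("X2", "MAD", "BCN", "17:30", "18:50"),
        ("X3", "MAD", "SVQ", "10:00", "11:05"),
    ],
)
durations = flight_durations(conn, "MAD", "BCN")
print(durations)
```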

SL981095.PDF (From Author) SL981095.PDF (Rasterized)


Towards Speech Understanding Across Multiple Languages

Authors:

Todd Ward, IBM (USA)
Salim Roukos, IBM (USA)
Chalapathy Neti, IBM (USA)
Jerome Gros, IBM (formerly of) (USA)
Mark Epstein, IBM (USA)
Satya Dharanipragada, IBM (USA)

Paper number 400

Abstract:

In this paper we describe our initial efforts in building a natural language understanding (NLU) system across multiple languages. The system allows users to switch languages seamlessly in a single session without requiring any switch in the speech recognition system. Context-dependence is maintained across utterances, even when the user changes languages. Towards this end we have begun building a universal speech recognizer for English and French. We experiment with a common phonology for both French and English with a novel mechanism to handle language-dependent variations. Our best results so far show about 5% relative performance degradation for English relative to a unilingual English system and a 9% relative degradation in French relative to a unilingual French system. The NLU system uses the same statistical understanding algorithms for each language, making system development, maintenance and portability vastly superior to systems custom-built for each language.
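
For readers unfamiliar with relative degradation figures, the arithmetic can be sketched as below. The abstract does not say which performance measure is used, so the sketch assumes an error rate, and the numbers in the example are placeholders chosen only to show the calculation, not results from the paper.

```python
def relative_degradation(unilingual_error, multilingual_error):
    """Relative increase in error rate when moving from a unilingual
    system to the shared multilingual one (assumed measure)."""
    if unilingual_error <= 0:
        raise ValueError("unilingual_error must be positive")
    return (multilingual_error - unilingual_error) / unilingual_error

# Placeholder values: an error rate rising from 20.0 to 21.0
# is a 5% relative degradation (1.0 / 20.0).
example = relative_degradation(20.0, 21.0)
print(f"{example:.0%}")
```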

SL980400.PDF (From Author) SL980400.PDF (Rasterized)


Automatic Detection of Sentence Boundaries and Disfluencies Based on Recognized Words

Authors:

Andreas Stolcke, SRI International (USA)
Elizabeth Shriberg, SRI International (USA)
Rebecca Bates, Boston University (USA)
Mari Ostendorf, Boston University (USA)
Dilek Hakkani, SRI International (USA)
Madelaine Plauche, SRI International (USA)
Gökhan Tür, SRI International (USA)
Yu Lu, SRI International (USA)

Paper number 59

Abstract:

We study the problem of detecting linguistic events at interword boundaries, such as sentence boundaries and disfluency locations, in speech transcribed by an automatic recognizer. Recovering such events is crucial to facilitate speech understanding and other natural language processing tasks. Our approach is based on a combination of prosodic cues modeled by decision trees, and word-based event N-gram language models. Several model combination approaches are investigated. The techniques are evaluated on conversational speech from the Switchboard corpus. Model combination is shown to give a significant win over individual knowledge sources.
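
One simple way to combine two knowledge sources is linear interpolation of their posterior probabilities at each interword boundary. The abstract does not state which combination approaches were investigated, so this scheme, the weight, the threshold, and the probabilities below are assumptions for illustration only.

```python
def combine_posteriors(p_prosody, p_lm, weight=0.5):
    """Linearly interpolate per-boundary event posteriors from two models."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(p_prosody) != len(p_lm):
        raise ValueError("both models must score the same boundaries")
    return [weight * a + (1.0 - weight) * b for a, b in zip(p_prosody, p_lm)]

def detect_events(posteriors, threshold=0.5):
    """Indices of boundaries whose combined posterior reaches the threshold."""
    return [i for i, p in enumerate(posteriors) if p >= threshold]

# Hypothetical posteriors for a sentence boundary at three interword positions.
prosody_scores = [0.9, 0.2, 0.6]   # e.g. from a prosodic decision tree
lm_scores = [0.7, 0.1, 0.3]        # e.g. from an event N-gram language model
combined = combine_posteriors(prosody_scores, lm_scores, weight=0.5)
boundaries = detect_events(combined)
print(boundaries)
```

In this toy case the third position is flagged by the prosodic scores alone but not after combination, which shows how a second knowledge source can change the decision.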

SL980059.PDF (From Author) SL980059.PDF (Rasterized)
