Language Modeling and Understanding

Chair: Roberto Pieraccini, AT&T Labs, USA



Statistics-Based Segment Pattern Lexicon - A New Direction for Chinese Language Modeling

Authors:

Kae-Cherng Yang, National Taiwan University (Taiwan)
Tai-Hsuan Ho, National Taiwan University (Taiwan)
Lee-Feng Chien, Institute of Information Science, Academia Sinica (Taiwan)
Lin-Shan Lee, Institute of Information Science, Academia Sinica (Taiwan)

Volume 1, Page 169, Paper number 2395

Abstract:

This paper presents a new direction for Chinese language modeling based on a different concept of lexicon. Because every Chinese character has its own meaning, because there are no "blanks" in Chinese sentences to serve as word boundaries, and because the wording structure of the Chinese language is extremely flexible, "words" in Chinese are not well defined and no commonly accepted lexicon exists. This makes language modeling for Chinese very complicated and the "out-of-vocabulary" (OOV) problem especially serious. A new concept of lexicon is therefore proposed in this paper. The elements of this lexicon can be words or any other "segment patterns"; they are extracted from the training corpus by statistical approaches with the goal of minimizing the overall perplexity. Language models can then be developed based on this new lexicon. Very encouraging experimental results have been obtained.
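
As a rough illustration of the extraction criterion (a minimal sketch under simplifying assumptions, not the authors' algorithm), the following Python grows a segment-pattern lexicon greedily: a candidate character segment is kept only if adding it lowers the per-character perplexity of a unigram model over the re-segmented corpus. The greedy longest-match segmenter and all function names are illustrative assumptions.

```python
import math
from collections import Counter

def segment(text, lexicon, max_len=4):
    """Greedy longest-match segmentation; single characters always match."""
    out, i = [], 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            if n == 1 or text[i:i+n] in lexicon:
                out.append(text[i:i+n])
                i += n
                break
    return out

def per_char_perplexity(tokens, n_chars):
    """Unigram cross-entropy of the token stream, normalized per character
    so that segmentations with different token counts stay comparable."""
    counts, total = Counter(tokens), len(tokens)
    logprob = sum(c * math.log(c / total) for c in counts.values())
    return math.exp(-logprob / n_chars)

def build_lexicon(corpus, max_len=4, min_count=2):
    """Greedily keep candidate segments that lower per-character perplexity."""
    candidates = Counter(corpus[i:i+n]
                         for n in range(2, max_len + 1)
                         for i in range(len(corpus) - n + 1))
    lexicon = set()
    best = per_char_perplexity(segment(corpus, lexicon), len(corpus))
    for seg, cnt in candidates.most_common():
        if cnt < min_count:
            break
        trial = per_char_perplexity(segment(corpus, lexicon | {seg}), len(corpus))
        if trial < best:
            lexicon.add(seg)
            best = trial
    return lexicon
```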

ic982395.pdf (From Postscript)




Building Class-based Language Models with Contextual Statistics

Authors:

Shuanghu Bai, Institute of Systems Science, National University of Singapore (Singapore)
Haizhou Li, Institute of Systems Science, National University of Singapore (Singapore)
Zhiwei Lin, Institute of Systems Science, National University of Singapore (Singapore)
Baosheng Yuan, Institute of Systems Science, National University of Singapore (Singapore)

Volume 1, Page 173, Paper number 1662

Abstract:

In this paper, novel clustering algorithms for class-based language models are proposed that use the contextual statistics of words. Minimum discrimination information (MDI) is used as the distance measure. Three algorithms are implemented to build bigram language models for a vocabulary of 50,000 words over a corpus of 200 million words, and both the computational cost of the algorithms and the perplexity of the resulting LMs are studied. Comparisons between the MDI algorithm and a maximum mutual information (MMI) algorithm demonstrate the effectiveness and efficiency of the new algorithms. It is shown that the MDI approach makes tree-building clustering feasible for large vocabularies.
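
The following sketch illustrates one way an MDI-style distance can drive agglomerative word clustering (a toy under simplifying assumptions, not the paper's algorithms): each word is represented by its right-context distribution, and the distance between two clusters is the discrimination information lost when their distributions are pooled.

```python
import math
from collections import Counter, defaultdict

def context_dists(words):
    """Right-context histogram of each word, normalized to a distribution."""
    ctx = defaultdict(Counter)
    for w, nxt in zip(words, words[1:]):
        ctx[w][nxt] += 1
    return {w: {k: v / sum(c.values()) for k, v in c.items()}
            for w, c in ctx.items()}

def mdi_distance(p, wp, q, wq):
    """Discrimination information lost when distributions p and q are pooled."""
    m = {k: (wp * p.get(k, 0.0) + wq * q.get(k, 0.0)) / (wp + wq)
         for k in set(p) | set(q)}
    kl = lambda d: sum(v * math.log(v / m[k]) for k, v in d.items() if v > 0)
    return (wp * kl(p) + wq * kl(q)) / (wp + wq)

def agglomerate(words, n_classes):
    """Repeatedly merge the pair of clusters at minimum MDI distance.
    O(V^2) per merge; real systems need far more efficient bookkeeping."""
    freq = Counter(words)
    clusters = {w: ([w], d, freq[w]) for w, d in context_dists(words).items()}
    while len(clusters) > n_classes:
        keys = list(clusters)
        a, b = min(((u, v) for i, u in enumerate(keys) for v in keys[i + 1:]),
                   key=lambda uv: mdi_distance(clusters[uv[0]][1], clusters[uv[0]][2],
                                               clusters[uv[1]][1], clusters[uv[1]][2]))
        (ma, wa), (mb, wb) = clusters[a][1:], clusters[b][1:]
        merged = {k: (wa * ma.get(k, 0.0) + wb * mb.get(k, 0.0)) / (wa + wb)
                  for k in set(ma) | set(mb)}
        clusters[a] = (clusters[a][0] + clusters[b][0], merged, wa + wb)
        del clusters[b]
    return [members for members, _, _ in clusters.values()]
```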

ic981662.pdf (From Postscript)




Comparison of Part-of-Speech and Automatically Derived Category-Based Language Models for Speech Recognition

Authors:

Thomas R. Niesler, Cambridge University (U.K.)
Edward W.D. Whittaker, Cambridge University (U.K.)
Philip C. Woodland, Cambridge University (U.K.)

Volume 1, Page 177, Paper number 2003

Abstract:

This paper compares various category-based language models when used in linear interpolation with a word-based trigram. Categories corresponding to parts of speech as well as automatically clustered groupings are considered. The category-based model employs variable-length n-grams and permits each word to belong to multiple categories. Relative word error rate reductions of between 2% and 7% over the baseline are achieved in N-best rescoring experiments on the Wall Street Journal corpus. The largest improvement is obtained with a model using automatically determined categories. Perplexity continues to decrease as the number of categories is increased, but the improvement in word error rate reaches an optimum.
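
A minimal sketch of the combination described above, assuming toy model interfaces: the category-based probability sums over the (possibly multiple) categories a word belongs to, and is linearly interpolated with the word trigram. All function names are illustrative.

```python
def p_category(w, history, p_cat_given_hist, p_word_given_cat, categories_of):
    """P(w | h) = sum over categories c of w: P(c | h) * P(w | c)."""
    return sum(p_cat_given_hist(c, history) * p_word_given_cat(w, c)
               for c in categories_of(w))

def p_interpolated(w, history, p_trigram, p_cat, lam):
    """Linear interpolation: lam * P_word(w | h) + (1 - lam) * P_category(w | h)."""
    return lam * p_trigram(w, history) + (1 - lam) * p_cat(w, history)
```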

ic982003.pdf (From Postscript)




Balancing Acoustic and Linguistic Probabilities

Authors:

Atsunori Ogawa, Nagoya University (Japan)
Kazuya Takeda, Nagoya University (Japan)
Fumitada Itakura, Nagoya University (Japan)

Volume 1, Page 181, Paper number 1986

Abstract:

An n-gram language model assigns local probabilities and does not take the length of the word sequence into account. As a consequence, the optimal values of the language weight and word insertion penalty used to balance acoustic and linguistic probabilities depend on the length of the word sequence. To deal with this problem, a new language model is developed based on a Bernoulli trial model that takes the length of the word sequence into account. Recognition experiments confirm that, compared with the normal n-gram model, the proposed method achieves not only better recognition accuracy but also more robust balancing with the acoustic probability.
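
For context, a hedged sketch of the score combination at issue: the language weight and word insertion penalty compensate for the n-gram model's indifference to sentence length, and an explicit length model (here a toy geometric/Bernoulli-trial form; the paper's exact formulation may differ) is one way to make the balance less length-dependent.

```python
import math

def decode_score(log_p_acoustic, log_p_ngram, n_words, alpha, beta):
    """Conventional combination: acoustic score + weighted LM + insertion penalty."""
    return log_p_acoustic + alpha * log_p_ngram + beta * n_words

def log_p_length(n_words, p_continue):
    """Toy Bernoulli-trial length model: extend with prob p, stop with 1 - p."""
    return n_words * math.log(p_continue) + math.log(1.0 - p_continue)

def length_aware_score(log_p_acoustic, log_p_ngram, n_words, alpha, p_continue):
    """LM probability augmented with an explicit sentence-length term."""
    return log_p_acoustic + alpha * (log_p_ngram + log_p_length(n_words, p_continue))
```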

ic981986.pdf (From Postscript)




Initial Language Models for Spoken Dialogue Systems

Authors:

Andreas Kellner, Philips GmbH Forschungslaboratorien Aachen (Germany)

Volume 1, Page 185, Paper number 1796

Abstract:

The estimation of initial language models for new applications of spoken dialogue systems, without large task-specific training corpora, is becoming an increasingly important issue. This paper investigates two different approaches in which the task-specific knowledge contained in the language understanding grammar is exploited in order to generate n-gram language models for the speech recognizer. The first uses class-based language models whose word classes are automatically derived from the grammar. In the second approach, language models are estimated on artificial corpora created from the understanding grammar. The application of fill-up techniques allows the strengths of both approaches to be combined and leads to a language model which shows optimal performance regardless of the amount of training data available. Perplexities and word error rates are reported for two different domains.
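
The second approach can be pictured with a toy sketch: randomly expand a (hypothetical) understanding grammar to create an artificial corpus, then estimate an n-gram model on it. The grammar format, sampler, and bigram estimator below are assumptions, not the Philips implementation, and the fill-up combination step is omitted.

```python
import random
from collections import Counter

GRAMMAR = {  # toy CFG: each nonterminal maps to a list of alternative expansions
    "S": [["i", "want", "to", "fly", "CITY_PAIR"]],
    "CITY_PAIR": [["from", "CITY", "to", "CITY"]],
    "CITY": [["boston"], ["denver"], ["dallas"]],
}

def sample(symbol="S"):
    """Randomly expand a symbol; anything not in GRAMMAR is a terminal."""
    if symbol not in GRAMMAR:
        return [symbol]
    expansion = random.choice(GRAMMAR[symbol])
    return [w for sym in expansion for w in sample(sym)]

def train_bigram(sentences):
    """Maximum-likelihood bigram model over the artificial corpus."""
    bi, uni = Counter(), Counter()
    for s in sentences:
        toks = ["<s>"] + s + ["</s>"]
        uni.update(toks[:-1])
        bi.update(zip(toks, toks[1:]))
    return lambda w, prev: bi[(prev, w)] / uni[prev] if uni[prev] else 0.0

corpus = [sample() for _ in range(1000)]
p = train_bigram(corpus)   # e.g. p("want", "i") -> 1.0 in this toy grammar
```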

ic981796.pdf (From Postscript)




Maximum Likelihood and Discriminative Training of Direct Translation Models

Authors:

Kishore A Papineni, IBM (U.S.A.)
Salim E. Roukos, IBM (U.S.A.)
R T Ward, IBM (U.S.A.)

Volume 1, Page 189, Paper number 2195

Abstract:

We consider translating natural language sentences into a formal language using direct translation models built automatically from training data. Direct translation models have three components: an arbitrary prior conditional probability distribution, features that capture correlations between automatically determined key phrases or sets of words in both languages, and weights associated with these features. The features and weights are selected, using a training corpus of matched pairs of source and target language sentences, to maximize either the entropy or a new discrimination measure of the resulting conditional probability model. We report results in the Air Travel Information System (ATIS) domain and compare the two training methods.
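
A minimal sketch of a direct, log-linear translation model of this general shape (illustrative only; the prior, features, and decoding below are stand-ins for the paper's components):

```python
import math

def score(source, target, prior, features, weights):
    """Unnormalized log P(target | source): log prior + sum_i w_i * f_i."""
    return math.log(prior(target, source)) + sum(
        w * f(source, target) for f, w in zip(features, weights))

def decode(source, candidates, prior, features, weights):
    """Pick the formal-language candidate with the highest model score."""
    return max(candidates, key=lambda t: score(source, t, prior, features, weights))

def cooccurrence_feature(src_phrase, tgt_token):
    """Example feature: fires when a source key phrase and a target token co-occur."""
    return lambda s, t: 1.0 if src_phrase in s and tgt_token in t else 0.0
```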

ic982195.pdf (Scanned)




A Telephone Number Inquiry System with Dialog Structure

Authors:

Hsien-Chang Wang, National Cheng-Kung University (Taiwan)
Jhing-Fa Wang, National Cheng-Kung University (Taiwan)

Volume 1, Page 193, Paper number 1158

Abstract:

A Telephone Number Inquiry System (TNIS) provides a caller with the phone number he or she wants. Traditional systems require the caller to know the full name of the party; if the caller forgets the name, the system fails to retrieve the correct information. In this paper, we propose a novel TNIS with a dialog structure that allows more flexible inquiries: the caller may interact with the system to obtain a phone number by providing just the working or research area, the surname, or the title. The system takes telephone speech as input and, after generating the word sequence, performs maximum-likelihood key-feature matching against a knowledge base. If the necessary information cannot be derived, an interactive dialog manager is activated to resolve the caller's requirement. The experimental results show that our novel approach makes the system more natural.
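
A toy sketch of the inquiry flow described above, with all field names and the matching rule assumed for illustration (likelihood matching is approximated here by simple feature-count matching): key features from the recognized word string score knowledge-base entries, and an ambiguous best match triggers a follow-up question.

```python
KB = [  # toy knowledge base; a real system holds many more fields and entries
    {"surname": "wang", "title": "professor", "dept": "ee", "phone": "275-1234"},
    {"surname": "wang", "title": "student",   "dept": "cs", "phone": "275-5678"},
]

def match_score(entry, features):
    """Count how many of the provided key features an entry matches."""
    return sum(1 for k, v in features.items() if entry.get(k) == v)

def inquire(features):
    top = max(match_score(e, features) for e in KB)
    best = [e for e in KB if match_score(e, features) == top]
    if len(best) == 1:
        return best[0]["phone"]
    # ambiguous: a dialog manager would prompt for a disambiguating feature
    missing = next((k for k in ("dept", "title") if k not in features), "surname")
    return "ask caller for " + missing

print(inquire({"surname": "wang"}))                # -> ask caller for dept
print(inquire({"surname": "wang", "dept": "cs"}))  # -> 275-5678
```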

ic981158.pdf (From Postscript)




Spoken Dialog Systems for Children

Authors:

Alexandros Potamianos, AT&T Labs (U.S.A.)
Shrikanth Narayanan, AT&T Labs (U.S.A.)

Volume 1, Page 197, Paper number 1824

Abstract:

In this paper, we outline the main issues in designing interactive multimedia systems for children and propose a unified approach to system development spanning acoustic, linguistic, and dialog modeling. Acoustic, linguistic, and dialog data collected in a Wizard-of-Oz experiment from 160 children aged 8-14 playing an interactive computer game are analyzed, and child-specific modeling issues are presented. Age-dependent and modality-dependent dialog flow patterns are identified. Furthermore, extraneous speech patterns, linguistic variability, and disfluencies are investigated in spontaneous children's speech, and important new results are reported. Finally, baseline automatic speech recognition (ASR) results are presented for various tasks using simple acoustic and language models.

ic981824.pdf (From Postscript)




Using Markov Decision Process for Learning Dialogue Strategies

Authors:

Esther Levin, AT&T Labs - Research (U.S.A.)
Roberto Pieraccini, AT&T Labs - Research (U.S.A.)
Wieland Eckert, AT&T Labs - Research (U.S.A.)

Volume 1, Page 201, Paper number 2501

Abstract:

In this paper we introduce a stochastic model for dialogue systems based on the Markov decision process. Within this framework we show that the problem of dialogue strategy design can be stated as an optimization problem and solved by a variety of methods, including reinforcement learning. The advantages of this new paradigm include the objective evaluation of dialogue systems and their automatic design and adaptation. We show results on learning a dialogue strategy for an Air Travel Information System.
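
A minimal sketch of the framing, assuming a toy slot-filling dialogue: the dialogue is cast as an MDP and a strategy is learned with tabular Q-learning, one of the "variety of methods" the abstract mentions. The environment, rewards, and hyperparameters are illustrative assumptions.

```python
import random
from collections import defaultdict

class ToyDialogueEnv:
    """Two slots to fill; redundant questions cost a turn, closing early is bad."""
    actions = ["ask_origin", "ask_destination", "close"]

    def reset(self):
        self.filled = set()
        return frozenset(self.filled)

    def step(self, action):
        if action == "close":
            reward = 10 if len(self.filled) == 2 else -10
            return frozenset(self.filled), reward, True
        slot = action.split("_", 1)[1]
        reward = 1 if slot not in self.filled else -1
        self.filled.add(slot)
        return frozenset(self.filled), reward, False

def q_learning(env, episodes=5000, alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Q-learning of a dialogue strategy over the toy MDP."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            a = (random.choice(env.actions) if random.random() < eps
                 else max(env.actions, key=lambda act: Q[(s, act)]))
            s2, r, done = env.step(a)
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, act)] for act in env.actions)
                                  - Q[(s, a)])
            s = s2
    return Q

Q = q_learning(ToyDialogueEnv())  # learned policy: ask each slot once, then close
```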

ic982501.pdf (From Postscript)




An Implementation of Partial Parser in the Spoken Language Translator

Authors:

Nam-Yong Han, Electronics and Telecommunications Research Institute (Korea)
Un-Cheon Choi, Electronics and Telecommunications Research Institute (Korea)
Youngjik Lee, Electronics and Telecommunications Research Institute (Korea)

Volume 1, Page 205, Paper number 1891

Abstract:

We describe the characteristics of a partial parser and evaluate the output of a spoken language translator that uses concept-based grammars. The translator converts Korean utterances, produced by a spontaneous-speech recognizer, into English/Japanese utterances through a concept analysis approach. A parse succeeds only when all concepts below the highest top-level tokens are reduced to top-level tokens; when the parser stops at medium-level tokens, the parse would otherwise fail. In that case, the partial parser is run to analyze those medium-level tokens so that parsing does not fail. For the recognized data, the translation rate of meaning based on intention was 55.2% before applying the partial parser to the spoken translator, and 79.1% after.
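
The recovery idea can be sketched as follows (toy rules and token names assumed; not the ETRI parser): a bottom-up concept parser reduces token sequences toward top-level concepts, and when reduction stalls at medium-level tokens, a partial parse returns the reduced fragments instead of failing.

```python
RULES = {  # concept grammar: tuple of child concepts -> parent concept
    ("CITY", "TO", "CITY"): "ROUTE",       # medium-level reduction
    ("WANT", "ROUTE"): "REQUEST_FLIGHT",   # top-level reduction
}
TOP_LEVEL = {"REQUEST_FLIGHT"}

def reduce_once(tokens):
    """Apply the first matching rule anywhere in the token sequence."""
    for i in range(len(tokens)):
        for rhs, lhs in RULES.items():
            if tuple(tokens[i:i + len(rhs)]) == rhs:
                return tokens[:i] + [lhs] + tokens[i + len(rhs):]
    return None

def parse(tokens):
    """Reduce until stuck; report full parse only if all tokens are top-level."""
    while True:
        nxt = reduce_once(tokens)
        if nxt is None:
            break
        tokens = nxt
    if all(t in TOP_LEVEL for t in tokens):
        return tokens, True    # full parse
    return tokens, False       # partial parse: hand fragments to the translator

print(parse(["WANT", "CITY", "TO", "CITY"]))  # (['REQUEST_FLIGHT'], True)
print(parse(["CITY", "TO", "CITY", "DATE"]))  # (['ROUTE', 'DATE'], False)
```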

ic981891.pdf (From Postscript)
