Authors:
Hajime Tsukada, ATR Interpreting Telecommunications Research Laboratories (Japan)
Hirofumi Yamamoto, ATR Interpreting Telecommunications Research Laboratories (Japan)
Toshiyuki Takezawa, ATR Interpreting Telecommunications Research Laboratories (Japan)
Yoshinori Sagisaka, ATR Interpreting Telecommunications Research Laboratories (Japan)
Paper number 485
Abstract:
We propose a novel recognition method for generating an accurate grammatical
word-graph that allows grammatical deviations. Our method uses both an
n-gram and a grammar-based statistical language model and aligns utterances
with the grammar by adding deviation information during the search
process. Our experiments confirm that the word-graph obtained by the
proposed method is superior to the one obtained using only the n-gram
at the same word-graph density. In addition, our recognition method
can search an enormous hypothesis space more efficiently than the
conventional word-graph-based search method.
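To make the combination concrete, here is a minimal Python sketch of the kind of
scoring such a search could use: a partial hypothesis is rated by both an n-gram
and a grammar-based statistical model, and hypothesized grammatical deviations
incur a penalty. The model interfaces, weights, and penalty value are
illustrative assumptions, not the authors' formulation.

```python
# Hedged sketch: combined scoring of a partial hypothesis during word-graph
# expansion.  `ngram_lm` and `grammar_lm` are assumed objects with the
# interfaces used below; weights and penalty are illustrative values.

def hypothesis_score(words, ngram_lm, grammar_lm,
                     ngram_weight=0.5, grammar_weight=0.5,
                     deviation_penalty=-5.0):
    """Combined log score of a partial hypothesis in the word-graph search."""
    ngram_logprob = ngram_lm.logprob(words)                  # assumed interface
    grammar_logprob, n_deviations = grammar_lm.align(words)  # assumed interface
    return (ngram_weight * ngram_logprob
            + grammar_weight * grammar_logprob
            + n_deviations * deviation_penalty)
```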
Authors:
Norimichi Yodo, Graduate School of Information Science, Nara Institute of Science and Technology (Japan)
Kiyohiro Shikano, Graduate School of Information Science, Nara Institute of Science and Technology (Japan)
Satoshi Nakamura, Graduate School of Information Science, Nara Institute of Science and Technology (Japan)
Paper number 716
Abstract:
In this paper we propose an algorithm for reducing the size of back-off
N-gram models while affecting their performance less than the traditional
cutoff method. The algorithm is based on Maximum Likelihood (ML)
estimation and realizes an N-gram language model with a given number
of N-gram probability parameters that minimizes the training-set perplexity.
To confirm the effectiveness of our algorithm, we apply it to trigram
and bigram models and carry out experiments in terms of perplexity and
word error rate in a dictation system.
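As a rough illustration of the pruning idea (not the paper's exact ML
algorithm), one can rank the explicitly stored N-gram probabilities by the
training-set log-likelihood that would be lost if each were replaced by its
back-off estimate, and keep only a target number of parameters. The sketch
below assumes precomputed counts and probabilities and ignores the
renormalization of back-off weights.

```python
import math

def prune_to_size(counts, prob, backoff_prob, target_size):
    """counts: dict mapping an n-gram tuple to its training-set count.
    prob / backoff_prob: explicit and backed-off probability per n-gram.
    Returns the set of n-grams whose explicit probabilities are kept.
    Simplification: back-off weights are not recomputed after pruning."""
    loss = {}
    for ngram, c in counts.items():
        # training-set log-likelihood lost if this explicit entry is dropped
        loss[ngram] = c * (math.log(prob[ngram]) - math.log(backoff_prob[ngram]))
    kept = sorted(counts, key=lambda g: loss[g], reverse=True)[:target_size]
    return set(kept)
```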
Authors:
Ulla Uebler, Bavarian Research Center for Knowledge Based Systems (Germany)
Heinrich Niemann, Bavarian Research Center for Knowledge Based Systems (Germany)
Paper number 338
Abstract:
It is well known that good language models improve the performance of speech
recognition. One requirement for the estimation of language models
is a sufficient amount of text from the application domain. If not
all words of the domain occur in the training texts for the language model,
a way must be found to model these words adequately. In this paper
we report on a new approach to building word classes for language modeling
in the bilingual (German, Italian) SpeeData project. The main idea
is to classify words according to their morphological properties.
To this end, we decompose words into their morphological units and put
the words with the same prefix or suffix into the same class. Since
morphological decomposition is error prone for unknown word stems,
we also decomposed words by counting beginnings and endings of different
lengths and used these subunits like prefixes and suffixes. The advantage
of this approach is that it can be carried out automatically. We achieved
a reduction in error rate from 9.83% to 5.77% for morphological decomposition
and to 5.99% for automatic decomposition, which can be performed without
any morphological knowledge.
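A minimal sketch of the knowledge-free variant follows, assuming words are
simply grouped by their final k characters and each group forms one word class;
the toy vocabulary and the choice k = 3 are illustrative assumptions.

```python
from collections import defaultdict

def classes_by_ending(vocabulary, k=3):
    """Group words into classes by their last k characters (no morphology)."""
    classes = defaultdict(list)
    for word in vocabulary:
        ending = word[-k:] if len(word) > k else word
        classes[ending].append(word)
    return classes

# Toy example: German and Italian word forms sharing endings fall together.
vocab = ["gehen", "stehen", "sehen", "casa", "rosa", "strada"]
for ending, members in classes_by_ending(vocab).items():
    print(ending, members)
```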
Authors:
Imed Zitouni, LORIA / INRIA-Lorraine (France)
Kamel Smaïli, LORIA / INRIA-Lorraine (France)
Jean-Paul Haton, LORIA / INRIA-Lorraine (France)
Sabine Deligne, ATR-ITL (Japan)
Frédéric Bimbot, IRISA-CNRS/INRIA (France)
Paper number 498
Abstract:
In this paper, we introduce the concept of the Multiclass for language
modeling and compare it to the Polyclass model. The originality
of the Multiclass is its capability to parse a string of class tags
into variable-length independent sequences. A few experimental tests
were carried out on an automatically labeled class corpus extracted from
the French "Le Monde" word corpus. This corpus contains 43
million words. In our experiments, Multiclass models outperform first-order
Polyclass models but are slightly outperformed by second-order Polyclass models.
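The sketch below illustrates the segmentation idea behind such a model,
assuming a multigram-style Viterbi search over variable-length tag sequences;
the sequence-probability table and maximum sequence length are hypothetical
inputs, not the authors' trained model.

```python
import math

def best_segmentation(tags, seq_prob, max_len=3):
    """Return the highest-probability parse of a tag string into independent
    variable-length sequences.  Assumes seq_prob contains at least every
    single tag, so some segmentation always exists."""
    n = len(tags)
    best = [(-math.inf, None)] * (n + 1)   # (log prob, backpointer)
    best[0] = (0.0, None)
    for i in range(1, n + 1):
        for length in range(1, min(max_len, i) + 1):
            seq = tuple(tags[i - length:i])
            if seq in seq_prob and best[i - length][0] > -math.inf:
                score = best[i - length][0] + math.log(seq_prob[seq])
                if score > best[i][0]:
                    best[i] = (score, i - length)
    # backtrace the best segmentation
    segments, i = [], n
    while i > 0:
        j = best[i][1]
        segments.append(tuple(tags[j:i]))
        i = j
    return list(reversed(segments)), best[n][0]
```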
Authors:
Dietrich Klakow, Philips Research Laboratories (Germany)
Paper number 522
Abstract:
Combining different language models is an important task, and linear
interpolation is the established method for doing so. We present a new method
called log-linear interpolation (LLI), which combines the simplicity
of linear interpolation with essential parts of maximum-entropy models
by linearly interpolating the scores of the different models. The first
series of experiments focuses on adaptation: unigram, bigram and trigram
models trained on NAB are combined with unigram and bigram models trained
on a small domain-specific corpus. LLI compares favorably with linear
interpolation. The second series combines bigram and distance-bigram
models; here, relative improvements are larger (~20% in perplexity),
and this task seems to be the ideal application of LLI. To further scrutinize
the method, frequent pairs of words are first joined into phrases and
then bigram and distance-bigram models are combined by LLI. This experiment
yields perplexities just 2.5% above the original trigram perplexity.
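For reference, a minimal sketch of log-linear interpolation: the component
probabilities are raised to weights lambda_i, multiplied, and renormalized over
the vocabulary. The component models, weights, and vocabulary passed in are
illustrative assumptions; the models are assumed to return nonzero
probabilities.

```python
import math

def lli_prob(word, history, models, lambdas, vocabulary):
    """Log-linear interpolation: P(w|h) proportional to
    prod_i P_i(w|h)**lambda_i, renormalized over the vocabulary.
    models: list of functions mapping (word, history) -> probability."""
    def unnorm(w):
        return math.exp(sum(lam * math.log(m(w, history))
                            for m, lam in zip(models, lambdas)))
    z = sum(unnorm(w) for w in vocabulary)   # normalization term Z(h)
    return unnorm(word) / z
```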
Authors:
Philip Clarkson, Cambridge University Engineering Department (U.K.)
Tony Robinson, Cambridge University Engineering Department (U.K.)
Paper number 962
Abstract:
Adaptive language models have consistently been shown to lead to a
significant reduction in language model perplexity compared to the
equivalent static trigram model on many data sets. When these language
models have been applied to speech recognition, however, they have
seldom resulted in a corresponding reduction in word error rate. This
paper will investigate some of the possible reasons for this apparent
discrepancy, and will explore the circumstances under which adaptive
language models can be useful. We will concentrate on cache-based and
mixture-based models and their use on the Broadcast News task.
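As a hedged illustration of the cache-based case, the sketch below interpolates
a static trigram probability with a distribution estimated from the most recent
words; the interpolation weight and cache handling are assumptions, not values
from the paper.

```python
from collections import Counter

def cache_adapted_prob(word, history, trigram_prob, recent_words,
                       cache_weight=0.1):
    """Linear interpolation of a static trigram model with a unigram cache
    built from recently observed words.  trigram_prob is an assumed function
    mapping (word, history) -> probability."""
    cache = Counter(recent_words)
    total = sum(cache.values())
    p_cache = cache[word] / total if total else 0.0
    return cache_weight * p_cache + (1.0 - cache_weight) * trigram_prob(word, history)
```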