ICASSP '98 Abstracts - SP21


 
SP21.1

   
Exploiting Both Local and Global Constraints for Multi-Span Statistical Language Modeling
J. Bellegarda  (Apple Computer, USA)
A new framework is proposed to integrate the various constraints, both local and global, that are present in the language. Local constraints are captured via n-gram language modeling, while global constraints are taken into account through the use of latent semantic analysis. An integrative formulation is derived for the combination of these two paradigms, resulting in several families of multi-span language models for large vocabulary speech recognition. Because of the inherent complementarity in the two types of constraints, the performance of the integrated language models, as measured by perplexity, compares favorably with the corresponding n-gram performance.
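The abstract does not give the combination formula, so the sketch below only illustrates the general idea: a local n-gram estimate and a global, document-level (LSA-style) estimate are blended by simple linear interpolation. The function name, the interpolation weight, and the example probabilities are hypothetical; the paper derives a tighter integration than this.

```python
# Illustrative only: combine a local n-gram probability with a global,
# document-level probability by linear interpolation. The paper derives a
# more principled integration of the two; this just shows the general idea.

def multispan_prob(word, ngram_prob, lsa_prob, lam=0.7):
    """Interpolate a local n-gram estimate with a global (LSA-style) estimate.

    ngram_prob : P(word | recent history), from an n-gram model
    lsa_prob   : P(word | whole document so far), from a semantic model
    lam        : interpolation weight on the local model (hypothetical value)
    """
    return lam * ngram_prob + (1.0 - lam) * lsa_prob

# Example: a word that the n-gram model finds unlikely but the document
# topic strongly supports still receives a reasonable combined probability.
print(multispan_prob("cardiology", ngram_prob=1e-5, lsa_prob=2e-3))
```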
 
SP21.2

   
Topic Adaptation for Language Modeling Using Unnormalized Exponential Models
S. Chen, K. Seymore, R. Rosenfeld  (Carnegie Mellon University, USA)
In this paper, we present novel techniques for performing topic adaptation on an n-gram language model. Given training text labeled with topic information, we automatically identify the most relevant topics for new text. We adapt our language model toward these topics using an exponential model, by adjusting probabilities in our model to agree with those found in the topical subset of the training data. For efficiency, we do not normalize the model; that is, we do not require that the "probabilities" in the language model sum to 1. With these techniques, we were able to achieve a modest reduction in speech recognition word-error rate in the Broadcast News domain.
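As a rough illustration of an unnormalized adaptation step (not the authors' exponential model), the sketch below scales a baseline n-gram probability by the ratio of a word's topic-specific unigram frequency to its background frequency and deliberately skips renormalization. The exponent `alpha`, the smoothing constants, and the example numbers are placeholders.

```python
# A minimal sketch of topic adaptation with an unnormalized correction:
# the baseline n-gram probability is scaled toward the topical subset's
# statistics, and the result is deliberately left unnormalized (the scores
# need not sum to 1). All quantities below are hypothetical.

def adapted_score(word, history_prob, topic_unigram, background_unigram, alpha=0.5):
    """Return an unnormalized, topic-adapted score for `word`.

    history_prob       : baseline P(word | history) from the n-gram model
    topic_unigram      : relative frequency of `word` in the topical subset
    background_unigram : relative frequency of `word` in the full corpus
    """
    ratio = (topic_unigram + 1e-9) / (background_unigram + 1e-9)
    return history_prob * (ratio ** alpha)

# A topical word ("touchdown" under a sports topic) is boosted relative to
# its baseline probability; an off-topic word is left roughly unchanged.
print(adapted_score("touchdown", history_prob=1e-4,
                    topic_unigram=5e-4, background_unigram=5e-5))
```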
 
SP21.3

   
Shrinking Language Models by Robust Approximation
A. Buchsbaum  (AT&T Labs, USA);   R. Giancarlo  (University of Palermo, Italy);   J. Westbrook  (AT&T Labs, USA)
We study the problem of reducing the size of a language model while preserving recognition performance (accuracy and speed). A successful approach has been to represent language models by weighted finite-state automata (WFAs). Analogues of classical automata determinization and minimization algorithms then provide a general method to produce smaller but equivalent WFAs. We extend this approach by introducing the notion of approximate determinization. We provide an algorithm that, when applied to language models for the North American Business task, achieves 25-35% size reduction compared to previous techniques, with negligible effects on recognition time and accuracy.
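The paper's approximate determinization algorithm is not reproduced here; the toy sketch below only conveys the underlying intuition that states with nearly identical outgoing weighted arcs can be merged, shrinking the automaton at the cost of a small change in the weights it assigns. The merging criterion and the tolerance are invented for the example.

```python
# Illustration only, not the paper's algorithm: on a toy weighted automaton,
# collapse states whose outgoing arcs match on labels and destinations and
# differ in weight by at most `tol` (a simplistic, hypothetical criterion).

def merge_similar_states(arcs, tol=0.05):
    """arcs: {state: {label: (next_state, weight)}} -> merged automaton."""
    def signature(state):
        # Round weights to the tolerance so that near-equal arcs compare equal.
        return tuple(sorted((lab, dst, round(w / tol))
                            for lab, (dst, w) in arcs[state].items()))

    representative = {}   # signature -> canonical state
    rename = {}           # old state -> canonical state
    for state in arcs:
        sig = signature(state)
        representative.setdefault(sig, state)
        rename[state] = representative[sig]

    merged = {}
    for state, out in arcs.items():
        canon = rename[state]
        if canon in merged:
            continue  # keep the representative's arcs; merged states agree within tol
        merged[canon] = {lab: (rename[dst], w) for lab, (dst, w) in out.items()}
    return merged

toy = {
    0: {"the": (1, 0.51), "a": (2, 0.49)},
    1: {"cat": (3, 1.0)},
    2: {"cat": (3, 1.0)},
    3: {},
    4: {"the": (1, 0.50), "a": (2, 0.50)},  # nearly identical to state 0
}
print(f"{len(merge_similar_states(toy))} states after merging (was {len(toy)})")
```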
 
SP21.4

   
A Lightweight Punctuation Annotation System for Speech
D. Beeferman, A. Berger, J. Lafferty  (Carnegie Mellon University, USA)
This paper describes a lightweight method for the automatic insertion of intra-sentence punctuation into text. Despite the intuition that pauses in an acoustic stream are a positive indicator for some types of punctuation, this work will demonstrate the feasibility of a system which relies solely on lexical information. Besides its potential role in a speech recognition system, such a system could serve equally well in non-speech applications such as automatic grammar correction in a word processor and parsing of spoken text. After describing the design of a punctuation-restoration system, which relies on a trigram language model and a straightforward application of the Viterbi algorithm, we summarize results, both quantitative and subjective, of the performance and behavior of a prototype system.
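A hedged sketch of the general recipe described above: treat each inter-word position as a choice among punctuation symbols (here just a comma or nothing), score every augmented token sequence with a trigram model, and pick the best sequence by dynamic programming over the last two emitted tokens. The toy trigram scorer is a stand-in for a trained model.

```python
import math

PUNCT_CHOICES = ("", ",")  # "" means: insert nothing at this position

def trigram_logprob(w1, w2, w3):
    """Hypothetical trigram scorer; a real system would use a trained model."""
    favoured = {("you", "know", ","), ("know", ",", "that")}
    return math.log(0.5) if (w1, w2, w3) in favoured else math.log(0.01)

def restore_punctuation(words):
    # Viterbi search: the state after each position is the last two emitted
    # tokens, which is all a trigram model needs to score what follows.
    beams = {("<s>", words[0]): (0.0, [words[0]])}
    for nxt in words[1:]:
        new_beams = {}
        for (p2, p1), (score, seq) in beams.items():
            for punct in PUNCT_CHOICES:
                toks = ([punct] if punct else []) + [nxt]
                s, a, b = score, p2, p1
                for t in toks:
                    s += trigram_logprob(a, b, t)
                    a, b = b, t
                if (a, b) not in new_beams or s > new_beams[(a, b)][0]:
                    new_beams[(a, b)] = (s, seq + toks)
        beams = new_beams
    return max(beams.values())[1]

# With the toy scores above, a comma is restored after "know".
print(" ".join(restore_punctuation(["you", "know", "that", "is", "true"])))
```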
 
SP21.5

   
Sub-Sentence Discourse Models for Conversational Speech Recognition
K. Ma, G. Zavaliagkos, M. Meteer  (GTE/BBN Technologies, USA)
According to discourse theories in linguistics, conversational utterances possess an informational structure that partitions each sentence into two portions: the "given" and the "new". In this work, we explore this idea by building sub-sentence discourse language models for conversational speech recognition. The internal sentence structure is captured in statistical language modeling by training multiple n-gram models using the Expectation-Maximization algorithm on the Switchboard corpus. The resulting model contributes to a 30% reduction in language model perplexity and a small gain in word error rate.
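The sketch below shows the flavour of such EM training on a toy scale, not the authors' system: each word token is softly assigned to one of two hidden sub-sentence components, and two unigram distributions plus a mixture weight are re-estimated from the soft counts. The corpus, the unigram order, and the iteration count are all illustrative.

```python
# A minimal EM sketch (unigram components instead of the paper's n-grams,
# a two-sentence toy corpus instead of Switchboard).

from collections import defaultdict

corpus = [["well", "i", "think", "the", "movie", "was", "great"],
          ["yeah", "i", "know", "the", "ending", "was", "odd"]]
vocab = sorted({w for sent in corpus for w in sent})

# Two slightly different unigram components and a mixture weight.
comp = [{w: 1.0 / len(vocab) for w in vocab} for _ in range(2)]
comp[1][vocab[0]] *= 1.5  # break symmetry so EM has somewhere to go
weight = [0.5, 0.5]

for _ in range(10):  # EM iterations
    counts = [defaultdict(float), defaultdict(float)]
    totals = [0.0, 0.0]
    for sent in corpus:
        for w in sent:
            # E-step: posterior responsibility of each component for this token.
            joint = [weight[k] * comp[k][w] for k in range(2)]
            z = sum(joint)
            for k in range(2):
                r = joint[k] / z
                counts[k][w] += r
                totals[k] += r
    # M-step: re-estimate unigram probabilities and mixture weights.
    for k in range(2):
        comp[k] = {w: counts[k][w] / totals[k] for w in vocab}
    weight = [t / sum(totals) for t in totals]

print("mixture weights:", [round(w, 3) for w in weight])
```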
 
SP21.6

   
Two-Step Generation of Variable-Word-Length Language Model Integrating Local and Global Constraints
S. Matsunaga, S. Sagayama  (NTT Human Interface Laboratories, Japan)
This paper proposes two-step generation of a variable-length class-based language model that integrates local and global constraints. In the first step, an initial class set is recursively designed using local constraints; the word members of each class are determined using the Kullback divergence and total entropy. In the second step, the word classes are recursively recreated and the words iteratively regrouped, by grouping consecutive words to generate longer units and by splitting the initial classes into finer classes. These second-step operations are carried out selectively, taking both local and global constraints into account on the basis of a minimum-entropy criterion. Experiments showed that the perplexity obtained with the proposed initial class set is lower than that obtained with conventional part-of-speech classes, and the perplexity of the variable-word-length model is consequently lower still. Furthermore, this two-step approach to model generation greatly reduces the training time.
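One ingredient mentioned above, choosing class membership with the Kullback divergence, can be illustrated as follows; the toy context distributions and class names are made up, and the paper's full recursive procedure is considerably more involved.

```python
# A hedged sketch: assign a word to the class whose following-word
# distribution it diverges from the least, measured by KL divergence.

import math

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) over a shared set of context events."""
    return sum(pw * math.log((pw + eps) / (q.get(ctx, 0.0) + eps))
               for ctx, pw in p.items() if pw > 0.0)

def assign_to_class(word_context_dist, class_context_dists):
    """Pick the class whose context distribution is closest to the word's."""
    return min(class_context_dists,
               key=lambda c: kl_divergence(word_context_dist, class_context_dists[c]))

# Toy example: distributions over the *next* word, for one word and two classes.
word_dist = {"the": 0.6, "a": 0.3, "quickly": 0.1}
classes = {
    "DET-LIKE-CONTEXT": {"the": 0.55, "a": 0.35, "quickly": 0.10},
    "ADV-LIKE-CONTEXT": {"the": 0.10, "a": 0.10, "quickly": 0.80},
}
print(assign_to_class(word_dist, classes))
```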
 
SP21.7

   
Language-Model Optimization by Mapping of Corpora
D. Klakow  (Philips GmbH Forschungslaboratorien, Germany)
It is questionable whether words are really the best basic units for the estimation of stochastic language models: grouping frequent word sequences into phrases can improve language models. More generally, we have investigated various coding schemes for a corpus. In this paper, they are applied to optimize the perplexity of n-gram language models. In tests on two large corpora (WSJ and BNA), the bigram perplexity was reduced by up to 29%. Furthermore, this approach makes it possible to tackle the problem of an open vocabulary with no unknown word.
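The sketch below shows one elementary recoding step of the kind alluded to above: the most frequent adjacent word pair is fused into a single phrase unit. A real system would iterate this (or apply a different coding scheme) and keep a change only if it lowers held-out perplexity; that part is not shown.

```python
# Toy corpus recoding: find the most frequent adjacent pair and rewrite the
# corpus with that pair joined into one phrase token.

from collections import Counter

def most_frequent_pair(corpus):
    pairs = Counter()
    for sent in corpus:
        pairs.update(zip(sent, sent[1:]))
    return pairs.most_common(1)[0][0]

def merge_pair(corpus, pair):
    """Rewrite each sentence with occurrences of `pair` fused into one token."""
    merged_tok = "_".join(pair)
    out = []
    for sent in corpus:
        new_sent, i = [], 0
        while i < len(sent):
            if i + 1 < len(sent) and (sent[i], sent[i + 1]) == pair:
                new_sent.append(merged_tok)
                i += 2
            else:
                new_sent.append(sent[i])
                i += 1
        out.append(new_sent)
    return out

corpus = [["new", "york", "stock", "exchange"],
          ["the", "new", "york", "times"],
          ["a", "new", "approach"]]
pair = most_frequent_pair(corpus)   # ("new", "york")
print(merge_pair(corpus, pair))     # "new_york" becomes a single unit
```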
 
SP21.8

   
Just-In-Time Language Modelling
A. Berger, R. Miller  (Carnegie Mellon University, USA)
Traditional approaches to language modelling have relied on a fixed corpus of text to inform the parameters of a probability distribution over word sequences. Increasing the corpus size often leads to better-performing language models, but no matter how large, the corpus is a static entity, unable to reflect information about events which postdate it. In these pages we introduce an online paradigm which interleaves the estimation and application of a language model. We present a Bayesian approach to online language modelling, in which the marginal probabilities of a static trigram model are dynamically updated to match the topic being dictated to the system. We also describe the architecture of a prototype we have implemented which uses the World Wide Web (WWW) as a source of information, and the results of some initial proof-of-concept experiments.
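As a loose illustration of the adaptation idea (not the paper's Bayesian derivation), the sketch below mixes a static unigram table with a unigram estimated from freshly retrieved text, shifting probability toward the topic currently being dictated. The static table, the retrieved text, and the mixing weight `beta` are placeholders.

```python
# Hypothetical just-in-time adaptation of unigram marginals: statistics from
# text fetched moments ago (the paper uses WWW documents) rescale a static
# model without retraining it.

from collections import Counter

static_unigram = {"the": 0.05, "game": 0.001, "election": 0.001, "ball": 0.0008}

def jit_unigram(word, retrieved_text, beta=0.3):
    """Mix the static unigram with one estimated from just-retrieved text."""
    counts = Counter(retrieved_text)
    dynamic = counts[word] / max(1, sum(counts.values()))
    return (1.0 - beta) * static_unigram.get(word, 1e-6) + beta * dynamic

# Freshly retrieved sports text shifts probability toward "game" and "ball".
fresh = "the game went to overtime and the ball never left their half".split()
for w in ("game", "election"):
    print(w, round(jit_unigram(w, fresh), 5))
```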
 
