ICASSP '98 Abstracts - Session SP21
SP21.1
Exploiting Both Local and Global Constraints for Multi-Span Statistical Language Modeling
J. Bellegarda (Apple Computer, USA)
A new framework is proposed to integrate the various constraints, both local and global, that are present in the language. Local constraints are captured via n-gram language modeling, while global constraints are taken into account through the use of latent semantic analysis. An integrative formulation is derived for the combination of these two paradigms, resulting in several families of multi-span language models for large vocabulary speech recognition. Because of the inherent complementarity in the two types of constraints, the performance of the integrated language models, as measured by perplexity, compares favorably with the corresponding n-gram performance.
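To make the combination concrete, here is a minimal sketch (not drawn from the paper) of how a local n-gram probability and a global LSA-style semantic score might be multiplied together and renormalized over the vocabulary. The cosine-based score, the function names, and the data structures are illustrative assumptions, not Bellegarda's actual derivation.

```python
import numpy as np

def multi_span_prob(word, history, doc_vec, ngram_prob, word_vecs, vocab):
    """Illustrative fusion of a local n-gram probability with a global
    LSA-style semantic score (a sketch, not the paper's formulation)."""
    def semantic_score(w):
        v = word_vecs[w]
        # cosine similarity between the word's LSA vector and the running
        # document (history) vector, shifted to lie in (0, 1]
        cos = np.dot(v, doc_vec) / (np.linalg.norm(v) * np.linalg.norm(doc_vec) + 1e-12)
        return (1.0 + cos) / 2.0

    # multiply local and global evidence, then renormalize over the vocabulary
    scores = {w: ngram_prob(w, history) * semantic_score(w) for w in vocab}
    total = sum(scores.values())
    return scores[word] / total
```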
SP21.2
Topic Adaptation for Language Modeling Using Unnormalized Exponential Models
S. Chen, K. Seymore, R. Rosenfeld (Carnegie Mellon University, USA)
In this paper, we present novel techniques for performing topic adaptation on an n-gram language model. Given training text labeled with topic information, we automatically identify the most relevant topics for new text. We adapt our language model toward these topics using an exponential model, by adjusting probabilities in our model to agree with those found in the topical subset of the training data. For efficiency, we do not normalize the model; that is, we do not require that the "probabilities" in the language model sum to 1. With these techniques, we were able to achieve a modest reduction in speech recognition word-error rate in the Broadcast News domain.
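A rough sketch of the adaptation idea follows, under the assumption that the adjustment can be approximated by scaling baseline probabilities with topic-versus-general unigram ratios and leaving the result unnormalized. The estimator, the tempering exponent alpha, and all names are illustrative, not the authors' exact exponential model.

```python
from collections import Counter

def topic_scaling_factors(topic_tokens, general_tokens, alpha=0.5, floor=1e-6):
    """Per-word scaling factors: how much more (or less) frequent a word is
    in the topical subset than in the full training corpus."""
    topic_counts, general_counts = Counter(topic_tokens), Counter(general_tokens)
    t_total, g_total = sum(topic_counts.values()), sum(general_counts.values())
    factors = {}
    for w in general_counts:
        p_topic = topic_counts[w] / t_total if t_total else 0.0
        p_general = general_counts[w] / g_total
        factors[w] = max(p_topic, floor) / max(p_general, floor)
    # alpha < 1 tempers the adaptation toward the baseline
    return {w: f ** alpha for w, f in factors.items()}

def adapted_score(w, history, base_prob, factors):
    # unnormalized: the scaled scores are used directly, without re-summing to 1
    return base_prob(w, history) * factors.get(w, 1.0)
```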
SP21.3
Shrinking Language Models by Robust Approximation
A. Buchsbaum (AT&T Labs, USA); R. Giancarlo (University of Palermo, Italy); J. Westbrook (AT&T Labs, USA)
We study the problem of reducing the size of a language model while preserving recognition performance (accuracy and speed). A successful approach has been to represent language models by weighted finite-state automata (WFAs). Analogues of classical automata determinization and minimization algorithms then provide a general method to produce smaller but equivalent WFAs. We extend this approach by introducing the notion of approximate determinization. We provide an algorithm that, when applied to language models for the North American Business task, achieves 25-35% size reduction compared to previous techniques, with negligible effects on recognition time and accuracy.
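The abstract does not spell out the approximate-determinization construction, so the fragment below only illustrates the underlying tolerance idea: two weights within epsilon of each other are treated as equal, which allows states to be collapsed that an exact algorithm would keep distinct. The WFA encoding and the merging routine are assumptions, and the pass shown is closer in spirit to approximate minimization than to determinization proper.

```python
def approx_equal(w1, w2, eps=0.05):
    """Treat two weights as interchangeable if they differ by less than eps."""
    return abs(w1 - w2) < eps

def merge_equivalent_states(transitions, eps=0.05):
    """Tiny illustration: collapse states whose outgoing arcs agree on labels,
    destinations, and (approximately) weights.
    `transitions` maps state -> sorted tuple of (label, next_state, weight)."""
    merged, representatives = {}, []
    for state, arcs in transitions.items():
        for rep in representatives:
            rep_arcs = transitions[rep]
            if len(arcs) == len(rep_arcs) and all(
                a[0] == b[0] and a[1] == b[1] and approx_equal(a[2], b[2], eps)
                for a, b in zip(arcs, rep_arcs)
            ):
                merged[state] = rep
                break
        else:
            representatives.append(state)
            merged[state] = state
    return merged
```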
SP21.4
A Lightweight Punctuation Annotation System for Speech
D. Beeferman, A. Berger, J. Lafferty (Carnegie Mellon University, USA)
This paper describes a lightweight method for the automatic insertion of intra-sentence punctuation into text. Despite the intuition that pauses in an acoustic stream are a positive indicator for some types of punctuation, this work will demonstrate the feasibility of a system which relies solely on lexical information. Besides its potential role in a speech recognition system, such a system could serve equally well in non-speech applications such as automatic grammar correction in a word processor and parsing of spoken text. After describing the design of a punctuation-restoration system, which relies on a trigram language model and a straightforward application of the Viterbi algorithm, we summarize results, both quantitative and subjective, of the performance and behavior of a prototype system.
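Since the abstract names its two ingredients, a trigram language model and a Viterbi search over insertion decisions, a compact sketch of that combination is given below. The candidate punctuation set, the state encoding, and the trigram_logprob interface are assumptions, not the authors' implementation.

```python
PUNCT = [",", ".", "?"]  # candidate marks that may be inserted after a word

def restore_punctuation(words, trigram_logprob):
    """Simplified Viterbi sketch: after each word, either emit one punctuation
    token or nothing, scoring every emitted token with a trigram LM via
    trigram_logprob(w, (u, v)) -> log P(w | u v)."""
    # each hypothesis keyed by its last two emitted tokens -> (log-prob, output)
    hyps = {("<s>", "<s>"): (0.0, [])}
    for w in words:
        new_hyps = {}
        for (u, v), (lp, seq) in hyps.items():
            # always emit the word itself
            base_lp = lp + trigram_logprob(w, (u, v))
            candidates = [(base_lp, (v, w), seq + [w])]
            # optionally emit one punctuation mark right after the word
            for p in PUNCT:
                candidates.append((base_lp + trigram_logprob(p, (v, w)),
                                   (w, p), seq + [w, p]))
            for clp, ctx, cseq in candidates:
                if ctx not in new_hyps or clp > new_hyps[ctx][0]:
                    new_hyps[ctx] = (clp, cseq)
        hyps = new_hyps
    return max(hyps.values(), key=lambda h: h[0])[1]
```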
SP21.5
Sub-Sentence Discourse Models for Conversational Speech Recognition
K. Ma, G. Zavaliagkos, M. Meteer (GTE/BBN Technologies, USA)
According to discourse theories in linguistics, conversational utterances possess an informational structure that partitions each sentence into two portions: a "given" portion and a "new" one. In this work, we explore this idea by building sub-sentence discourse language models for conversational speech recognition. The internal sentence structure is captured in statistical language modeling by training multiple n-gram models with the Expectation-Maximization algorithm on the Switchboard corpus. The resulting model contributes to a 30% reduction in language model perplexity and a small gain in word error rate.
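As a hedged stand-in for the EM training mentioned above, the sketch below fits a mixture of unigram components to word segments: the E-step assigns each segment fractionally to a component and the M-step re-estimates the components from those weights. The paper trains n-gram components on Switchboard, so this unigram version only shows the shape of the procedure; all names and constants are assumptions.

```python
from collections import Counter
import math

def em_mixture_unigrams(segments, n_components=2, n_iters=20, smooth=0.1):
    """Fit a mixture of unigram components to a list of word segments."""
    vocab = {w for seg in segments for w in seg}
    V = len(vocab)
    # crude initialization: round-robin hard assignment of segments
    comp_counts = [Counter() for _ in range(n_components)]
    for i, seg in enumerate(segments):
        comp_counts[i % n_components].update(seg)
    priors = [1.0 / n_components] * n_components

    def log_prob(seg, counts):
        total = sum(counts.values())
        return sum(math.log((counts[w] + smooth) / (total + smooth * V)) for w in seg)

    for _ in range(n_iters):
        # E-step: posterior responsibility of each component for each segment
        resps = []
        for seg in segments:
            logs = [math.log(priors[k]) + log_prob(seg, comp_counts[k])
                    for k in range(n_components)]
            m = max(logs)
            ps = [math.exp(l - m) for l in logs]
            z = sum(ps)
            resps.append([p / z for p in ps])
        # M-step: re-estimate component counts and priors from responsibilities
        comp_counts = [Counter() for _ in range(n_components)]
        for seg, r in zip(segments, resps):
            for k in range(n_components):
                for w in seg:
                    comp_counts[k][w] += r[k]
        priors = [sum(r[k] for r in resps) / len(segments) for k in range(n_components)]
    return priors, comp_counts
```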
SP21.6
Two-Step Generation of Variable-Word-Length Language Model Integrating Local and Global Constraints
S. Matsunaga, S. Sagayama (NTT Human Interface Laboratories, Japan)
This paper proposes two-step generation of a variable-length class-based language model that integrates local and global constraints. In the first step, an initial class set is recursively designed using local constraints; the word elements of each class are determined using Kullback divergence and total entropy. In the second step, the word classes are recursively recreated and the words are iteratively recreated, by grouping consecutive words into longer units and by splitting the initial classes into finer classes. These second-step operations are carried out selectively, taking local and global constraints into account on the basis of a minimum-entropy criterion. Experiments showed that the proposed initial class set yields lower perplexity than conventional part-of-speech classes, and that the variable-word-length model lowers perplexity further. Furthermore, this two-step approach to model generation greatly reduces training time.
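A simplified illustration of the minimum-entropy criterion for grouping consecutive words into longer units is sketched below. It uses plain unigram entropy normalized by the original word count rather than the paper's class-based measures; the function names and the acceptance test are assumptions.

```python
from collections import Counter
import math

def unigram_entropy_per_word(tokens, n_words):
    """Unigram cross-entropy normalized by the number of *original* words,
    so that models over different tokenizations remain comparable."""
    counts = Counter(tokens)
    total = len(tokens)
    ll = sum(c * math.log2(c / total) for c in counts.values())
    return -ll / n_words

def merge_if_entropy_drops(tokens, pair, n_words):
    """Replace every occurrence of the consecutive pair with one longer unit,
    keeping the merge only if the normalized entropy decreases."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    before = unigram_entropy_per_word(tokens, n_words)
    after = unigram_entropy_per_word(merged, n_words)
    return (merged, after) if after < before else (tokens, before)
```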
SP21.7
Language-Model Optimization by Mapping of Corpora
D. Klakow (Philips GmbH Forschungslaboratorien, Germany)
It is questionable whether words are really the best basic units for the estimation of stochastic language models: grouping frequent word sequences into phrases can improve language models. More generally, we have investigated various coding schemes for a corpus. In this paper, they are applied to optimize the perplexity of n-gram language models. In tests on two large corpora (WSJ and BNA), the bigram perplexity was reduced by up to 29%. Furthermore, this approach makes it possible to tackle the problem of an open vocabulary with no unknown word.
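One special case of such a corpus coding, grouping frequent adjacent word pairs into phrase tokens, might look like the sketch below. The greedy selection, the thresholds, and the underscore joining are assumptions; the paper explores more general coding schemes. Note that when comparing codings, perplexity should be normalized by the number of original words rather than the smaller number of phrase tokens.

```python
from collections import Counter

def map_corpus_to_phrases(tokens, n_phrases=100, min_count=10):
    """One greedy recoding pass: pick the most frequent adjacent word pairs and
    rewrite the corpus with each selected pair joined into a single phrase token."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    phrases = {p for p, c in pair_counts.most_common(n_phrases) if c >= min_count}
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in phrases:
            out.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out
```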
SP21.8
Just-In-Time Language Modelling
A. Berger, R. Miller (Carnegie Mellon University, USA)
Traditional approaches to language modelling have relied on a fixed corpus of text to inform the parameters of a probability distribution over word sequences. Increasing the corpus size often leads to better-performing language models, but no matter how large, the corpus is a static entity, unable to reflect information about events which postdate it. In these pages we introduce an online paradigm which interleaves the estimation and application of a language model. We present a Bayesian approach to online language modelling, in which the marginal probabilities of a static trigram model are dynamically updated to match the topic being dictated to the system. We also describe the architecture of a prototype we have implemented which uses the World Wide Web (WWW) as a source of information, and the results of some initial proof-of-concept experiments.
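To give the flavor of such an online update (not the authors' exact Bayesian formulation), the sketch below rescales each trigram probability by the ratio of a word's frequency in freshly retrieved text to its frequency under the static model, tempered by an exponent beta, and then renormalizes within each context. Every name and constant here is an assumption.

```python
from collections import Counter

def jit_adapt(trigram, retrieved_tokens, beta=0.7, floor=1e-6):
    """Adapt a static trigram table toward the unigram statistics of freshly
    retrieved text. `trigram` maps (u, v) -> {w: P(w | u, v)}."""
    counts = Counter(retrieved_tokens)
    total = sum(counts.values())
    # rough static unigram marginals implied by the trigram table
    static = Counter()
    for dist in trigram.values():
        for w, p in dist.items():
            static[w] += p
    static_total = sum(static.values())

    adapted = {}
    for ctx, dist in trigram.items():
        scaled = {}
        for w, p in dist.items():
            p_new = max(counts[w] / total if total else 0.0, floor)
            p_old = max(static[w] / static_total, floor)
            scaled[w] = p * (p_new / p_old) ** beta
        z = sum(scaled.values())
        adapted[ctx] = {w: s / z for w, s in scaled.items()}
    return adapted
```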