ICASSP '98
Abstract - MMSP3
MMSP3.1
Sign Language Communication between Japanese-Korean and Japanese-Portuguese using CG Animation
Y. Aoki,
R. Mitsumori,
J. Li (Hokkaido University, Japan);
A. Burger (Germany)
In this paper we propose sign language communication between different languages, such as Japanese-Korean and Japanese-Portuguese, using CG animation of sign language based on the intelligent image communication method. For this purpose, sign language animation is produced from gesture data or from text data expressing sign language. In the production process of the CG animation, MATLAB and the LIFO language are used: MATLAB is useful for three-dimensional signal processing of gestures and for displaying the sign language animation, while LIFO, a descendant of the LISP and FORTH language families, was developed and used to produce live CG animations, resulting in a high-speed interactive system for designing and displaying sign language animations. A simple experiment was conducted to translate Japanese sign language into Korean and Portuguese sign languages using the developed CG animation system.
MMSP3.2
Probing the Relationship between Qualitative and Quantitative Performance Measures for Voice-Enabled Telecommunication Services
S. Narayanan,
M. Subramaniam,
B. Stern,
B. Hollister,
C. Lin (AT&T Labs, USA)
The relationship between objective speech recognition performance measures and perceived performance is analyzed and modeled using data obtained from a voice-dialing trial with 798 AT&T customers. The ability of these models to predict user perception and overall demand for such voice-enabled services is discussed.
MMSP3.3
Signal Processing for Recognition of Human Frustration
R. Fernandez,
R. Picard (MIT Media Lab, USA)
In this work, inspired by human-machine interaction applications and the potential use that human-computer interfaces can make of knowledge of a user's affective state, we investigate the problem of sensing and recognizing typical affective experiences that arise when people communicate with computers. In particular, we address the problem of detecting "frustration" in human-computer interfaces. We first sense human biophysiological correlates of internal affective states, then stochastically model the biological time series with hidden Markov models to obtain user-dependent recognition systems that learn affective patterns from a set of training data. Labeling criteria for classifying the data are discussed, and generalization of the results to unobserved data is evaluated. Significant recognition results (better than chance) are reported for 21 of 24 subjects.
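The HMM-based modeling described above is typically scored with the standard forward algorithm, which evaluates how well a trained model explains an observed time series (one model per affect class, with the highest-likelihood model winning). The following is a generic sketch of a discrete-observation HMM likelihood, not the authors' system; all parameter names are illustrative.

```python
import numpy as np

def forward_log_likelihood(obs, start_p, trans_p, emit_p):
    """Forward algorithm: log P(observation sequence | discrete HMM).

    obs     : sequence of observation symbol indices
    start_p : (n_states,) initial state probabilities
    trans_p : (n_states, n_states) transition matrix
    emit_p  : (n_states, n_symbols) emission matrix
    """
    # Initialize with the first observation's emission probabilities.
    alpha = start_p * emit_p[:, obs[0]]
    # Propagate forward: transition, then weight by the next emission.
    for symbol in obs[1:]:
        alpha = (alpha @ trans_p) * emit_p[:, symbol]
    # Total probability of the sequence is the sum over final states.
    return np.log(alpha.sum())
```

In a recognition setup, this likelihood would be computed under each class's model and the class with the largest value selected.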
MMSP3.4
Quick Audio Retrieval Using Active Search
G. Smith,
H. Murase,
K. Kashino (NTT Basic Research Laboratories, Japan)
This paper discusses a method to search quickly through broadcast audio data to detect and locate known sounds from reference templates, based on the active search algorithm and histogram modeling of zero-crossing features. Active search reduces the number of candidate matches between the reference and test templates by up to a factor of 36 compared to exhaustive search, while remaining optimal. Computation is further reduced by using computationally inexpensive zero-crossing features. The method is robust against additive white noise down to a 20 dB signal-to-noise ratio, and against digitization noise.
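Zero-crossing features like those the method relies on are cheap to compute. Below is a minimal, generic sketch of a per-frame zero-crossing rate and its histogram over a signal; the frame length and bin count (`frame_len`, `n_bins`) are illustrative assumptions, not values from the paper.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(frame)
    return np.mean(signs[1:] != signs[:-1])

def zcr_histogram(signal, frame_len=256, n_bins=16):
    """Normalized histogram of per-frame zero-crossing rates."""
    n_frames = len(signal) // frame_len
    rates = [zero_crossing_rate(signal[i * frame_len:(i + 1) * frame_len])
             for i in range(n_frames)]
    hist, _ = np.histogram(rates, bins=n_bins, range=(0.0, 1.0))
    return hist / max(n_frames, 1)
```

Such a histogram serves as a compact template: matching then reduces to comparing histograms rather than raw waveforms, which is what makes pruning schemes like active search effective.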
MMSP3.5
Retrieval of Broadcast News Documents with the THISL System
D. Abberley,
S. Renals (Sheffield University, UK);
G. Cook (Cambridge University, UK)
This paper describes a spoken document retrieval system, combining the Abbot large vocabulary continuous speech recognition (LVCSR) system developed by Cambridge University, Sheffield University and SoftSound, and the PRISE information retrieval engine developed by NIST. The system was constructed to enable us to participate in the TREC 6 Spoken Document Retrieval experimental evaluation. Our key aims in this work were to produce a complete system for the SDR task, to investigate the effect of a word error rate of 30-50% on retrieval performance and to investigate the integration of LVCSR and word spotting in a retrieval task.