Chair: T. Chen, Carnegie Mellon University (U.S.A.)
Yoshinao Aoki, Hokkaido University (Japan)
Ricardo Mitsumori, Hokkaido University (Japan)
Jincan Li, Hokkaido University (Japan)
Alexander Burger, Langweid (Germany)
In this paper we propose a method for sign language communication between different languages, such as Japanese-Korean and Japanese-Portuguese, using CG animation of sign language based on the intelligent image communication method. For this purpose, sign language animation is produced using gesture data or text data expressing sign language. In the production process of the CG animation of sign language, MATLAB and the LIFO language are used: MATLAB is useful for three-dimensional signal processing of gestures and for displaying animation of sign language, while the LIFO language, a descendant of the LISP and FORTH language families, was developed and used to produce live CG animations, resulting in a high-speed interactive system for designing and displaying sign language animations. A simple experiment was conducted to translate Japanese sign language into Korean and Portuguese sign languages using the developed CG animation system.
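The abstract does not give implementation details of the text-mediated translation step. As a minimal sketch, assuming sign utterances are represented as sequences of gloss labels and that a bilingual gloss dictionary is available (both assumptions for illustration, not taken from the paper):

    # Hypothetical gloss-level translation step for sign language CG animation.
    # The gloss representation and dictionary below are illustrative assumptions.

    JSL_TO_KSL = {            # hypothetical Japanese-to-Korean gloss dictionary
        "HELLO": "ANNYEONG",
        "THANK_YOU": "GOMAPDA",
    }

    def translate_glosses(glosses, dictionary):
        """Map source-language sign glosses to target-language glosses."""
        return [dictionary.get(g, g) for g in glosses]  # pass unknowns through

    def animate(glosses):
        """Stand-in for the CG animation renderer described in the paper."""
        for g in glosses:
            print(f"rendering sign: {g}")

    animate(translate_glosses(["HELLO", "THANK_YOU"], JSL_TO_KSL))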
Shrikanth Narayanan, AT&T Labs (U.S.A.)
Mani Subramaniam, AT&T Labs (U.S.A.)
Benjamin Stern, AT&T Labs (U.S.A.)
Barbara Hollister, AT&T Labs (U.S.A.)
Chih-mei Lin, AT&T Labs (U.S.A.)
The relationship between objective speech recognition performance measures and perceived performance is analyzed and modeled using data obtained from a voice-dialing trial with 798 AT&T customers. The ability of these models to predict user perception and overall demand for such voice-enabled services is discussed.
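The abstract does not specify the model family. One plausible, minimal sketch is a logistic regression from an objective measure to a binary satisfaction judgment; the feature, model choice, and data below are illustrative assumptions, not the authors' method:

    # Sketch: relating an objective recognition measure to perceived
    # performance with logistic regression. Data is synthetic.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 798                                  # trial size from the abstract
    error_rate = rng.uniform(0.0, 0.5, n)    # hypothetical objective measure
    # Synthetic ground truth: satisfaction falls as errors rise.
    satisfied = (rng.random(n) > error_rate * 1.5).astype(int)

    model = LogisticRegression().fit(error_rate.reshape(-1, 1), satisfied)
    print("P(satisfied | 10% errors) =", model.predict_proba([[0.10]])[0, 1])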
Raul Fernandez, MIT Media Lab (U.S.A.)
Rosalind W. Picard, MIT Media Lab (U.S.A.)
In this work, motivated by human-machine interaction and the potential use that human-computer interfaces can make of knowledge about the affective state of a user, we investigate the problem of sensing and recognizing typical affective experiences that arise when people communicate with computers. In particular, we address the problem of detecting "frustration" in human-computer interfaces. We first sense biophysiological correlates of internal affective states, then stochastically model the resulting biological time series with Hidden Markov Models to obtain user-dependent recognition systems that learn affective patterns from a set of training data. Labeling criteria used to classify the data are discussed, and generalization of the results to a set of unobserved data is evaluated. Recognition results significantly better than chance are reported for 21 of 24 subjects.
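A minimal sketch of the HMM modeling step, assuming one Gaussian HMM per affect class over pre-extracted physiological feature vectors; the hmmlearn library, the two-feature layout, and the synthetic data are assumptions for illustration, not the authors' implementation:

    # Sketch: per-class Gaussian HMMs over physiological time series,
    # classified by maximum log-likelihood. Features are hypothetical
    # (e.g. skin conductance, muscle tension); data is synthetic.
    import numpy as np
    from hmmlearn.hmm import GaussianHMM

    rng = np.random.default_rng(0)
    frustrated = rng.normal(1.0, 0.3, size=(200, 2))   # training series
    neutral = rng.normal(0.0, 0.3, size=(200, 2))

    models = {}
    for label, series in [("frustrated", frustrated), ("neutral", neutral)]:
        m = GaussianHMM(n_components=3, covariance_type="diag", n_iter=50)
        m.fit(series)                  # one HMM per affective class
        models[label] = m

    test = rng.normal(1.0, 0.3, size=(50, 2))          # unseen segment
    print(max(models, key=lambda k: models[k].score(test)))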
Gavin A Smith, NTT Basic Research Laboratories (Japan)
Hiroshi Murase, NTT Basic Research Laboratories (Japan)
Kunio Kashino, NTT Basic Research Laboratories (Japan)
This paper discusses a method for searching quickly through broadcast audio data to detect and locate known sounds using reference templates, based on the active search algorithm and histogram modeling of zero-crossing features. Active search reduces the number of candidate matches between the reference and test templates by up to 36 times compared with exhaustive search, while still remaining optimal. Computation is further reduced by using computationally inexpensive zero-crossing features. The method is robust against added white noise down to a 20 dB signal-to-noise ratio, and against digitization noise.
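A minimal sketch of histogram-based active search over a zero-crossing feature stream; the window length, threshold, and quantization are illustrative assumptions. The safe skip width follows from the fact that shifting a window of N frames by one frame changes a normalized histogram intersection by at most 1/N:

    # Sketch: active search with zero-crossing histograms.
    import math
    import numpy as np

    def zc_histogram(frames, n_bins=8):
        """Quantize per-frame zero-crossing counts into a normalized histogram."""
        bins = np.minimum(frames, n_bins - 1)
        hist = np.bincount(bins, minlength=n_bins).astype(float)
        return hist / hist.sum()

    def active_search(reference, stream, window_len, theta=0.9):
        """Return window start positions whose histogram matches the reference."""
        ref_hist = zc_histogram(reference)
        hits, pos = [], 0
        while pos + window_len <= len(stream):
            sim = np.minimum(ref_hist, zc_histogram(stream[pos:pos + window_len])).sum()
            if sim >= theta:
                hits.append(pos)
                pos += 1
            else:
                # Safe skip: similarity cannot reach theta any sooner.
                pos += max(1, math.ceil((theta - sim) * window_len))
        return hits

    stream = np.random.default_rng(0).integers(0, 8, size=1000)
    reference = stream[400:440]                        # plant a known sound
    print(active_search(reference, stream, window_len=40))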
David C Abberley, Sheffield University (U.K.)
Steve J Renals, Sheffield University (U.K.)
Gary D Cook, Cambridge University (U.K.)
This paper describes a spoken document retrieval (SDR) system combining the Abbot large vocabulary continuous speech recognition (LVCSR) system, developed by Cambridge University, Sheffield University and SoftSound, with the PRISE information retrieval engine developed by NIST. The system was constructed to enable us to participate in the TREC-6 Spoken Document Retrieval experimental evaluation. Our key aims in this work were to produce a complete system for the SDR task, to investigate the effect of a word error rate of 30-50% on retrieval performance, and to investigate the integration of LVCSR and word spotting in a retrieval task.
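As an illustrative sketch of retrieval over errorful recognizer output, TF-IDF with cosine ranking stands in for the PRISE engine, which is not reproduced here; the transcripts and query are invented, with deliberate recognition errors to show how a high word error rate can cause missed term matches:

    # Sketch: ranking hypothetical ASR transcripts against a text query.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    transcripts = [                  # hypothetical LVCSR output with errors
        "the prime minister spoke about the economy and taxis",   # "taxes"
        "storm warnings were issued along the cost",              # "coast"
        "the football match ended in a penalty shoot out",
    ]

    vectorizer = TfidfVectorizer()
    index = vectorizer.fit_transform(transcripts)

    # "taxes" fails to match the misrecognized "taxis", but the remaining
    # query terms still rank the first document highest.
    query = vectorizer.transform(["prime minister economy taxes"])
    scores = cosine_similarity(query, index)[0]
    for i in scores.argsort()[::-1]:
        print(int(i), round(float(scores[i]), 3))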