ICSLP'98 Proceedings
A Speechreading Aid Based on Phonetic ASR
Authors:
Paul Duchnowski, Massachusetts Institute of Technology (USA)
Page (NA), Paper number 589
Abstract: Manual Cued Speech (MCS) is an effective method of communication for the deaf and hearing-impaired. We describe our work on assessing the feasibility of automatic determination and presentation of cues without intervention by the speaker. The conclusions of this study are applied to the design and implementation of a prototype automatic cueing system that uses HMM-based automatic speech recognition software to identify the cues in real time. We also describe the features of our cue display that enhance its effectiveness, such as the style of the cue images and the timing of their transitions. Our experiments show that keyword reception by experienced MCS users improves significantly with the use of our system (66%) relative to speechreading alone (35%) on low-context sentences.
0589_01.MPG | The manually cued sentence "The old castle passed from the duke to the king." | File type: Video | File format: MPEG | Tech. description: 30 frames/second, 320 x 240 frame size | Creating application: mpeg_encode | Creating OS: Linux
0589_02.MPG | Automatically cued (discrete cues) sentence "The loss and two wins were fair games." | File type: Video | File format: MPEG | Tech. description: 30 frames/second, 320 x 240 frame size | Creating application: mpeg_encode | Creating OS: Linux
0589_03.MPG | Automatically cued (dynamic cues) sentence "The kite may fly on this windy day." | File type: Video | File format: MPEG | Tech. description: 30 frames/second, 320 x 240 frame size | Creating application: mpeg_encode | Creating OS: Linux
Jan Nouza, Technical University of Liberec (Czech Republic)
The paper describes a new version of a visual feedback aid for speech training. The aid is a PC-based speech processing system that visualizes the incoming signal and its most relevant parameters (such as volume, pitch, timing, and spectrum) and compares them to utterances recorded by reference speakers. The goal is to help the person being trained identify the most severe deviations in his or her pronunciation. Learning through visual comparison is supported by displaying multiple reference utterances, attaching phonetic labels to both the reference speakers' and the trainee's speech, indicating the areas with larger deviations in any of the displayed features, and offering a simple tutoring assessment of the trainee's attempts. The system was aimed primarily at hearing-impaired users, but its features also make it well suited to learning and practicing foreign-language pronunciation. The latter possibility was verified in an experiment in which a group of subjects tried to learn the pronunciation of a few words in a foreign language exotic to them.
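As a rough illustration of the comparison step described above, here is a minimal sketch that computes one of the displayed parameters (a short-time volume contour in dB) and flags frames where the trainee deviates from a reference by more than a threshold. The frame size, the threshold, and the linear-interpolation alignment are all assumptions; the actual system presumably uses proper time alignment and compares more parameters than volume:

```python
import numpy as np

def volume_contour(signal, sr, frame_ms=20):
    """Short-time RMS level in dB, one of the parameters the aid displays.
    'signal' is a 1-D NumPy array of samples; frame size is an assumption."""
    n = int(sr * frame_ms / 1000)
    frames = signal[: len(signal) // n * n].reshape(-1, n)
    rms = np.sqrt((frames ** 2).mean(axis=1)) + 1e-12   # avoid log of zero
    return 20 * np.log10(rms)

def deviation_regions(trainee_db, reference_db, threshold_db=6.0):
    """Frame-wise deviation after stretching the reference contour to the
    trainee's length; linear interpolation stands in for the real system's
    (unspecified) time alignment."""
    ref = np.interp(np.linspace(0, 1, len(trainee_db)),
                    np.linspace(0, 1, len(reference_db)), reference_db)
    dev = np.abs(trainee_db - ref)
    return dev, dev > threshold_db   # boolean mask marks frames to highlight
```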
Ichiro Maruyama, Telecommunications Advancement Organization (TAO) of Japan (Japan)
Yoshiharu Abe, Mitsubishi Electric Corporation / TAO (Japan)
Takahiro Wakao, TAO (Japan)
Eiji Sawamura, TAO (Japan)
Terumasa Ehara, NHK Science and Technical Research Laboratories / TAO (Japan)
Katsuhiko Shirai, Waseda University / TAO (Japan)
This paper describes a method of automatically synchronizing TV news speech with its captions. A news item consists of sentences and often has a corresponding computerized text that can be used as a caption. We have developed a new word spotter based on phonetic HMMs. In this word spotter, the word sequences before and after a synchronization point are concatenated, and scoring is based on the state at the synchronization point. The detection accuracy of the proposed method is shown to be superior to that of a conventional method using no word sequence pairs. Model configurations are presented for detection failures, an announcer's misstatements and restatements, and erroneous transcriptions. A 100% detection rate with no false alarms is achieved by combining multiple word sequence pairs in series, and a 100% detection rate with few false alarms is obtained by using the model configurations for misstatements or erroneous transcriptions.
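One ingredient of this approach, forming the concatenated before/after word sequences around each candidate synchronization point from the caption text, can be sketched as follows. The two-word context width is a made-up parameter, and the HMM scoring at the synchronization-point state is not shown:

```python
def sync_pair_targets(caption_words, context=2):
    """For each candidate synchronization point (a word boundary), build the
    concatenated before/after word sequence the spotter would search for.
    The context width of 2 words per side is an assumed parameter."""
    targets = []
    for k in range(context, len(caption_words) - context + 1):
        before = caption_words[k - context:k]   # words just before the point
        after = caption_words[k:k + context]    # words just after the point
        targets.append((k, before + after))     # sync point lies between them
    return targets

caption = "the old castle passed from the duke to the king".split()
for k, words in sync_pair_targets(caption)[:3]:
    print(k, " ".join(words))
```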
Aileen K. Ho, Department of Psychology, Monash University (Australia)
John L. Bradshaw, Department of Psychology, Monash University (Australia)
Robert Iansek, Geriatric Research Unit, Kingston Centre (Australia)
Robin J. Alfredson, Department of Mechanical Engineering, Monash University (Australia)
This study investigated the ability to regulate speech volume in a group of six volume-impaired idiopathic Parkinson's disease (PD) patients and their age- and sex-matched controls. Participants were asked to read under three conditions: as softly as possible, as loudly as possible, and at normal volume (no volume instruction). The stimuli consisted of a target sentence, easily read in one breath, embedded in a short paragraph of text. Mean volume and volume over time (intensity slope) for the target sentence were obtained. For all three conditions, the patients' speech volume was lower than the controls' by a constant amount. Patients also showed a significantly greater reduction of volume (a negative intensity slope) towards the end of the sentence, especially in the loud condition. The findings indicate that patients with Parkinsonian hypophonic dysarthria have significant difficulty maintaining speech volume, in addition to generating inadequate overall speech volume.
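The two measures reported here, mean volume and intensity slope, can be sketched as follows. The frame size is an assumption, and a real analysis would exclude pauses and use a calibrated intensity level:

```python
import numpy as np

def mean_volume_and_slope(signal, sr, frame_ms=50):
    """Mean level (dB) and intensity slope (dB per second) for one sentence.
    'signal' is a 1-D NumPy array of samples; the frame size is an
    assumption, not the study's analysis setting."""
    n = int(sr * frame_ms / 1000)
    frames = signal[: len(signal) // n * n].reshape(-1, n)
    level_db = 20 * np.log10(np.sqrt((frames ** 2).mean(axis=1)) + 1e-12)
    t = np.arange(len(level_db)) * frame_ms / 1000.0    # frame times, seconds
    slope = np.polyfit(t, level_db, 1)[0]               # least-squares slope
    return level_db.mean(), slope
```

A negative returned slope corresponds to the volume decay toward the end of the sentence reported for the patients.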