Speech Processing for the Speech-Impaired and Hearing-Impaired 2


A Speechreading Aid Based on Phonetic ASR

Authors:

Paul Duchnowski, Massachusetts Institute of Technology (USA)
Louis Braida, Massachusetts Institute of Technology (USA)
Maroula Bratakos, Massachusetts Institute of Technology (USA)
David Lum, Massachusetts Institute of Technology (USA)
Matthew Sexton, Massachusetts Institute of Technology (USA)
Jean Krause, Massachusetts Institute of Technology (USA)

Paper number 589

Abstract:

Manual Cued Speech (MCS) is an effective method of communication for the deaf and hearing-impaired. We describe our work on assessing the feasibility of automatically determining and presenting cues without intervention by the speaker. The conclusions of this study are applied to the design and implementation of a prototype automatic cueing system that uses HMM-based automatic speech recognition software to identify the cues in real time. We also describe the features of our cue display that enhance its effectiveness, such as the style of the cue images and the timing of their transitions. Our experiments show that keyword reception on low-context sentences by experienced MCS users improves significantly with our system (66%) relative to speechreading alone (35%).
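
The cue-identification step described above can be illustrated with a short sketch that pairs ASR phoneme labels into consonant-vowel units and maps them to (handshape, position) cue codes. The grouping tables, ARPAbet-style symbols, and function names below are hypothetical stand-ins; they are not the actual Manual Cued Speech chart or the authors' real-time system.

# Hypothetical sketch of the cue-assignment step: ASR phoneme labels are
# paired into consonant-vowel units and mapped to (handshape, position)
# cue codes.  The group tables are illustrative, NOT the real MCS chart.

HANDSHAPE = {  # consonant -> handshape group (illustrative assignment)
    "p": 1, "d": 1, "zh": 1,
    "k": 2, "v": 2, "dh": 2, "z": 2,
    "h": 3, "s": 3, "r": 3,
    "b": 4, "n": 4, "wh": 4,
    "t": 5, "m": 5, "f": 5,
    "l": 6, "sh": 6, "w": 6,
    "g": 7, "j": 7, "th": 7,
    "ng": 8, "y": 8, "ch": 8,
}
POSITION = {  # vowel -> hand position (illustrative assignment)
    "ae": "side", "ih": "side", "ah": "side",
    "iy": "mouth", "er": "mouth", "ay": "mouth",
    "ao": "chin", "eh": "chin", "uw": "chin",
    "aa": "throat", "uh": "throat", "ey": "throat",
}
NEUTRAL_SHAPE, NEUTRAL_POSITION = 5, "side"  # used when no partner phone exists

def phonemes_to_cues(phones):
    """Greedily pair each consonant with the vowel that follows it; lone
    consonants or vowels are cued with a neutral partner."""
    cues, i = [], 0
    while i < len(phones):
        p = phones[i]
        if p in HANDSHAPE:
            nxt = phones[i + 1] if i + 1 < len(phones) else None
            if nxt in POSITION:            # full CV pair -> one cue
                cues.append((HANDSHAPE[p], POSITION[nxt]))
                i += 2
                continue
            cues.append((HANDSHAPE[p], NEUTRAL_POSITION))
        else:                              # vowel with no preceding consonant
            cues.append((NEUTRAL_SHAPE, POSITION.get(p, NEUTRAL_POSITION)))
        i += 1
    return cues

print(phonemes_to_cues(["dh", "ah", "k", "ay", "t"]))  # rough "the kite"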

SL980589.PDF (From Author) SL980589.PDF (Rasterized)

0589_01.MPG
The manually cued sentence "The old castle passed from the duke to the king."
File type: Video File
Format: Video File: MPEG
Tech. description: 30 frames/second, 320 x 240 frame size
Creating Application: mpeg_encode
Creating OS: Linux
0589_02.MPG
Automatically cued (discrete cues) sentence "The loss and two wins were fair games."
File type: Video File
Format: Video File: MPEG
Tech. description: 30 frames/second, 320 x 240 frame size
Creating Application: mpeg_encode
Creating OS: Linux
0589_03.MPG
Automatically cued (dynamic cues) sentence "The kite may fly on this windy day."
File type: Video File
Format: Video File: MPEG
Tech. description: 30 frames/second, 320 x 240 frame size
Creating Application: mpeg_encode
Creating OS: Linux


Training Speech through Visual Feedback Patterns

Authors:

Jan Nouza, Technical University of Liberec (Czech Republic)

Paper number 1139

Abstract:

The paper describes a new version of a visual feedback aid for speech training. The aid is a PC-based speech processing system that visualizes the incoming signal and its most relevant parameters (such as volume, pitch, timing, and spectrum) and compares them to utterances recorded by reference speakers. The goal is to help the trainee identify the most severe deviations in his or her pronunciation. Learning through visual comparison is supported by displaying multiple reference utterances, attaching phonetic labels to both the reference speakers' and the trainee's speech, highlighting the areas with larger deviations in any of the displayed features, and offering a simple tutoring assessment of the trainee's attempts. The system was aimed primarily at hearing-impaired users, but its features also make it well suited to learning and practicing foreign language pronunciation. The latter possibility was verified in an experiment in which a group of subjects tried to learn the pronunciation of a few words in a foreign language exotic to them.
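
The comparison step described above can be illustrated with a small sketch: compute a frame-level volume track for the trainee's utterance and a reference utterance, normalize their lengths, and flag the frames that diverge. This is a minimal sketch under assumed parameters (25 ms frames at 16 kHz, a 6 dB threshold, linear time normalization), not the feature set, alignment, or scoring used in the actual aid.

import numpy as np

def volume_track_db(signal, frame_len=400, hop=160):
    """RMS energy per frame, in dB (400-sample frames, 160-sample hop,
    i.e. 25 ms / 10 ms at 16 kHz)."""
    n = 1 + max(0, (len(signal) - frame_len) // hop)
    frames = np.stack([signal[i*hop : i*hop + frame_len] for i in range(n)])
    rms = np.sqrt(np.mean(frames.astype(float) ** 2, axis=1))
    return 20.0 * np.log10(np.maximum(rms, 1e-10))

def deviation_frames(trainee_db, reference_db, threshold_db=6.0):
    """Linearly stretch the reference track to the trainee's length and
    return the indices of frames whose volume differs by > threshold_db."""
    x = np.linspace(0.0, 1.0, len(trainee_db))
    xp = np.linspace(0.0, 1.0, len(reference_db))
    ref = np.interp(x, xp, reference_db)
    return np.flatnonzero(np.abs(trainee_db - ref) > threshold_db)

# Example with synthetic signals (white noise with different envelopes):
rng = np.random.default_rng(0)
trainee = rng.normal(0, 0.2, 16000) * np.linspace(1.0, 0.2, 16000)
reference = rng.normal(0, 0.2, 20000)
bad = deviation_frames(volume_track_db(trainee), volume_track_db(reference))
print(f"{len(bad)} frames deviate by more than 6 dB")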

SL981139.PDF (From Author) SL981139.PDF (Rasterized)


Word Sequence Pair Spotting for Synchronization of Speech and Text in Production of Closed-Caption TV Programs for the Hearing Impaired

Authors:

Ichiro Maruyama, Telecommunications Advancement Organization (TAO) of Japan (Japan)
Yoshiharu Abe, Mitsubishi Electric Corporation / TAO (Japan)
Takahiro Wakao, TAO (Japan)
Eiji Sawamura, TAO (Japan)
Terumasa Ehara, NHK Science and Technical Research Laboratories / TAO (Japan)
Katsuhiko Shirai, Waseda University / TAO (Japan)

Paper number 1113

Abstract:

This paper describes a method for automatically synchronizing TV news speech with its captions. A news item consists of sentences and often has a corresponding computerized text, which can be used as a caption. We have developed a new phonetic HMM-based word spotter in which the word sequences before and after a synchronization point are concatenated, and scoring is based on the state at the synchronization point. The detection accuracy of the proposed method is shown to be superior to that of a conventional method using no word sequence pairs. Model configurations are presented for handling detection failures, an announcer's misstatements and restatements, and erroneous transcriptions. A 100% detection rate with no false alarms is achieved by combining multiple word sequence pairs in series, and a 100% detection rate with few false alarms is obtained by using the model configurations for misstatements or erroneous transcriptions.
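
The word-sequence-pair idea can be illustrated in a simplified form. The sketch below replaces the paper's HMM scoring at the synchronization-point state with exact word matching against an ASR transcript; the two-words-per-side setting and all function names are assumptions for illustration only.

def sequence_pairs(sentences, k=2):
    """For each boundary between caption sentences, build one 'word
    sequence pair': the last k words of the sentence before the boundary
    concatenated with the first k words of the sentence after it."""
    pairs = []
    for a, b in zip(sentences, sentences[1:]):
        left, right = a.split()[-k:], b.split()[:k]
        pairs.append((left + right, len(left)))  # (words, sync offset)
    return pairs

def spot(transcript, pair):
    """Return the transcript index of the synchronization point (the first
    word after the boundary), or None if the pair is not found."""
    words, offset = pair
    for i in range(len(transcript) - len(words) + 1):
        if transcript[i:i + len(words)] == words:
            return i + offset
    return None

captions = ["the market rallied sharply today", "in other news rain is expected"]
transcript = "the market rallied sharply today in other news rain is expected".split()
for pair in sequence_pairs(captions):
    print(spot(transcript, pair))  # -> 5, the index of "in"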

SL981113.PDF (From Author) SL981113.PDF (Rasterized)


Volume Regulation in Parkinsonian Speech

Authors:

Aileen K. Ho, Department of Psychology, Monash University (Australia)
John L. Bradshaw, Department of Psychology, Monash University (Australia)
Robert Iansek, Geriatric Research Unit, Kingston Centre (Australia)
Robin J. Alfredson, Department of Mechanical Engineering, Monash University (Australia)

Paper number 10

Abstract:

This study investigated the ability to regulate speech volume in a group of six volume-impaired idiopathic Parkinson's disease (PD) patients and their age- and sex-matched controls. Participants were asked to read under three conditions: as softly as possible, as loudly as possible, and at normal volume (no volume instruction). The stimuli consisted of a target sentence, easily read in one breath, embedded in a short paragraph of text. Mean volume and volume over time (intensity slope) for the target sentence were obtained. For all three conditions, patients' speech volume was lower than controls' by a constant amount. Patients also showed a significantly greater reduction of volume (negative intensity slope) towards the end of the sentence, especially in the loud condition. The findings indicate that patients with Parkinsonian hypophonic dysarthria have significant difficulty maintaining speech volume, in addition to generating inadequate overall speech volume.
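
The two measures used above, mean volume and intensity slope over the target sentence, are straightforward to compute. The sketch below shows one plausible procedure (frame-level RMS in dB followed by a least-squares line fit over time); the frame sizes and dB reference are assumptions, and the paper's exact measurement procedure may differ.

import numpy as np

def intensity_measures(sentence, sr=16000, frame_ms=25, hop_ms=10):
    """Mean volume (dB) and intensity slope (dB/s) over one sentence,
    via frame RMS and a least-squares linear fit."""
    flen, hop = int(sr * frame_ms / 1000), int(sr * hop_ms / 1000)
    n = 1 + max(0, (len(sentence) - flen) // hop)
    frames = np.stack([sentence[i*hop : i*hop + flen] for i in range(n)])
    db = 20 * np.log10(np.maximum(np.sqrt(np.mean(frames**2, axis=1)), 1e-10))
    t = np.arange(n) * hop / sr                   # frame times in seconds
    slope, _ = np.polyfit(t, db, 1)               # dB per second
    return db.mean(), slope

# A decaying-envelope test signal yields a negative slope, as reported for
# the patients' "loud" condition:
rng = np.random.default_rng(1)
sig = rng.normal(0, 0.1, 48000) * np.exp(-np.linspace(0, 2, 48000))
mean_db, slope = intensity_measures(sig)
print(f"mean volume {mean_db:.1f} dB, slope {slope:.1f} dB/s")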

SL980010.PDF (From Author) SL980010.PDF (Rasterized)
