Multilingual Perception and Recognition 1

ICSLP'98 Proceedings

Bilingual and Dialectal Adaptation and Retraining

Authors:

Ulla Uebler, Bavarian Research Center for Knowledge Based Systems (Germany)
Michael Schüßler, Bavarian Research Center for Knowledge Based Systems (Germany)
Heinrich Niemann, Bavarian Research Center for Knowledge Based Systems (Germany)

Paper number 337

Abstract:

In this paper, we report our investigations into the use of adaptation and retraining in our bilingual (Italian, German) and multidialectal recognition system. Our approach to bilingual speech recognition is to treat the two languages as one, which is best suited to a task where Italian and German natives speak both languages, resulting in a variety of accents and dialects. We performed adaptation on single speakers and on speaker groups built from combinations of spoken and native language. Furthermore, we performed retraining on partitions of the adaptation or training data. Our experiments led to an error rate reduction in all cases: compared to the baseline system, we achieved an overall improvement of 14%, 12 to 14% and 7% for speaker adaptation, speaker group adaptation and retraining, respectively. We also found, among other things, that performance for Italian is rather stable between adaptation and retraining, while for German adaptation outperforms retraining by far.



Language Independent and Language Adaptive Large Vocabulary Speech Recognition

Authors:

Tanja Schultz, Interactive Systems Laboratories (Germany)
Alex Waibel, Interactive Systems Laboratories (USA)

Paper number 577

Abstract:

This paper describes the design of a multilingual speech recognizer using an LVCSR dictation database collected under the GlobalPhone project. This project at the University of Karlsruhe investigates LVCSR systems in 15 languages of the world, namely Arabic, Chinese, Croatian, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish, Swedish, Tamil, and Turkish. Based on a global phoneme set, we built different multilingual speech recognition systems for five of the 15 languages. Context-dependent phoneme models are created in a data-driven way by introducing questions about languages and language groups into our polyphone clustering procedure. We apply the resulting multilingual models to unseen languages and present several recognition results in language-independent and language-adaptive setups. The results indicate that the method of parameter sharing should be chosen depending on whether multilingual or crosslingual speech recognition is intended.
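
As an illustration of how a question about language can compete with ordinary context questions during polyphone clustering, the following is a minimal sketch and not the authors' implementation. It assumes an entropy-based split criterion over discrete acoustic classes; the names `Sample`, `weighted_entropy` and `split_gain`, the class labels `c1`/`c2`, and all counts are hypothetical toy values invented for the example.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple

# One polyphone instance of a shared center phone (hypothetical toy format):
# (language, left context, right context, counts over discrete acoustic classes)
Sample = Tuple[str, str, str, Counter]


def weighted_entropy(counters: List[Counter]) -> float:
    """Entropy (in bits) of the pooled counts, weighted by the pooled total."""
    total: Counter = Counter()
    for c in counters:
        total.update(c)
    n = sum(total.values())
    if n == 0:
        return 0.0
    h = -sum((v / n) * math.log2(v / n) for v in total.values() if v > 0)
    return n * h


def split_gain(samples: List[Sample], question: Callable[[Sample], bool]) -> float:
    """Reduction in weighted entropy when the node is split by a yes/no question."""
    yes = [s[3] for s in samples if question(s)]
    no = [s[3] for s in samples if not question(s)]
    parent = [s[3] for s in samples]
    return weighted_entropy(parent) - weighted_entropy(yes) - weighted_entropy(no)


# Invented toy data: the acoustic classes differ by language, not by left context.
samples: List[Sample] = [
    ("German", "b", "t", Counter({"c1": 8, "c2": 2})),
    ("German", "m", "t", Counter({"c1": 9, "c2": 1})),
    ("Spanish", "b", "t", Counter({"c1": 2, "c2": 8})),
    ("Spanish", "m", "t", Counter({"c1": 1, "c2": 9})),
]

# Candidate questions: one about the language, one about the phonetic context.
questions = {
    "language is German?": lambda s: s[0] == "German",
    "left context is /b/?": lambda s: s[1] == "b",
}

gains = {name: split_gain(samples, q) for name, q in questions.items()}
best_question = max(gains, key=gains.get)
print(best_question, gains)
```

In this toy setup the language question yields the larger gain, so the node would be split by language and the two languages would not share parameters for this phone; where a context question wins instead, the resulting models are shared across languages.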



A Method for Measuring the Intelligibility and Nonnativeness of Phone Quality in Foreign Language Pronunciation Training

Authors:

Goh Kawai, University of Tokyo (Japan)
Keikichi Hirose, University of Tokyo (Japan)

Paper number 782

Abstract:

This paper addresses the problem of automatically detecting, measuring and correcting nonnative pronunciation characteristics (so-called "foreign accents") in foreign language speech. Systemic, structural and realizational differences between L1 (native language) and L2 (target language) appear as phone insertions, deletions and substitutions. A bilingual phone recognizer using native-trained acoustic models of the learner's L1 and L2 was developed to identify insertions, deletions and substitutions of L2 phones. Recognition results are combined with knowledge of phonetics, phonology and pedagogy to show learners which phones were mispronounced and to instruct them on how to modify their articulatory gestures for more native-sounding speech. The degree of the learner's foreign accent is measured by the number of alternate pronunciations the learner uses; this number decreases as learning progresses. Evaluation experiments using Japanese and American English indicate that the system is an effective component technology for computer-aided pronunciation learning.
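
To make the notion of phone insertions, deletions and substitutions concrete, here is a minimal sketch that aligns a canonical L2 phone sequence with a recognized phone sequence using standard edit-distance alignment. This is an assumption made for illustration: the abstract does not say how the recognizer output is compared with the target pronunciation. The function names `align_phones` and `count_errors`, the phone symbols and the example sequences are all hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def align_phones(canonical: List[str], recognized: List[str]) -> List[Tuple[str, str, str]]:
    """Levenshtein alignment; returns (operation, canonical phone, recognized phone)."""
    n, m = len(canonical), len(recognized)
    # cost[i][j] = minimum number of edits turning canonical[:i] into recognized[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (canonical[i - 1] != recognized[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover the edit operations.
    ops: List[Tuple[str, str, str]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (canonical[i - 1] != recognized[j - 1])):
            op = "match" if canonical[i - 1] == recognized[j - 1] else "substitution"
            ops.append((op, canonical[i - 1], recognized[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("deletion", canonical[i - 1], ""))
            i -= 1
        else:
            ops.append(("insertion", "", recognized[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def count_errors(ops: List[Tuple[str, str, str]]) -> Counter:
    """Count insertions, deletions and substitutions, ignoring matches."""
    return Counter(op for op, _, _ in ops if op != "match")


# Hypothetical example: target "street" with vowels inserted after consonants.
canonical = ["s", "t", "r", "iy", "t"]
recognized = ["s", "u", "t", "o", "r", "iy", "t", "o"]
ops = align_phones(canonical, recognized)
errors = count_errors(ops)
print(ops)
print(errors)
```

The per-phone operations could then drive feedback of the kind the abstract describes, for example pointing the learner to each inserted vowel. How the actual system maps recognition results to instructions is not specified here.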
