Text-To-Speech Synthesis 2

ICSLP'98 Proceedings

Prosody Prediction for Speech Synthesis using Transformational Rule-based Learning

Authors:

Cameron S. Fordyce, Lernout & Hauspie Speech Products (USA)
Mari Ostendorf, Dept. of ECE, Boston University (USA)

Page (NA) Paper number 682

Abstract:

Speech generation systems can benefit from the prediction of abstract prosodic labels from text input. Earlier methods of prosodic label prediction have relied on hand-written rules or on statistical methods such as decision trees. Statistical methods have the advantage of being automatically trainable and portable to new domains. This research presents a new, automatically trainable method for building an abstract prosodic label predictor: transformational rule-based learning. Results will be presented for pitch accent location and phrase boundary prediction.
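The general idea of transformational rule-based learning can be illustrated with a minimal sketch: start from a default labelling, then greedily learn an ordered list of rules, each chosen because it most reduces the error against hand-labelled data. The single rule template used here (part-of-speech tag plus current label), the label names "none" and "accent", and the function names are assumptions made for illustration; the abstract does not specify the features or templates the authors used.

```python
from typing import List, Tuple

# Hypothetical rule template: (pos_tag, from_label, to_label), read as
# "if the word has this POS tag and currently carries from_label, relabel it".
Rule = Tuple[str, str, str]


def apply_rule(rule: Rule, tags: List[str], labels: List[str]) -> List[str]:
    """Return a new label sequence with the rule applied at every matching word."""
    pos, src, dst = rule
    return [dst if (t == pos and l == src) else l for t, l in zip(tags, labels)]


def count_errors(labels: List[str], gold: List[str]) -> int:
    """Number of positions where the current labels disagree with the gold labels."""
    return sum(1 for l, g in zip(labels, gold) if l != g)


def learn_rules(tags: List[str], gold: List[str],
                initial_label: str = "none", max_rules: int = 10) -> List[Rule]:
    """Greedily pick, one at a time, the rule that most reduces training error."""
    labels = [initial_label] * len(tags)
    label_set = sorted(set(gold) | {initial_label})
    learned: List[Rule] = []
    for _ in range(max_rules):
        best_rule, best_err = None, count_errors(labels, gold)
        for pos in sorted(set(tags)):
            for src in label_set:
                for dst in label_set:
                    if src == dst:
                        continue
                    candidate = (pos, src, dst)
                    err = count_errors(apply_rule(candidate, tags, labels), gold)
                    if err < best_err:  # keep only strict improvements
                        best_rule, best_err = candidate, err
        if best_rule is None:  # no rule improves the labelling; stop
            break
        labels = apply_rule(best_rule, tags, labels)
        learned.append(best_rule)
    return learned


def predict(rules: List[Rule], tags: List[str],
            initial_label: str = "none") -> List[str]:
    """Label new text: default labels first, then the learned rules in order."""
    labels = [initial_label] * len(tags)
    for rule in rules:
        labels = apply_rule(rule, tags, labels)
    return labels
```

Because the learned rules are an ordered, human-readable list, they can be inspected and edited by hand, which is one commonly cited attraction of this family of methods compared with decision trees.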

SL980682.PDF (From Author) SL980682.PDF (Rasterized)



Representing the Environments for Phonological Processes in an Accent-Independent Lexicon for Synthesis of English

Authors:

Susan Fitt, Centre for Speech Technology Research, University of Edinburgh (U.K.)
Stephen Isard, Centre for Speech Technology Research, University of Edinburgh (U.K.)

Page (NA) Paper number 850

Abstract:

This paper reports on work developing an accent-independent lexicon for use in synthesising speech in English. Developing a lexicon for a new accent is a long process, and one potential solution to this problem involves the encoding of regional variation by means of keywords; so, rather than transcribing different phonemes for 'pool' in RP and in Scottish accents, we can simply say that the word contains the same vowel as in the keyword GOOSE. However, there are a number of theoretical and practical issues, which are discussed here. It is proposed that phonemic variation within accents be encoded in the lexicon by use of keyword symbols, while allophonic differences be derived by accent-specific rules. Including some stylistic variation makes the lexicon more comprehensive but also more complex. Finally, it is noted that even in keyword synthesis, exception lists cannot be avoided.
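The keyword idea can be sketched as a two-stage lookup: the lexicon stores keyword symbols such as GOOSE in place of vowels, and a per-accent table expands each keyword into a phoneme. Everything in this sketch is an assumption for illustration, including the table names, the two-word lexicon, the empty exception lists, and the simplified phoneme symbols; it is not the authors' lexicon format.

```python
from typing import Dict, List

# Accent-independent entries: consonants are given directly, vowels as keywords.
LEXICON: Dict[str, List[str]] = {
    "pool": ["p", "GOOSE", "l"],
    "pull": ["p", "FOOT", "l"],
}

# Per-accent expansion of keywords into phoneme symbols (illustrative values).
ACCENTS: Dict[str, Dict[str, str]] = {
    "rp": {"GOOSE": "uː", "FOOT": "ʊ"},
    "scottish": {"GOOSE": "ʉ", "FOOT": "ʉ"},
}

# Per-accent exception lists for words the keyword scheme cannot capture.
# Left empty here; real entries would come from the lexicon developers.
EXCEPTIONS: Dict[str, Dict[str, List[str]]] = {"rp": {}, "scottish": {}}


def transcribe(word: str, accent: str) -> List[str]:
    """Expand a keyword transcription into phonemes for one accent."""
    if word in EXCEPTIONS.get(accent, {}):
        return EXCEPTIONS[accent][word]  # an exception overrides the keywords
    mapping = ACCENTS[accent]
    # Symbols without a keyword mapping (the consonants here) pass through.
    return [mapping.get(symbol, symbol) for symbol in LEXICON[word]]
```

With these tables, 'pool' and 'pull' receive different vowels for the "rp" accent and the same vowel for the "scottish" accent, without the lexicon entries themselves changing. Accent-specific allophonic rules would be applied as a further step after this expansion.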

SL980850.PDF (From Author) SL980850.PDF (Rasterized)



Efficient Lexical Retrieval for English Text-to-Speech Synthesis

Authors:

Daniel Faulkner, Aculab PLC (U.K.)
Charles Bryant, Aculab PLC (U.K.)

Page (NA) Paper number 91

Abstract:

We present a first version of a filter dictionary for use in a computer-telephony text-to-speech synthesis system. The aim of the filter dictionary was to provide a lexicon that was compact, fast and had broader coverage than the standard dictionary used to create it. Correct phonemic transcriptions and lexical stress assignment were both required for a transcription to be deemed accurate. The approach taken here guarantees 100% accurate coverage of the original dictionary, and also gives 93% accurate transcription of the expected coverage of novel words. Lexical stress and the phonemic transcription were retrieved in one pass, resulting in an extremely fast system. We also allowed user-defined entries so that accuracy is retained for non-standard transcriptions. This algorithm was developed for British English, but could be applied to other languages.
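The abstract does not describe the internal structure of the filter dictionary, so the following is only one possible reading of the lookup order it implies: user-defined entries first, then stored entries, then a generalising fallback for novel words, with stress and phonemes returned together in a single call. The function names, the stress marker "'", and the toy fallback are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

# A transcription carries stress marks and phoneme symbols in one sequence,
# so a single lookup returns both.
Transcription = List[str]


def make_lookup(
    base: Dict[str, Transcription],
    fallback: Callable[[str], Transcription],
    user: Optional[Dict[str, Transcription]] = None,
) -> Callable[[str], Transcription]:
    """Build a lookup function with user > base > fallback precedence."""
    user_entries = dict(user or {})

    def lookup(word: str) -> Transcription:
        key = word.lower()
        if key in user_entries:   # user-defined, non-standard transcriptions
            return user_entries[key]
        if key in base:           # entries taken from the original dictionary
            return base[key]
        return fallback(key)      # novel words: generalise from the spelling

    return lookup


def toy_fallback(word: str) -> Transcription:
    """Placeholder only: stress mark followed by the letters themselves."""
    return ["'"] + list(word)
```

A usage example under the same assumptions: `make_lookup({"cat": ["'", "k", "a", "t"]}, toy_fallback)("Cat")` returns the stored entry, while an unknown word falls through to `toy_fallback`.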

SL980091.PDF (From Author) SL980091.PDF (Rasterized)
