Authors:
Cameron S. Fordyce, Lernout & Hauspie Speech Products (USA)
Mari Ostendorf, Dept. of ECE, Boston University (USA)
Paper number 682
Abstract:
Speech generation systems can benefit from the prediction of abstract
prosodic labels from text input. Earlier methods of prosodic label
prediction have relied on hand-written rules or on statistical methods
such as decision trees. Statistical methods have the advantage of being
automatically trainable and are portable to new domains. This research
presents a new method for automatically training an abstract prosodic
label predictor, transformational rule-based learning. This method
is automatically trainable. Results will be presented for pitch accent
location and phrase boundary prediction.
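Transformational rule-based learning, in the Brill-style formulation this abstract refers to, starts from a baseline labelling and greedily learns an ordered list of error-correcting rules. The following is a minimal sketch, not the paper's actual system: the words, labels, feature conditions, and candidate rules are all invented for illustration.

```python
# Hypothetical sketch of transformational (Brill-style) rule-based learning
# for pitch-accent prediction. All data and rules below are invented.

def apply_rule(labels, words, rule):
    """Apply one transformation: (from_label, to_label, condition)."""
    from_lab, to_lab, cond = rule
    return [to_lab if lab == from_lab and cond(words, i) else lab
            for i, lab in enumerate(labels)]

def learn_rules(words, gold, labels, candidate_rules, max_rules=10):
    """Greedily pick the rule that most reduces errors; repeat."""
    learned = []
    for _ in range(max_rules):
        errors = sum(a != b for a, b in zip(labels, gold))
        best, best_err = None, errors
        for rule in candidate_rules:
            new = apply_rule(labels, words, rule)
            err = sum(a != b for a, b in zip(new, gold))
            if err < best_err:
                best, best_err = rule, err
        if best is None:          # no rule improves: stop
            break
        labels = apply_rule(labels, words, best)
        learned.append(best)
    return learned, labels

# Toy example: baseline accents every content word; one candidate rule
# de-accents a word when the next word is "on".
words = ["the", "cat", "sat", "on", "the", "mat"]
FUNCTION = {"the", "on"}
baseline = ["unaccented" if w in FUNCTION else "accented" for w in words]
gold = ["unaccented", "accented", "unaccented",
        "unaccented", "unaccented", "accented"]
rules = [("accented", "unaccented",
          lambda ws, i: i + 1 < len(ws) and ws[i + 1] == "on")]
learned, final = learn_rules(words, gold, baseline, rules)
```

The appeal of the approach is that the learned rule list is both automatically trainable and human-readable, unlike decision-tree splits.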
Authors:
Susan Fitt, Centre for Speech Technology Research, University of Edinburgh (U.K.)
Stephen Isard, Centre for Speech Technology Research, University of Edinburgh (U.K.)
Paper number 850
Abstract:
This paper reports on work developing an accent-independent lexicon
for use in synthesising speech in English. Developing a lexicon for
a new accent is a long process, and one potential solution to this
problem involves the encoding of regional variation by means of keywords;
so, rather than transcribing different phonemes for 'pool' in RP and
in Scottish accents, we can simply say that the word contains the same
vowel as in the keyword GOOSE. However, there are a number of theoretical
and practical issues, which are discussed here. It is proposed that
phonemic variation within accents be encoded in the lexicon by use
of keyword symbols, while allophonic differences be derived by accent-specific
rules. Including some stylistic variation makes the lexicon more comprehensive
but also more complex. Finally, it is noted
that even in keyword synthesis exception lists cannot be avoided.
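The keyword mechanism described above can be sketched as a lexicon whose entries mix ordinary symbols with keyword symbols, plus a per-accent table realising each keyword. This is only an illustration of the idea, not the authors' lexicon: the phoneme symbols, accent tables, and exception entry are all invented.

```python
# Hypothetical sketch of an accent-independent lexicon using keywords
# such as GOOSE. All symbols and entries below are invented.

LEXICON = {"pool": ["p", "GOOSE", "l"],
           "foot": ["f", "FOOT", "t"]}

KEYWORDS = {"GOOSE", "FOOT"}

# Per-accent realisation of each keyword vowel.
ACCENT_MAP = {
    "RP":       {"GOOSE": "u:", "FOOT": "U"},
    "Scottish": {"GOOSE": "u",  "FOOT": "u"},  # FOOT and GOOSE merge
}

# Exception lists cannot be avoided even with keywords; this entry is
# invented purely to show the mechanism.
EXCEPTIONS = {("Scottish", "pool"): ["p", "y", "l"]}

def transcribe(word, accent):
    if (accent, word) in EXCEPTIONS:
        return EXCEPTIONS[(accent, word)]
    table = ACCENT_MAP[accent]
    return [table[s] if s in KEYWORDS else s for s in LEXICON[word]]
```

As the abstract proposes, phonemic variation across accents lives in the keyword tables, while allophonic detail would be handled by separate accent-specific rules applied to the output.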
Authors:
Daniel Faulkner, Aculab PLC (U.K.)
Charles Bryant, Aculab PLC (U.K.)
Paper number 91
Abstract:
We present a first version of a filter dictionary for use in a computer-telephony
text-to-speech synthesis system. The aim of the filter dictionary was
to provide a lexicon that was compact and fast, with broader coverage
than the standard dictionary used to create it. Correct phonemic transcriptions
and lexical stress assignment were both required for a transcription
to be deemed accurate. The approach taken here guarantees 100% accurate
coverage of the original dictionary and also gives 93% accurate transcription
of novel words within the expected coverage. Lexical stress and the phonemic
transcription were retrieved in one pass, resulting in an extremely
fast system. We also allowed user-defined entries, to retain accuracy for
non-standard transcriptions. The algorithm was developed for British
English, but could be applied to other languages.
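The lookup behaviour described, guaranteed coverage of the source dictionary, generalisation to novel words, one-pass retrieval of phonemes and stress together, and user overrides, can be sketched as below. This is an assumption-laden illustration, not the Aculab system: the tables, transcriptions, and suffix generalisations are all invented.

```python
# Hypothetical sketch of a filter dictionary. Exact entries guarantee
# 100% coverage of the source dictionary; suffix generalisations (here
# hand-written, in practice derived from it) handle novel words. A single
# lookup returns both the phonemes and the stress pattern.

EXACT = {"nation": ("n ei1 sh @ n", "10")}   # invented transcription
USER = {}                                    # user-defined overrides
SUFFIX = {"ation": ("ei1 sh @ n", "10"),     # invented generalisations
          "ing":   ("i ng", "0")}

def add_user_entry(word, phonemes, stress):
    """User-defined entries retain accuracy for non-standard words."""
    USER[word] = (phonemes, stress)

def transcribe(word):
    # Overrides first, then the exact entries from the source dictionary.
    for table in (USER, EXACT):
        if word in table:
            return table[word]
    # Fall back to the longest matching suffix rule for novel words.
    for i in range(len(word)):
        suffix = word[i:]
        if suffix in SUFFIX:
            return SUFFIX[suffix]
    return None
```

Because phonemes and stress are stored as one record, a single dictionary probe suffices, which is the source of the speed the abstract claims.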