Session Th3D Language-Specific Systems

Chairperson: Christel Sorin, CNET, Lannion, France

A KEYVOWEL APPROACH TO THE SYNTHESIS OF REGIONAL ACCENTS OF ENGLISH

Authors: Briony Williams and Stephen Isard

Centre for Speech Technology Research, University of Edinburgh, 80 South Bridge, Edinburgh EH1 1HN, Scotland, UK. Tel. +44 131 650 2790, Fax: +44 131 650 6351, E-mail: briony@cstr.ed.ac.uk

Volume 5 pages 2435 - 2438

ABSTRACT

Most English text-to-speech synthesisers offer one of only two accents: General American or RP. Developing a new accent is laborious, since it is not possible to choose one accent as a base form and systematically translate to others. We use the approach of Wells ([1]), categorising vowels in terms of abstract keywords that encode classes of words. Thus it is unnecessary to use a phonemic transcription in either the development or the execution of a synthesiser. The "keyvowel" system can be used throughout the synthesis system, avoiding the need to make accent-specific changes manually. The same linguistic resources can be re-used for each new accent. More fundamentally, the keyvowel system functions as a meta-accent that subsumes vowel-related information in all accents of English.
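As a minimal sketch of the keyvowel idea, the lexicon below stores each vowel as an abstract keyword, and a per-accent table resolves the keyword only at output time. The keywords TRAP, BATH and LOT are from Wells's lexical sets; the names `ACCENT_VOWELS`, `LEXICON` and `realise`, the three-word lexicon, and the use of IPA symbols as the output are assumptions made for illustration and are not taken from the paper, whose system avoids phonemic transcription altogether.

```python
# Hypothetical accent tables: keyvowel -> accent-specific vowel (IPA, illustrative only).
ACCENT_VOWELS = {
    "RP": {"TRAP": "æ", "BATH": "ɑː", "LOT": "ɒ"},
    "GenAm": {"TRAP": "æ", "BATH": "æ", "LOT": "ɑ"},
}

# Hypothetical accent-neutral lexicon: consonants are literal symbols,
# vowels are keyvowel names shared by all accents.
LEXICON = {
    "trap": ["t", "r", "TRAP", "p"],
    "bath": ["b", "BATH", "θ"],
    "lot": ["l", "LOT", "t"],
}


def realise(word: str, accent: str) -> str:
    """Resolve the keyvowels of `word` using the table for `accent`."""
    table = ACCENT_VOWELS[accent]
    # Symbols that are not keyvowels (the consonants) pass through unchanged.
    return "".join(table.get(symbol, symbol) for symbol in LEXICON[word])


if __name__ == "__main__":
    for accent in ACCENT_VOWELS:
        print(accent, [realise(word, accent) for word in LEXICON])
```

Adding an accent in this sketch means adding one table to `ACCENT_VOWELS`; the lexicon itself is left untouched, which is the kind of resource re-use the abstract describes.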

A0013.pdf

Experimental Implementation of Pitch-Synchronous Synthesis Methods for the ROMVOX Text-to-Speech System

Authors: Attila Ferencz (1), Radu Arsinte (1), Istvan Nagy (3), Teodora Ratiu (1), Maria Ferencz (1), Gavril Toderean (2), Diana Zaiu (2), Tünde-Csilla Kovacs (1), Lajos Simon (1)

(1) Software ITC S.A., 109 Gh. Bilascu Street, 3400, Cluj-Napoca, Romania, Phone: +40-64-1976681, Fax: +40-64-196787, E-mail: ferencz@utcluj.ro; (2) Technical University of Cluj-Napoca; (3) Music Academy "Gh. Dima" of Cluj-Napoca

Volume 5 pages 2439 - 2442

ABSTRACT

The LPC-MPE synthesis method is an alternative that yields better quality in the generated speech signal and can be easily implemented in speech coding-decoding systems. Using the method in text-to-speech systems is more difficult, because the synthesized speech signal must be modified in order to superimpose prosodic effects. This paper presents our steps in this direction, together with research and experimental results obtained in adapting the system to the pitch-synchronous LPC-MPE method.

A0449.pdf

THE BELL LABS GERMAN TEXT-TO-SPEECH SYSTEM: AN OVERVIEW

Authors: Bernd Möbius, Richard Sproat, Jan P. H. van Santen, Joseph P. Olive

Bell Labs – Lucent Technologies, 600 Mountain Avenue, Murray Hill, NJ 07974, USA, {bmo,rws,jphvs,jpo}@research.bell-labs.com

Volume 5 pages 2443 - 2446

ABSTRACT

In this paper we present an overview of the German version of the Bell Labs text-to-speech system, a high-quality concatenative synthesis system with extensive text analysis capabilities. We discuss problems of text analysis, and our solutions to these problems, including: the integration of text normalization tasks into linguistic text analysis; the capability to morphologically analyze compounds and unseen words; name analysis and pronunciation. We briefly describe the prosodic components of the text-to-speech system and their underlying duration and intonation models. Finally, the phonetically motivated structure of the acoustic inventory is presented.

A0479.pdf

The Generation of Regional Pronunciations of English for Speech Synthesis

Authors: Susan Fitt

Centre for Speech Technology Research, University of Edinburgh, 80 South Bridge, Edinburgh, UK. E-mail: sue@cstr.ed.ac.uk

Volume 5 pages 2447 - 2450

ABSTRACT

Most speech synthesisers and recognisers for English currently use pronunciation lexicons in standard British or American accents, but as use of speech technology grows there will be more demand for the incorporation of regional accents. This paper describes the use of rules to transform existing lexicons of standard British and American pronunciations to a set of regional British and American accents. The paper briefly describes some features of the regional accents in the project, and the framework used for generating pronunciations. Certain theoretical and practical problems are highlighted; for some of these, solutions are suggested, but it is shown that some difficulties cannot be resolved by automatic rules. However, although the method described cannot produce phonetic transcriptions with 100% accuracy, it is more accurate than using letter-to-sound rules, and faster than producing transcriptions by hand.
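To make the notion of a lexicon transformation rule concrete, here is a minimal sketch of one context-sensitive rule: deriving a non-rhotic pronunciation from a rhotic one by deleting /r/ wherever it is not followed by a vowel. The rule, the function name `drop_nonprevocalic_r`, the small vowel set and the example entries are all assumptions chosen for illustration; the abstract does not list the rules or the framework actually used.

```python
# Illustrative vowel inventory; a real lexicon would use a much fuller set.
VOWELS = {"ɑ", "æ", "i", "ɛ", "ə"}


def drop_nonprevocalic_r(phones: list[str]) -> list[str]:
    """Delete every "r" that is not immediately followed by a vowel."""
    result = []
    for index, phone in enumerate(phones):
        following = phones[index + 1] if index + 1 < len(phones) else None
        if phone == "r" and following not in VOWELS:
            continue  # word-final or pre-consonantal /r/ is dropped
        result.append(phone)
    return result


if __name__ == "__main__":
    # Hypothetical rhotic source entries, one list of symbols per word.
    rhotic_lexicon = {
        "car": ["k", "ɑ", "r"],
        "card": ["k", "ɑ", "r", "d"],
        "carry": ["k", "æ", "r", "i"],
    }
    for word, phones in rhotic_lexicon.items():
        print(word, drop_nonprevocalic_r(phones))
```

The sketch deliberately ignores accompanying effects such as vowel lengthening, and the reverse direction (restoring /r/ in a non-rhotic source lexicon) cannot be done from the phone string alone, which illustrates the kind of difficulty the abstract says automatic rules cannot always resolve.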

A0793.pdf

BELL LABORATORIES RUSSIAN TEXT-TO-SPEECH SYSTEM

Authors: Elena Pavlova, Yuri Pavlov, Richard Sproat, Chilin Shih, Jan P. H. van Santen

Bell Labs – Lucent Technologies, 700 Mountain Avenue, Murray Hill, NJ 07974, USA, {yuriy,rws,cls,jphvs}@research.bell-labs.com

Volume 5 pages 2451 - 2454

ABSTRACT

This paper describes the Bell Labs Russian text-to-speech system, a concatenative system with extensive text-analysis capabilities. The construction of Russian-specific modules will be discussed, including the text-analysis module, the acoustic inventory, the duration module, and the intonation module.

A0919.pdf

Recordings

A BILINGUAL TEXT-TO-SPEECH SYSTEM IN SPANISH AND CATALAN

Authors: Antonio Bonafonte, Ignasi Esquerra, Albert Febrer, Francesc Vallverdu

Universitat Politècnica de Catalunya, C/Jordi Girona 1-3, 08034 Barcelona, Spain, {antonio,ignasi,alfebrer,sisco}@gps.tsc.upc.es

Volume 5 pages 2455 - 2458

ABSTRACT

This paper summarises the text-to-speech system that has been developed over the last few years by the Speech Group of the Universitat Politècnica de Catalunya (UPC). The paper emphasises the language-dependent parts of the system: phonetic transcription, the prosodic module, and the synthesis units database. A distinctive feature of the system is that it is bilingual, i.e., it can speak either Spanish or Catalan. Some effort has been made to allow the reading of bilingual texts and to reduce the computational resources needed. In particular, the Spanish and Catalan speech databases are merged to reduce the memory requirements and the development effort. The system is being used by disabled people who suffer from oral disorders. To add variability to the voices, some experiments in voice transformation have been carried out using the TD-PSOLA algorithm.

A1283.pdf