Session TMb Speech Synthesis Techniques

Chairperson: Rolf Carlson, KTH, Sweden



OPTIMISING UNIT SELECTION WITH VOICE SOURCE AND FORMANTS IN THE CHATR SPEECH SYNTHESIS SYSTEM

Authors: Wen Ding and Nick Campbell

ATR Interpreting Telecommunications Research Labs. 2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-02, Japan ding@itl.atr.co.jp

Volume 2 pages 537 - 540

ABSTRACT

High-quality corpus-based synthetic speech requires minimization of the prosodic and acoustic distortions between an ideal phoneme sequence and the actual waveform segments used to reproduce it. Our synthesis system concatenates phoneme-sized waveform segments, without signal processing, selected from a large-scale speech database according to both prosodic and phonetic contextual suitability. This paper describes an approach to optimising such unit selection by using voice source parameters and formant information instead of cepstral features. We present results showing that formants and voice source parameters are more effective as acoustic features for unit selection. These features can be estimated automatically from speech waveforms using the ARX joint estimation method. Results are compared with the mel-frequency cepstrum coefficients (MFCC) previously used for unit selection; both objective and subjective experiments showed that the new features outperformed the previous ones and confirmed that the synthesized speech sounded much more natural.
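As an illustrative sketch (not the paper's implementation), selection by acoustic distance amounts to minimising a weighted distance between each candidate unit's feature vector and the target's. The feature layout and weights below are hypothetical:

```python
import math

def target_cost(candidate, target, weights):
    """Weighted Euclidean distance between two acoustic feature vectors."""
    return math.sqrt(sum(w * (c - t) ** 2
                         for c, t, w in zip(candidate, target, weights)))

def select_unit(candidates, target, weights):
    """Return the candidate unit whose features best match the target."""
    return min(candidates, key=lambda c: target_cost(c, target, weights))

# Hypothetical feature vectors: [F1, F2, F3, open quotient]
target = [500.0, 1500.0, 2500.0, 0.60]
candidates = [[520.0, 1480.0, 2510.0, 0.58],
              [700.0, 1200.0, 2600.0, 0.70]]
weights = [1.0, 1.0, 1.0, 100.0]   # voice-source term up-weighted (arbitrary choice)
best = select_unit(candidates, target, weights)
```

Swapping MFCC vectors for formant/source vectors in such a scheme changes only the features and weights, which is what makes the comparison in the paper possible.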

A0077.pdf



A NEW FRAMEWORK TO PROVIDE HIGH-CONTROLLABILITY SPEECH SIGNAL AND THE DEVELOPMENT OF A WORKBENCH FOR IT

Authors: Masanobu ABE, Hideyuki MIZUNO, Satoshi TAKAHASHI and Shin'ya NAKAJIMA

NTT Human Interface Labs. 1-1 Hikarinooka Yokosuka-Shi Kanagawa 239 Japan Tel: +81 468 59 2547, Fax: +81 468 55 1054, E-mail: ave@nttspch.hil.ntt.co.jp

Volume 2 pages 541 - 544

ABSTRACT

This paper proposes a new framework to enhance the access to and control of speech signals. To enhance accessibility, the proposed framework assigns multi-layered tags such as orthographic and phonetic transcriptions. The tags also make it possible to precisely synchronize a speech signal with animation. In terms of control, the proposed framework provides hybrid speech, combining both human speech and speech synthesis-by-rule. Its quality ranges from simple TTS (the worst case) to encoded natural speech (the best case) depending on the resources available: texts, fundamental frequency (F0) contour, power contour, phoneme duration, and so on. To create speech messages based on the proposed framework, we developed a workbench employing speech synthesis and recognition techniques. Important features of the workbench are a powerful GUI (graphical user interface) with which to manipulate prosodic information and a function to synthesize speech in a trial-and-error manner. An evaluation based on creating speech messages demonstrates the good performance of the workbench.

A0151.pdf



SHAPE-INVARIANT PROSODIC MODIFICATION ALGORITHM FOR CONCATENATIVE TEXT-TO-SPEECH SYNTHESIS

Authors: Eduardo R. Banga, Carmen García-Mateo and Xavier Fernández-Salgado

Dpto. Tecnologías de las Comunicaciones. ETSI Telecomunicación. Universidad de Vigo. E-36200. Vigo. SPAIN e-mail: erbanga@tsc.uvigo.es carmen@tsc.uvigo.es xsalgado@tsc.uvigo.es

Volume 2 pages 545 - 548

ABSTRACT

Concatenative text-to-speech systems require an algorithm that allows prosodic modifications of the speech units during the concatenation process. Nowadays, sinusoidal modeling seems to be a promising technique for achieving very flexible algorithms that provide high-quality synthetic speech. The main difficulty with this type of algorithm is the treatment of the phase information, since inadequate processing of it gives rise to reverberation and audible artefacts. In this contribution we discuss the application of a shape-invariant sinusoidal model [1] to a text-to-speech system based on concatenation of speech units.

A0397.pdf



AN RNN-BASED SPECTRAL INFORMATION GENERATION FOR MANDARIN TEXT-TO-SPEECH

Authors: Shaw-Hwa Hwang*, Sin-Horng Chen@, and Saga Chang*

*E000/CCL, Industrial Technology Research Institute, Chutung, Hsinchu, Taiwan, R.O.C @Department of Communication Engineering, National Chiao Tung University, Taiwan, R.O.C email:hsf@porsche.ccl.itri.org.tw Tel:+886-3-5917255, Fax:+886-3-5820098

Volume 2 pages 549 - 552

ABSTRACT

In this paper, an RNN-based spectral model is proposed to generate spectral parameters for Mandarin text-to-speech (TTS). The RNN is employed to learn the relations between linguistic features and spectral parameters. The phoneme-to-spectral-parameter rules and the coarticulation rules between adjacent phones are automatically learned and memorized in the weights of the RNN, so the synthesized speech sounds more fluent and smooth. The RNN is divided into two parts. The first part is synchronized with the syllable and is expected to simulate the phoneme-to-spectral-parameter rules. The second part is synchronized with the frame and is expected to simulate the coarticulation rules between adjacent phones. The line spectrum pair (LSP) parameters and the normalized energy contour are taken as target values. After training on a large database, the synthetic LSP and energy contours match the original ones quite well. Moreover, an RNN-based prosodic model proposed in our previous study was combined with the spectral model to efficiently simulate both spectral and prosodic information generation. Lastly, an LPC-based Mandarin TTS was implemented to examine the performance of our spectral model. The synthetic speech sounds fluent and natural; the coarticulation effect between adjacent phones, which makes synthesized speech sound unfluent and echo-like, was reduced. However, given the simple structure of the LPC-based synthesizer, the clarity of the synthetic speech could be improved by using other spectral parameters as target values: for example, the modified mel-cepstrum parameters [5, 6, 7] or FFT-based spectral parameters could also be learned by the RNN to synthesize clearer speech. This is initial work on an RNN-based spectral model for text-to-speech. The model offers several advantages. First, the large memory space required for synthesis units in traditional TTS is replaced by the small memory space of the RNN's weights. Second, the coarticulation effect is alleviated, producing more fluent speech. Third, the RNN-based prosodic and spectral information generators [8, 9] can easily be combined to form a more compact RNN-based TTS system.

A0441.pdf



METHODS FOR OPTIMAL TEXT SELECTION

Authors: Jan P. H. van Santen 1 Adam L. Buchsbaum 2

1 Lucent Technologies – Bell Labs, 600 Mountain Ave., Murray Hill, NJ 07974, U.S.A., jphvs@research.bell-labs.com 2 AT&T Labs, 180 Park Ave., P.O. Box 971, Florham Park, NJ 07932-0971, U.S.A., alb@research.att.com

Volume 2 pages 553 - 556

ABSTRACT

Construction of both text-to-speech synthesis (TTS) and automatic speech recognition (ASR) systems involves the use of speech databases. These databases usually consist of read text, which means that one has significant control over the content of the database. Here we address how one can take advantage of this control by discussing a number of variants of "greedy" text selection methods and showing their application in a variety of examples.
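The greedy selection loop the abstract refers to can be sketched as follows. This is a toy illustration, not the authors' code; `letter_pairs` is a stand-in for a real unit extractor such as a diphone transcriber:

```python
def greedy_select(sentences, units_of):
    """Greedy text selection: repeatedly add the sentence covering the
    most units (e.g. diphones) not yet covered, until nothing new is gained."""
    covered, chosen = set(), []
    remaining = list(sentences)
    while remaining:
        best = max(remaining, key=lambda s: len(units_of(s) - covered))
        gain = units_of(best) - covered
        if not gain:          # no remaining sentence adds new units
            break
        covered |= gain
        chosen.append(best)
        remaining.remove(best)
    return chosen, covered

def letter_pairs(s):
    """Toy unit extractor: adjacent letter pairs stand in for diphones."""
    return {s[i:i + 2] for i in range(len(s) - 1)}

chosen, covered = greedy_select(["abcd", "abc", "cdef", "xy"], letter_pairs)
```

Note that "abc" is never chosen: once "abcd" is in, it contributes no new units, which is exactly the redundancy a greedy selector prunes.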

A0488.pdf



HIGH RESOLUTION PROSODY MODIFICATION FOR SPEECH SYNTHESIS

Authors: Francisco M. Gimenez de los Galanes and David Talkin

Entropic Research Laboratory, Inc. 600 Pennsylvania Ave. SE, Suite 202. Washington, DC. 20003 Tel. +1 202 547 1420, FAX: +1 202 546 6648, E-mail: galanes@entropic.com

Volume 2 pages 557 - 560

ABSTRACT

In this paper we introduce RTIPS, a system for arbitrary high-resolution modification of the prosodic variables of speech: fundamental frequency, rhythm (segmental duration) and intensity. It is based on the Resample and Overlap-Add (R-OLA) algorithm for fundamental frequency and duration modification of speech. The algorithm works pitch-synchronously in order to accurately modify the pitch contour, and it uses estimates of the glottal closure instants (epochs) as the synchronism marks. This technique is very similar to other OLA-based methods for time or pitch modification, but because of the introduction of the resampling step, voice quality (especially for high-pitched voices) is much more natural after resynthesis, at any given output sampling frequency. The reliability of the R-OLA algorithm is highly dependent on the accuracy of the method used for epoch detection, so this preprocessing step has to be carefully designed.
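The resampling step at the heart of R-OLA can be illustrated with a minimal sketch. Linear interpolation is assumed here for simplicity; the real system's anti-aliasing filtering and the overlap-add stage are omitted:

```python
def resample_period(cycle, new_len):
    """Linearly resample one pitch period to a new length.  Stretching or
    compressing a period changes the local F0; a subsequent overlap-add
    step (not shown) re-spaces the periods to restore segmental timing."""
    n = len(cycle)
    out = []
    for j in range(new_len):
        x = j * (n - 1) / (new_len - 1)   # position in the original period
        i = min(int(x), n - 2)
        frac = x - i
        out.append((1 - frac) * cycle[i] + frac * cycle[i + 1])
    return out

# Stretch a 4-sample period to 7 samples, lowering the local F0
stretched = resample_period([0.0, 1.0, 2.0, 3.0], 7)
```

Because the waveform inside each period is resampled rather than simply repeated or truncated, the spectral envelope scales consistently, which is the property the abstract credits for the improved quality on high-pitched voices.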

A0564.pdf



TEXT-TO-SPEECH CONVERSION WITH NEURAL NETWORKS: A RECURRENT TDNN APPROACH

Authors: O. Karaali, G. Corrigan, I. Gerson, and N. Massey

Speech Processing Laboratory Motorola, Inc. 1301 E. Algonquin Rd., Schaumburg, IL 60196, U.S.A. Tel. (847)576-2764, FAX: (847)576-8378, E-mail: karaali@mot.com

Volume 2 pages 561 - 564

ABSTRACT

This paper describes the design of a neural network that performs the phonetic-to-acoustic mapping in a speech synthesis system. The use of a time-domain neural network architecture limits discontinuities that occur at phone boundaries. Recurrent data input also helps smooth the output parameter tracks. Independent testing has demonstrated that the voice quality produced by this system compares favorably with speech from existing commercial text-to-speech systems.

A0573.pdf



DATA DRIVEN FORMANT SYNTHESIS

Authors: Jesper Hogberg

Department of Speech, Music and Hearing, KTH, S-10044 Stockholm, Sweden Tel. +46 8 790 78 94, FAX: +46 8 790 78 54, E-mail: Jesper.Hogberg@speech.kth.se

Volume 2 pages 565 - 568

ABSTRACT

In this study we introduce combined data driven and rule based methods to synthesise speech. The aim is to improve on the coarticulatory modelling by adapting the KTH TTS system to data from one speaker. Regression trees are trained on a manually corrected speech database to provide predictions for vowel formant frequencies. At runtime, the TTS system produces formant frequency trajectories that are derived from weighted contributions from both the rules and the regression trees. The weighting strategy allows flexible adjustment of the synthesis parameters and thus of the quality of the output speech. An informal perceptual test was conducted to compare the performance of the hybrid approach to that of the traditional rule based system. A great majority of the test subjects judged the speech output of the hybrid system to be more natural than the competing rule derived speech. The speech produced by the hybrid system was also generally preferred.

A0588.pdf



SPEECH SYNTHESIS USING NON-UNIFORM UNITS IN THE VERBMOBIL PROJECT

Authors: Simon King (1), Thomas Portele, Florian Hofer

Institut fur Kommunikationsforschung und Phonetik (IKP), Universitat Bonn Poppelsdorfer Allee 47, D-53115 Bonn, Germany http://www.ikp.uni-bonn.de (1) now at the Centre for Speech Technology Research, University of Edinburgh, 80, South Bridge, Edinburgh EH1 1HN, GB http://www.cstr.ed.ac.uk email: Simon.King@ed.ac.uk

Volume 2 pages 569 - 572

ABSTRACT

We describe a concatenative speech synthesiser for British English which uses the HADIFIX [8] inventory structure originally developed for German by Portele. An inventory of non-uniform units was investigated with the aim of improving segmental quality compared to diphones. A combination of soft (diphone) and hard concatenation was used, which allowed a dramatic reduction in inventory size. We also present a unit selection algorithm which selects an optimum sequence of units from this inventory for a given phoneme sequence. The work described is part of the concept-to-speech synthesiser for the language and speech project Verbmobil [12] which is funded by the German Ministry of Science (BMBF).

A0629.pdf



ON THE PRONUNCIATION MODE OF ACRONYMS IN SEVERAL EUROPEAN LANGUAGES

Authors: I. Trancoso and M. C. Viana

(1) INESC/IST, (2) CLUL INESC, R. Alves Redol, 9, 1000 Lisbon, Portugal. Tel. +351 1 3100268, FAX: +351 1 3145843, E-mail: Isabel.Trancoso@inesc.pt

Volume 2 pages 573 - 576

ABSTRACT

The paper describes our research work concerning the pronunciation mode of acronyms in German, French, and Portuguese. Most of the rules are related to the well-formedness of the constituents and the minimum and maximum weight thresholds required for reading and spelling an acronym. The results of the tests for the three languages were considered very promising, reaching decision errors below 4%. The rule set was also applied to a very small English corpus, with relative success. We believe that further optimisation is still possible if language-specific parametrisation is taken into account, in particular for the languages where only a limited corpus of acronyms was available.

A0658.pdf



EVALUATION OF SPEECH SYNTHESIS SYSTEMS FOR DUTCH IN TELECOMMUNICATION APPLICATIONS IN GSM AND PSTN NETWORKS

Authors: T. Rietveld (1), J. Kerkhoff (1), M.J.W.M. Emons (2), E.J. Meijer (2), A.A. Sanderman (2), A.M.C. Sluijter (2)

(1) University of Nijmegen, the Netherlands, Erasmusplein 1, 6525 HT Nijmegen, The Netherlands, Tel. +31 24 3612905, E-mail: a.rietveld@let.kun.nl (2) KPN Research, Leidschendam, the Netherlands

Volume 2 pages 577 - 580

ABSTRACT

In this contribution the subjective evaluation of three text-to-speech systems (two diphone systems and one allophone system) is reported in two transmission conditions: standard telephone (PSTN) and GSM. The three TTS systems rendered three different types of text: travel information, stock-exchange reports and e-mail messages. The subjects had to carry out three tasks: a) give preference judgements on the three TTS systems, b) rate the readings on 16 five-point scales, and c) a transliteration task. The rank order on the scale of general quality was: public transport > stock exchange > e-mail reading, in both transmission conditions. The GSM transmission tends to decrease the perceptual scores on a number of subjective scales. In the transliteration task significantly more errors were made in the GSM condition than in the PSTN condition. In both conditions fewer errors were made with the diphone TTS systems than with the allophone system.

A0659.pdf



AUTOMATIC DIPHONE EXTRACTION FOR AN ITALIAN TEXT-TO-SPEECH SYNTHESIS SYSTEM

Authors: Bianca Angelini (*) , Claudia Barolo (**) , Daniele Falavigna (*) , Maurizio Omologo (*) and Stefano Sandri (***)

(*) IRST - Istituto per la Ricerca Scientifica e Tecnologica, 38050 Povo di Trento, Italy (**) Eikon Informatica, Via Sostegno 65/bis, 10146 Torino, Italy (***) CSELT - Centro Studi e Laboratori Telecomunicazioni S.p.A., Via G. Reiss Romoli 274, 10148 Torino, Italy

Volume 2 pages 581 - 584

ABSTRACT

This paper describes a system for the automatic extraction of diphone units from given speech utterances. The method is based on an automatic phonetic segmentation and on a subsequent rule-driven diphone boundary detection. The phonetic segmenter, developed at IRST, was trained and tested in both speaker-independent and speaker-dependent modes. A rule formalism, involving acoustic parameters and arithmetical and logical operators, was defined to express the acoustic/phonetic knowledge acquired during previous experience with manual diphone segmentation. A specialized tool for rule parsing was designed that processes a given sequence of automatically derived phone boundaries using a corresponding sequence of predefined acoustic parameters. Several sets of rules were developed that include both general principles and specific details concerning the content of the diphone database of "Eloquens"®, the CSELT text-to-speech synthesis system for the Italian language. The accuracy was evaluated by comparing the manual and the automatic segmentations of the speech utterances of a female speaker, resulting in nearly 95% correct boundary positions, given a tolerance of 20 ms.

A0688.pdf



SIMPLIFICATION OF TTS ARCHITECTURE VS. OPERATIONAL QUALITY

Authors: Eric Keller

Laboratoire d'analyse informatique de la parole (LAIP) Faculte des Lettres, Universite de Lausanne, Switzerland eric.keller@imm.unil.ch

Volume 2 pages 585 - 588

ABSTRACT

Many applications in mobile telephony and portable computing require high-quality speech synthesis systems with a very modest computational footprint. Our text-to-speech system for French gives satisfactory performance in phonetisation and prosody with considerably reduced computational resources. Using the Mons (Belgium) diphone database, the program's current version runs in real time on Pentium-type PCs or Mac PPCs. The code requires 442 k; the minimum RAM requirement is 4700 k and the minimum disk requirement is 5560 k. The phonetisation and prosody processing has been brought to a first level of optimal compromise between quality and computational footprint. Major further reductions in space requirements would probably necessitate a re-evaluation of the sound generation procedures.

A0735.pdf

Recordings



FELIX - A TTS SYSTEM WITH IMPROVED PRE-PROCESSING AND SOURCE SIGNAL GENERATION

Authors: Georg Fries and Antje Wirth

Deutsche Telekom Berkom GmbH Forschungsgruppe Sprachverarbeitung Am Kavalleriesand 3, D-64295 Darmstadt, Germany E-mail: {friesg, wirth}@tzd.telekom.de

Volume 2 pages 589 - 592

ABSTRACT

Felix is our recent PC-based TTS research system for testing, analyzing, and evaluating TTS algorithms. The object-oriented interface allows efficient algorithm improvement and overall system prototyping by combining different modules. The results of each TTS processing step can be monitored, and all kinds of data may be reviewed and modified. The paper outlines the algorithms currently implemented in the Felix system, focusing on lexical analysis, duration modeling, and source signal generation, where we suggest ways to improve the intelligibility and naturalness of synthetic speech.

A0741.pdf



INVESTIGATING THE LIMITATIONS OF CONCATENATIVE SYNTHESIS

Authors: M. Edgington

Speech Technology Unit Applied Research and Technology BT Laboratories, IPSWICH IP5 3RE, UK E-mail: mike.edgington@bt-sys.bt.co.uk

Volume 2 pages 593 - 596

ABSTRACT

Concatenative text-to-speech (TTS) systems are now quite widespread through the availability of simple time-domain speech modification algorithms. Many of these systems produce intelligible speech with a higher degree of naturalness than that achieved by the previous generation of formant synthesis systems. This perceived improvement in quality has led to the view in some circles that TTS is a solved problem, at least for many practical applications. Three experiments are reported in this paper, all performed with a concatenative TTS system. These experiments investigated aspects of the concatenative model by respectively addressing copy synthesis of emotional speech, modelling glottalisation, and the effect of speech database design on the quality of synthesised speech. This paper suggests that the lack of an explicit speech model in most concatenative synthesis strategies fundamentally limits the usefulness of many current systems to the relatively restricted task of 'neutral' spoken renderings of text, where deficiencies in other system components usually mask the limitations of the synthesis strategy itself.

A0743.pdf

Recordings



SPEECH CODING AND SYNTHESIS USING PARAMETRIC CURVES

Authors: Luis Miguel Teixeira de Jesus and Gavin C. Cawley

School of Information Systems, University of East Anglia, Norwich, U.K. E-mail: {lmj,gcc}@sys.uea.ac.uk

Volume 2 pages 597 - 600

ABSTRACT

Accurate modeling of co-articulation, the context-sensitive merging of the boundaries between allophones in continuous speech, is vital for natural-sounding speech synthesis. This paper describes initial research investigating the use of Bezier curves to form models of co-articulation in human speech. A 12th-order, pitch-synchronous line spectral pair (LSP) [1] analysis is performed on a corpus of 239 phonetically balanced sentences of English speech. The resulting data are divided to form an inventory of the diphones occurring in the speech database. The trajectory of each line spectral pair parameter through each diphone can then be represented by a single cubic Bezier curve segment, found using the Levenberg-Marquardt curve fitting method [2, 3]. Results are presented showing the accuracy of Bezier models of the coarticulation between different types of speech sounds.
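The fitting idea can be sketched as follows. Note the paper uses Levenberg-Marquardt; for a single scalar trajectory with pinned endpoints the problem is actually linear in the two inner control values, so ordinary least squares suffices for illustration:

```python
def bezier(p, t):
    """Evaluate a cubic Bezier segment with control values p at t in [0, 1]."""
    p0, p1, p2, p3 = p
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)

def fit_cubic_bezier(ys):
    """Fit one cubic Bezier segment to a 1-D trajectory (e.g. one LSP
    parameter through a diphone).  Endpoints are pinned to the first and
    last samples; the two inner control values are solved by least squares."""
    n = len(ys)
    ts = [i / (n - 1) for i in range(n)]
    p0, p3 = ys[0], ys[-1]
    b1 = [3 * (1 - t) ** 2 * t for t in ts]        # basis weight for P1
    b2 = [3 * (1 - t) * t ** 2 for t in ts]        # basis weight for P2
    r = [y - ((1 - t) ** 3 * p0 + t ** 3 * p3)     # residual after endpoint terms
         for y, t in zip(ys, ts)]
    # 2x2 normal equations for (P1, P2)
    a11 = sum(x * x for x in b1)
    a12 = sum(x * y for x, y in zip(b1, b2))
    a22 = sum(x * x for x in b2)
    c1 = sum(x * y for x, y in zip(b1, r))
    c2 = sum(x * y for x, y in zip(b2, r))
    det = a11 * a22 - a12 * a12
    return (p0, (c1 * a22 - c2 * a12) / det, (c2 * a11 - c1 * a12) / det, p3)

# Recover known control values from samples of the curve itself
true_p = (0.0, 2.0, 1.0, 3.0)
samples = [bezier(true_p, i / 9) for i in range(10)]
fitted = fit_cubic_bezier(samples)
```

Storing four control values per parameter per diphone, instead of every frame, is what makes the representation attractive for coding as well as synthesis.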

A0753.pdf



AUTOMATICALLY CLUSTERING SIMILAR UNITS FOR UNIT SELECTION IN SPEECH SYNTHESIS

Authors: Alan W Black and Paul Taylor

Centre for Speech Technology Research, University of Edinburgh, 80, South Bridge, Edinburgh, U.K. EH1 1HN http://www.cstr.ed.ac.uk email: awb@cstr.ed.ac.uk, Paul.Taylor@ed.ac.uk

Volume 2 pages 601 - 604

ABSTRACT

This paper describes a new method for synthesizing speech by concatenating sub-word units from a database of labelled speech. A large unit inventory is created by automatically clustering units of the same phone class based on their phonetic and prosodic context. For each target unit, the appropriate cluster is then selected, offering a small set of candidate units. An optimal path is found through the candidate units based on their distance from the cluster center and an acoustically based join cost. Details of the method and its justification are presented. The results of experiments using two different databases are given, optimising various parameters within the system. A comparison with other existing selection-based synthesis techniques is also given, showing the advantages of this method. The method is implemented within a full text-to-speech system, offering efficient, natural-sounding speech synthesis.
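The optimal-path search described above is a standard Viterbi search over the candidate lists. A minimal sketch follows (hypothetical scalar "units" and cost functions, not the authors' implementation):

```python
def select_path(candidates, target_cost, join_cost):
    """Viterbi search over per-target candidate lists: minimise the sum of
    target costs plus join costs between consecutive units."""
    # best[i][j] = (cumulative cost, backpointer) for candidate j at slot i
    best = [[(target_cost(0, c), None) for c in candidates[0]]]
    for i in range(1, len(candidates)):
        row = []
        for c in candidates[i]:
            cost, back = min(
                (best[i - 1][k][0] + join_cost(p, c), k)
                for k, p in enumerate(candidates[i - 1]))
            row.append((cost + target_cost(i, c), back))
        best.append(row)
    # Trace back from the cheapest final state
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    return list(reversed(path))

# Toy example: two target slots, scalar units, distance-based costs
targets = [1.0, 5.0]
path = select_path([[1.0, 2.0], [5.0, 6.0]],
                   lambda i, c: abs(c - targets[i]),
                   lambda a, b: 0.1 * abs(a - b))
```

In the clustering scheme of the paper, the target cost would be the candidate's distance from its cluster center and the join cost an acoustic mismatch measure at the concatenation point.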

A0790.pdf



IMPROVEMENTS ON A TRAINABLE LETTER-TO-SOUND CONVERTER

Authors: Li Jiang, Hsiao-Wuen Hon and Xuedong Huang

Microsoft Research One Microsoft Way Redmond, Washington 98052, USA

Volume 2 pages 605 - 608

ABSTRACT

Letter-to-sound (LTS) conversion is important for both text-to-speech (TTS) and automatic speech recognition (ASR). In this paper we discuss some improvements we have made to our trainable LTS converter. We use a classification and regression tree (CART) to automatically configure the most salient phonological rules needed for LTS conversion. We address problems in growing multiple trees and in the use of phonotactic information for better generalization. The experiments were carried out on both the NETTALK database and the CMU dictionary. With the improved techniques, the conversion error rates at the phoneme level and word level were reduced by 15% and 20% respectively. For both tasks, the phoneme conversion error rate was reduced to about 8%.

A0916.pdf




ON A CEPSTRAL PITCH ALTERATION TECHNIQUE FOR PROSODY CONTROL IN THE SPEECH SYNTHESIS SYSTEM WITH HIGH QUALITY

Authors: MyungJin BAE, KyuHong KIM and WonCheol LEE

Dept. of Telecommunication Engr, Soongsil University, Seoul 156-743, Korea mjbae@saint.soongsil.ac.kr

Volume 2 pages 609 - 612

ABSTRACT

Among speech synthesis techniques, waveform coding methods maintain the intelligibility and naturalness of synthetic speech. In order to apply waveform coding techniques to synthesis-by-rule, we must be able to alter the pitch of the synthetic speech. In this paper, we propose a new pitch-alteration method that compensates for the phase distortion of the cepstral pitch-alteration method with a time-scaling method in the time domain. This method removes some of the spectrum distortion that occurs at the conjunction points between waveforms. Spectrum distortion below 1.18% is obtained for pitch alteration of up to 200%.

A0963.pdf



Diphone Concatenation using a Harmonic plus Noise Model of Speech

Authors: Yannis Stylianou, Thierry Dutoit, and Juergen Schroeter

AT&T Labs-Research 180 Park Ave, PO Box 971 Florham Park, NJ 07932-0971 email: [styliano, dutoit, jsh]@research.att.com

Volume 2 pages 613 - 616

ABSTRACT

In this paper we present a high-quality text-to-speech system using diphones. The system is based on a Harmonic plus Noise Model (HNM) representation of the speech signal. HNM is a pitch-synchronous analysis-synthesis system, but it does not require the pitch marks that must be determined in PSOLA-based methods. HNM assumes the speech signal to be composed of a periodic part and a stochastic part. As a result, different prosody and spectral-envelope modification methods can be applied to each part, yielding more natural-sounding synthetic speech. The fully parametric representation of speech using HNM also provides a straightforward way of smoothing diphone boundaries. Informal listening tests, using natural prosody, have shown that the synthetic speech quality is close to that of the original sentences, without smoothing problems and without the buzziness or other oddities observed with other speech representations used for TTS.
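The periodic/stochastic decomposition can be illustrated at synthesis time with a minimal sketch. White noise stands in for the stochastic part here; real HNM uses a time-modulated, spectrally shaped noise component:

```python
import math, random

def hnm_synthesize(f0, harmonic_amps, n_samples, sample_rate,
                   noise_gain=0.0, seed=0):
    """Minimal harmonic-plus-noise synthesis: a sum of harmonics of f0
    (the periodic part) plus white noise scaled by noise_gain (the
    stand-in stochastic part)."""
    rng = random.Random(seed)
    out = []
    for i in range(n_samples):
        t = i / sample_rate
        periodic = sum(a * math.sin(2 * math.pi * (k + 1) * f0 * t)
                       for k, a in enumerate(harmonic_amps))
        out.append(periodic + noise_gain * rng.uniform(-1.0, 1.0))
    return out

# One harmonic at 100 Hz sampled at 400 Hz: a quarter-wave pattern 0, 1, 0, -1
frame = hnm_synthesize(100.0, [1.0], 4, 400)
```

Because the two parts are generated separately, pitch or spectral-envelope modification can be applied to the harmonic amplitudes alone without disturbing the noise component, which is the flexibility the abstract highlights.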

A1281.pdf
