

 
TABLE OF CONTENTS
 
VOLUME 1
 
KEYNOTE SPEECHES

 

Keynote Speech 1: Is Syntactic Structure Prosodically Recoverable? KN-1

Speaker: Mario Rossi, Institut de Phonetique d'Aix-en-Provence, Laboratoire Parole et Langage, France

Chair: Louis Pols, ESCA, Univ. of Amsterdam, The Netherlands

 

Keynote Speech 2: Conversational Interfaces: Advances and Challenges KN-9

Speaker: Victor Zue, MIT, USA

Chair: Paul Dalsgaard, Aalborg University, Denmark

 

Keynote Speech 3: Prosodic Modelling in Text-to-Speech Synthesis KN-19

Speaker: Jan P. H. van Santen, Bell Labs-Lucent Technologies, USA

Chair: Hiroya Fujisaki, Science Univ. of Tokyo, Japan

 

Keynote Speech 4: Robust Speech Recognition: Review and Perspectives

Chair: Joseph Mariani, LIMSI-CNRS, France

 

Speech 4A (8:35-9:00): Impact of the Unknown Communication Channel on Automatic Speech Recognition: A Review KN-29

Speaker: Jean-Claude Junqua, Panasonic Technologies Inc., USA

 

Speech 4B (9:00-9:25): Statistical Techniques for Robust ASR: Review and Perspectives KN-33

Speaker: Jerome Bellegarda, Apple Computer, USA

 

Speech 4C (9:25-9:50): Using Missing Feature Theory to Actively Select Features for Robust Speech Recognition with Interruptions, Filtering and Noise KN-37

Speaker: Richard Lippmann, Lincoln Laboratory MIT, USA

 

Keynote Speech 5: Perspectives of Speech Technology Research Highlighted in Eurospeech '97

Speaker: Wolfgang Hess, Univ. of Bonn, Germany

Chair: George Kokkinakis, WCL, Univ. of Patras, Greece

 

SESSION: M4A

Acoustic Modelling I

Chair: Roger Moore, DRA, UK

 

M4A.1 Using Multiple Time Scales in a Multi-Stream Speech Recognition System 3

Stephane Dupont, *Hervé Bourlard

FPMs-TCTS, Belgium

*IDIAP, Switzerland

 

M4A.2 Speech Recognition Using HMM-State Confusion Characteristics 7

Yumi Wakita, Harald Singer, Yoshinori Sagisaka

ATR Interpreting Telecommunications Res. Labs., Japan

 

 

M4A.3 Bottom-up and Top-down State Clustering for Robust Acoustic Modeling 11

Cristina Chesta, Pietro Laface, *Franco Ravera

Politecnico di Torino, Italy

*Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy

 

M4A.4 Comparison of Optimization Methods for Discriminative Training Criteria 15

Ralf Schluter, W. Macherey, S. Kanthak, Hermann Ney, Lutz Welling

RWTH Aachen, Germany

 

M4A.5 Clustering Beyond Phoneme Contexts for Speech Recognition 19

Clark Z. Lee, Douglas O'Shaughnessy

INRS Telecommunications, Canada

 

M4A.6 Influence of Outliers in Training the Parametric Trajectory Models for Speech Recognition 23

Rathinavelu Chengalvarayan

Bell Labs-Lucent Technologies, USA

 

SESSION: M4B

Dynamic Articulatory Measurements

Chair: René Carre, ENST, France

 

M4B.1 Adaptation of Natural Articulatory Movements to the Control of the Command Parameters of a Production Model 27

Laurence Candille, Henri Meloni

CERI, France

 

M4B.2 Three-Dimensional Coarticulatory Strategies of Tongue Movement 31

Maureen Stone, *Andrew Lundberg, *Edward Davis, Rao Gullapalli, Moriel NessAiver

Univ. of Maryland, USA

*Johns Hopkins Univ., USA

 

M4B.3 From Laryngographic and Acoustic Signals to Voicing Gestures 35

Nathalie Parlangeau, Regine André-Obrecht

IRIT - Equipe IHMPT, France

 

M4B.4 Ultrasonographic Measurement of Cricothyroid Space in Speech 39

Erkki Vilkman, Raija Takalo, Taisto Maatta, *Anne-Maria Laukkanen, Jaana Nummenranta, Tero Lipponen

Univ. of Oulu, Finland

*Univ. of Tampere, Finland

 

M4B.5 Coarticulation and Articulatory Compensations Studied by Dynamic MRI 43

Didier Demolin, M. George, V. Lecuit, *T. Metens, ‡A. Soquet, †H. Raeymaekers

Univ. Libre de Bruxelles, Belgium

*Magnetic Resonance Unit, Hopital Erasme

†Philips Medical Systems, Belgium

‡Collaborateur Scientifique-FNRS

 

 

M4B.6 Determining Tongue Articulation: From Discrete Fleshpoints to Continuous Shadow 47

Pierre Badin, Enrico Baricchi, Anne Vilain

Institut de la Communication Parlée, France

 

SESSION: M4C

Language Identification

Chair: Marc Zissman, MIT Lincoln Laboratory, USA

 

M4C.1 Predicting, Diagnosing and Improving Automatic Language Identification Performance 51

Marc A. Zissman

MIT Lincoln Laboratory, USA

 

M4C.2 Language Identification with Language-Independent Acoustic Models 55

Cristobal Corredor-Ardoy, Jean Luc Gauvain, Martine Adda-Decker, Lori Lamel

LIMSI, France

 

M4C.3 Bayesian Methods for Language Verification 59

Eluned S. Parris, Harvey Lloyd-Thomas, Michael Carey, *Jerry H. Wright

Ensigma Ltd, UK

*Bristol Univ., UK

 

M4C.4 Use of Recurrent Network for Unknown Language Rejection in Language Identification System 63

Hingkeung Kwan, Keikichi Hirose

Univ. of Tokyo, Japan

 

M4C.5 Language-Identification Based on Cross-Language Acoustic Models and Optimized Information Combination 67

Ove Andersen, Paul Dalsgaard

Aalborg Univ., Denmark

 

M4C.6 Phonetic-Context Mapping in Language Identification 71

Jiri Navratil, Werner Zuehlke

Technische Univ. Ilmenau, Germany

 

SESSION: M4D

Neural Networks for Speech and Language Processing

Chair: Wolfgang Hess, Univ. of Bonn, Germany

 

M4D.1 Discriminative Feature and Model Design for Automatic Speech Recognition 75

Mazin Rahim, Yoshua Bengio, Yann LeCun

AT&T Labs-Research, USA

 

M4D.2 Large Vocabulary Speech Recognition with Context Dependent MMI-Connectionist/HMM Systems Using the WSJ Database 79

Jörg Rottland, Christoph Neukirchen, Daniel Willett, Gerhard Rigoll

Gerhard-Mercator-Univ. Duisburg, Germany

 

M4D.3 Automatic Selection of Segmental Acoustic Parameters by Means of Neural-Fuzzy Networks for Reordering in N-best HMM Hypotheses 83

Thierry Moudenc, Guy Mercier

France Telecom, France

 

M4D.4 Comparison Results for Segmental Training Algorithms for Mixture Density HMMS 87

Mikko Kurimo

Helsinki Univ. of Technology, Finland

 

M4D.5 A Connectionist Approach to Machine Translation 91

Asuncion Castano, *Francisco Casacuberta

Universitat Jaume I, Spain

*Univ. Politecnica de Valencia, Spain

 

M4D.6 Continuous Speech Recognition Using a Context Sensitive ANN and HMM2s 95

Nicolas Pican, Jean-Francois Mari, Dominique Fohr

CRIN-CNRS & INRIA Lorraine, France

 

SESSION: MAA

Training Techniques, Efficient Decoding in ASR

Chair: Jerome Bellegarda, Apple Computer, USA

 

MAA.1 Acoustic Modeling Based on the MDL Principle for Speech Recognition 99

Koichi Shinoda, Takao Watanabe

NEC Corporation, Japan

 

MAA.2 Discriminative Utterance Verification Using Multiple Confidence Measures 103

Piyush Modi, Mazin Rahim

AT&T, USA

 

MAA.3 Subspace Distribution Clustering for Continuous Observation Density Hidden Markov Models 107

Enrico Bocchieri, Brian Mak

AT&T Labs-Research, USA

 

MAA.4 A Comparative Study of Methods for Phonetic Decision-Tree State Clustering 111

H.J. Nock, M.J.F. Gales, Steve Young

Cambridge Univ., UK

 

MAA.5 Comparing Gaussian and Polynomial Classification in SCHMM-Based Recognition Systems 115

Alfred Kaltenmeier, Jurgen Franke

Daimler Benz AG, Germany

 

MAA.6 Maximum Likelihood Successive State Splitting Algorithm for Tied-Mixture HMNET 119

Alexandre Girardi, Harald Singer, Kiyohiro Shikano, Satoshi Nakamura

Nara Institute of Science and Technology, Japan

 

MAA.7 String-Level MCE for Continuous Phoneme Recognition 123

Erik McDermott, Shigeru Katagiri

ATR Interpreting Telecommunications Res. Labs., Japan

 

MAA.8 HMM State Clustering Across Allophone Class Boundaries 127

Ze'ev Rivlin, Ananth Sankar, Harry Bratt

SRI International, USA

 

MAA.9 Weighted Determinization and Minimization for Large Vocabulary Speech Recognition 131

Mehryar Mohri, Michael Riley

AT&T Labs-Research, USA

 

 

MAA.10 Parallel Speech Recognition 135

Steven Phillips, Anne Rogers

AT&T Labs-Research, USA

 

MAA.11 Fast Likelihood Computation Methods for Continuous Mixture Densities in Large Vocabulary Speech Recognition 139

Stefan Ortmanns, Thorsten Firzlaff, Hermann Ney

RWTH Aachen Univ. of Technology, Germany

 

MAA.12 A Static Lexicon Network Representation for Cross-Word Context Dependent Phones 143

Kris Demuynck, Jacques Duchateau, Dirk Van Compernolle

KU Leuven-ESAT, Belgium

 

MAA.13 Decision-Tree Based Quantization of the Feature Space of a Speech Recognizer 147

Mukund Padmanabhan, L.R. Bahl, D. Nahamoo, Peter de Souza

IBM, USA

 

MAA.14 Sub-Vector Clustering to Improve the Memory and Speed Performance of the Acoustic Likelihood Computation 151

Mosur Ravishankar, *R. Bisiani, E. Thayer

Carnegie Mellon Univ., USA

*Univ. of Milan, Italy

 

MAA.15 The Incorporation of Path Merging in a Dynamic Network Recogniser 155

Simon Hovell

BT Laboratories, UK

 

MAA.16 Improvement on Connected Digits Recognition Using Duration Constraints in the Asynchronous Decoding Scheme 159

Miroslav Novak

IBM, USA

 

MAA.17 Explicit Word Error Minimization in N-Best List Rescoring 163

Andreas Stolcke, Yochai Konig, Mitchel Weintraub

SRI International, USA

 

MAA.18 Efficient 2-Pass N-Best Decoder 167

Long Nguyen, Richard Schwartz

BBN Systems and Technologies, USA

 

MAA.19 A Memory Management Method for a Large Word Network 171

Tomohiro Iwasaki, Yoshiharu Abe

Mitsubishi Electric Corporation, Japan

 

SESSION: MAB

Prosody

Chair: Nick Campbell, ATR, Japan

 

MAB.1 Persistence of Prosodic Features Between Dialectal and Standard Italian Utterances in Six Sub-Varieties of a Region of Southern Italy (Salento): First Assessments of the Results of a Recognition Test and an Instrumental Analysis 175

Antonio Romano

Univ. Stendhal, France

 

 

MAB.2 Improving the Phonetic Annotation by Means of Prosodic Phrasing 179

Halewijn Vereecken, *Annemie Vorstermans, Jean-Pierre Martens, *Bert Van Coile

Univ. of Gent, Belgium

*L&H, Belgium

 

MAB.3 A Descriptive Study of Prosodic Phenomena in MPUR (West Papuan Phylum) 183

Cecilia Ode

Leiden Univ., The Netherlands

 

MAB.4 Automated Quantitative Analysis of F0 Contours of Utterances From a German ToBI-Labeled Speech Database 187

Hansjörg Mixdorff, *Hiroya Fujisaki

TU Dresden, Germany

*Science Univ. of Tokyo, Japan

 

MAB.5 Identification and Automatic Generation of Prosodic Contours for Text-to-Speech Synthesis System in French 191

Stephanie de Tournemire

France Telecom, France

 

MAB.6 Quantitative Analysis and Formulation of Tone Concatenation in Chinese F0 Contours 195

Jin-Fu Ni, Ren-Hua Wang, *Keikichi Hirose

Univ. of Science and Technology of China, China

*Univ. of Tokyo, Japan

 

MAB.7 An Environment for the Labelling and Testing of Melodic Aspects of Speech 199

Christel Brindopke, Arno Pahde, Franz Kummert, Gerhard Sagerer

Univ. of Bielefeld, Germany

 

MAB.8 PROPAUSE: A Syntactico-Prosodic System Designed to Assign Pauses 203

David Casacuberta, Lourdes Aguilar, Rafael Marin

Universitat Autonoma de Barcelona, Spain

 

MAB.9 Integrated Dialog Act Segmentation and Classification Using Prosodic Features and Language Models 207

Volker Warnke, *Ralf Kompe, Heinrich Niemann, Elmar Noeth

Univ. of Erlangen-Nuremberg, Germany

*Sony International (Europe), Germany

 

MAB.10 Evaluation of Prosodic Characteristics in Retold Stories in Dutch by Means of Semantic Scales 211

Monique E. van Donzel, Florien J. Koopmans-van Beinum

Univ. of Amsterdam, The Netherlands

 

MAB.11 Text-to-Intonation in Spontaneous Swedish 215

Gosta Bruce, Marcus Filipsson, Johan Frid, *Björn Granström, *Kjell Gustafson, Merle Horne, David House

Lund Univ., Sweden

*KTH, Sweden

 

MAB.12 Synthesising Attitudes with Global Rhythmic and Intonation Contours 219

Yann Morlec, Gerard Bailly, Veronique Auberge

Institut de la Communication Parlée, France

 

 

MAB.13 Prosody-Particle Pairs as Discourse Control Signs 223

Dafydd Gibbon, Claudia Sassen

Univ. of Bielefeld, Germany

 

MAB.14 Focus Detection with Additional Information of Phrase Boundaries and Sentence Mode 227

Anja Elsner

Univ. of Bonn, Germany

 

MAB.15 The Role of Prosody in Infants' Native-Language Discrimination Abilities: The Case of Two Phonologically Close Languages 231

Laura Bosch, Nuria Sebastian-Galles

Universitat de Barcelona, Spain

 

MAB.16 Prosodic Cycles and Interpersonal Synchrony in American English and Swedish 235

Eugene H. Buder, *Anders Eriksson

Univ. of Memphis, USA

*Umeå Univ., Sweden

 

MAB.17 Relating Prosody to Syntax: Boundary Signalling in Swedish 239

Eva Strangert

Umeå Univ., Sweden

 

MAB.18 On Representation of Fundamental Frequency of Speech for Prosody Analysis Using Reliability Function 243

Mitsuru Nakai, Hiroshi Shimodaira

Japan Advanced Institute of Science and Technology, Japan

 

MAB.19 Efficient Method of Establishing Words Tone Dictionary for Korean TTS System 247

Seong-hwan Kim, Jin-young Kim

Chonnam National Univ., South Korea

 

MAB.20 Perception of Questions and Statements in Neapolitan Italian 251

Mariapaola D’Imperio, *David House

Ohio State University, USA

*Lund Univ., Sweden

 

SESSION: T1A

Keyword and Topic Spotting

Chair: Joseph Mariani, LIMSI-CNRS, France

 

T1A.1 Key-Phrase Spotting Using An Integrated Language Model of N-Grams and Finite-State Grammar 255

Qiguang Lin, Dave Lubensky, Michael Picheny, P. Srinivasa Rao

IBM, USA

 

T1A.2 Efficient Methods for Detecting Keywords in Continuous Speech 259

Jochen Junkawitsch, *Günther Ruske, Harald Höge

Siemens AG, Germany

*Munich Univ. of Technology, Germany

 

T1A.3 Providing Sublexical Constraints for Word Spotting Within the ANGIE Framework 263

Raymond Lau, Stephanie Seneff

MIT Laboratory for Computer Science, USA

 

 

T1A.4 Usefulness of Phonetic Parameters in a Rejection Procedure of an HMM Based Speech Recognition System 267

Katarina Bartkova, Denis Jouvet

France Telecom, France

 

T1A.5 Keyword Spotting Using F0 Contour Matching 271

Yoichi Yamashita, *Riichiro Mizoguchi

Ritsumeikan Univ., Japan

*Osaka Univ., Japan

 

T1A.6 A Frame and Segment Based Approach for Topic Spotting 275

Elmar Noeth, Stefan Harbeck, Heinrich Niemann, Volker Warnke

Univ. of Erlangen-Nuremberg, Germany

 

SESSION: T1B

Robustness in Recognition and Signal Processing I

Chair: Helmut Mangold, Daimler Benz, Germany

 

T1B.1 Cyclic Autocorrelation-Based Linear Prediction Analysis of Speech 279

K.K. Paliwal, Yoshinori Sagisaka

ATR Interpreting Telecommunications Res. Labs., Japan

 

T1B.2 Novel Filler Acoustic Models for Connected Digit Recognition 283

Ilija Zeljkovic, Shrikanth Narayanan

AT&T Labs-Research, USA

 

T1B.3 A Non-Iterative Model-Adaptive E-CMN/PMC Approach for Speech Recognition in Car Environments 287

Makoto Shozakai, Satoshi Nakamura, Kiyohiro Shikano

Nara Institute of Science and Technology, Japan

 

T1B.4 Discriminative Feature Extraction for Speech Recognition in Noise 291

Angel de la Torre, Antonio M. Peinado, Antonio J. Rubio, Pedro Garcia

Universidad de Granada, Spain

 

T1B.5 Noise Robust Recognition Using Feature Selective Modeling 295

Michael K. Brendborg, Borge Lindberg

Aalborg Univ., Denmark

 

T1B.6 Mixture Input Transformations for Adaptation of Hybrid Connectionist Speech Recognizers 299

Victor Abrash

SRI International, USA

 

SESSION: T1C

Modelling of Prosody

Chair: Hiroya Fujisaki, Science Univ. of Tokyo, Japan

 

T1C.1 Metrical Representation of Demarcation and Constituency in Long Noun Phrases 303

Christos Malliopoulos, *George Mikros

National Technical Univ. of Athens, Greece

*ILSP, Greece

 

 

T1C.2 A System of Stylized Intonation Contours for German 307

Hannes Pirker, *Kai Alter, Erhard Rank, John Matiasek, Harald Trost, †Gernot Kubin

Austrian Research Institute for Artificial Intelligence (OFAI), Austria

*Max-Planck-Institute of Cognitive Neuroscience, Germany

†Vienna Univ. of Technology, Austria

 

T1C.3 A Method of Representing Fundamental Frequency Contours of Japanese Using Statistical Models of Moraic Transition 311

Keikichi Hirose, Kouji Iwano

Univ. of Tokyo, Japan

 

T1C.4 Modeling Arbitrarily Long Sentence-Spanning F0 Contours by Parametric Concatenation of Word-Spanning Patterns 315

Stavroula-Evita Fotinea, *Michael Vlahakis, *George Carayannis

National Technical Univ. of Athens, Greece

*ILSP, Greece

 

T1C.5 Strong Interaction Between Factors Influencing Consonant Duration 319

R. J. J. H. van Son, *Jan P.H. van Santen

Univ. of Amsterdam, The Netherlands

*Bell Labs-Lucent Technologies, USA

 

T1C.6 Speech Timing in Slovenian TTS 323

Jerneja Gros, Nikola Pavesic, France Mihelic

Univ. of Ljubljana, Slovenia

 

SESSION: T1D

Microphone Arrays for Speech Enhancement

Chair: Alan Bradley, RMIT, Australia

 

T1D.1 Small Microphone Arrays with Optimized Directivity for Speech Enhancement 327

Matthias Dorbecker

Aachen Univ. of Technology, Germany

 

T1D.2 Microphone Array Design Measures for Hands-Free Speech Recognition 331

Masaaki Inoue, Satoshi Nakamura, Takeshi Yamada, Kiyohiro Shikano

Nara Institute of Science and Technology, Japan

 

T1D.3 Noise Reduction by Paired Microphones 335

Masato Akagi, Mitsunori Mizumachi

Japan Advanced Institute of Science and Technology, Japan

 

T1D.4 A Microphone Array for Speech Enhancement Using Multiresolution Wavelet Transform 339

Djamila Mahmoudi

LTS-DE, EPFL, Switzerland

 

T1D.5 A Two-Channel Adaptive Microphone Array with Target Tracking 343

Yoshifumi Nagata, Hiroyuki Tsuboi

Toshiba Corporation, Japan

 

 

T1D.6 Use of Different Microphone Array Configurations for Hands-Free Speech Recognition in Noisy and Reverberant Environment 347

Diego Giuliani, Marco Matassoni, Maurizio Omologo, Piergiorgio Svaizer

Istituto per la Ricerca Scientifica e Tecnologica (IRST), Italy

 

SESSION: T2A

Multilingual Recognition

Chair: Richard Lippmann, MIT Lincoln Lab., USA

 

T2A.1 YINHE: A Mandarin Chinese Version of the Galaxy System 351

Chao Wang, James R. Glass, Helen Meng, Joe Polifroni, Stephanie Seneff, Victor Zue

MIT Laboratory for Computer Science, USA

 

T2A.2 Multilingual Speech Recognition for Flexible Vocabularies 355

Patrizia Bonaventura, *Filippo Gallocchio, †Giorgio Micca

Centro Studi e Laboratori Telecomunicazioni (CSELT), Consultant, Italy

†Centro Studi e Laboratori Telecomunicazioni (CSELT), Turin, Italy

*Univ. di Padova, Italy

 

T2A.3 A Study of Multilingual Speech Recognition 359

Fuliang Weng, Harry Bratt, Leonardo Neumeyer, Andreas Stolcke

SRI International, USA

 

T2A.4 Multilingual Speech Recognition: The 1996 Byblos Callhome System 363

Jayadev Billa, Kristine Ma, John W. McDonough, George Zavaliagkos, David R. Miller, Kenneth N. Ross, Amro El-Jaroudi

BBN Systems and Technologies, USA

 

T2A.5 Japanese LVCSR on the Spontaneous Scheduling Task with JANUS-3 367

Tanja Schultz, Detlef Koll, Alex Waibel

Univ. Karlsruhe, Germany

 

T2A.6 Fast Bootstrapping of LVCSR Systems with Multilingual Phoneme Sets 371

Tanja Schultz, Alex Waibel

Univ. Karlsruhe, Germany

 

SESSION: T2B

Language Specific Speech Analysis

Chair: John J. Ohala, Univ. of California, USA

 

T2B.1 Factors of Variation in the Production of the German Dorsal Fricative 375

Bernd Pompino-Marschall, Christine Mooshammer

Center for General Linguistics, Berlin, Germany

 

T2B.2 EPG and Aerodynamic Evidence for the Coproduction and Coarticulation of Clicks in ISIZULU 379

Kimberly Thomas

Univ. of California, USA

 

T2B.3 Formant Trajectory Dynamics in Swabian Diphthongs 383

Anja Geumann

Muenchen Univ., Germany

 

T2B.4 The Gestural Organization of Vowels and Consonants: A Cinefluorographic Study of Articulator Gestures in Greenlandic 387

Sidney A.J. Wood

Univ. of Lund, Sweden

 

T2B.5 The Perception of Coronals in Western Arrernte 389

Victoria B. Anderson

Univ. of California, USA

 

T2B.6 Acoustic Modelling of American English /r/ 393

Carol Y. Espy-Wilson, *Shrikanth Narayanan, Suzanne E. Boyce, †Abeer Alwan

Boston Univ., USA

*AT&T Labs-Research, USA

†Univ. of California, USA

 

SESSION: T2C

Feature Estimation I

Chair: Paul Dalsgaard, Aalborg University, Denmark

 

T2C.1 Acoustic Parameters Optimised for Recognition of Phonetic Features 397

Anya Varnich Hansen

Aalborg Univ., Denmark

 

T2C.2 Heterogeneous Acoustic Measurements for Phonetic Classification 401

Andrew K. Halberstadt, James R. Glass

MIT, USA

 

T2C.3 Cepstral-Time Matrices and LDA for Improved Connected Digit and Sub-Word Recognition Accuracy 405

Ben Milner

BT Laboratories, UK

 

T2C.4 Data-Driven Design of Rasta-Like Filters 409

Sarel van Vuuren, *Hynek Hermansky

Oregon Graduate Institute of Science and Technology, USA

*International Computer Science Institute, USA

 

T2C.5 Evaluating Feature Set Performance Using the F-Ratio and J-Measures 413

Simon Nicholson, *Ben Milner, Stephen Cox

UEA, UK

*BT Laboratories, UK

 

T2C.6 Robust Speech Parameters Located in the Frequency Domain 417

Javier Hernando, Climent Nadeu

Universitat Politecnica de Catalunya, Spain

 

SESSION: T2D

Speech Coding I

Chair: Isabel Trancoso, INESC, Portugal

 

T2D.1 A Simple and Efficient Algorithm for the Compression of MBROLA Segment Databases 421

Olivier van der Vrecken, Nicolas Pierret, Thierry Dutoit, Vincent Pagel, Fabrice Malfrere

TCTS Lab, Belgium

 

T2D.2 A Segmental Formant Vocoder Based On Linearly Varying Mixtures of Gaussians 425

Parham Zolfaghari, Tony A. Robinson

Cambridge Univ., UK

 

T2D.3 Voice Mimic System Using Articulatory Codebook for Estimation of Vocal Tract Shape 429

Samir Chennoukh, Daniel Sinder, Gael Richard, James Flanagan

CAIP Center, Rutgers Univ., USA

 

T2D.4 Adaptive Transform Coding for Linear Predictive Residual 433

Damith J. Mudugamuwa, Alan B. Bradley

RMIT, Australia

 

T2D.5 Performance Evaluation of Objective Quality Measures for Coded Speech 437

Akira Takahashi, Nobuhiko Kitawaki, *Paolino Usai, †David Atkinson

NTT, Human Interface Labs, Japan

*Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy

†NTIA, USA

 

T2D.6 Between Recognition and Synthesis - 300 Bits/Second Speech Coding 441

Mohamed Ismail, Keith Ponting

Speech Research Unit, UK

 

SESSION: TMA

Feature Estimation II, Pitch and Prosody

Chair: Egidio Giachin, Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy

 

TMA.1 A Modified Zero-Crossing Method for Pitch Detection in Presence of Interfering Sources 445

Francois Gaillard, Frederic Berthommier, *Gang Feng, *Jean-Luc Schwartz

ICP/INPG, France

*Institut de la Communication Parlée, France

 

TMA.2 Using Simulated Annealing Expectation Maximization Algorithm for Hidden Markov Model Parameters Estimation 449

Jacques Simonin, Chafic Mokbel

France Telecom, France

 

TMA.3 Covariation of Subglottal Pressure, F0 and Glottal Parameters 453

Gunnar Fant, *Stellan Hertegard, Anita Kruckenberg, Johan Liljencrants

KTH, Sweden

*Karolinska Hospital and Huddinge Univ. Hospital, Sweden

 

TMA.4 The Fractal Behaviour of Unvoiced Plosives: A Means for Classification 457

Anastasios Delopoulos, Maria Rangoussi

NTUA, Greece

 

TMA.5 A Method for Analysis of the Local Speech Rate Using an Inventory of Reference Units 461

Sumio Ohno, Hiroya Fujisaki, Hideyuki Taguchi

Science Univ. of Tokyo, Japan

 

TMA.6 Analysis and Modeling of Fundamental Frequency Contours of Greek Utterances 465

Hiroya Fujisaki, Sumio Ohno, Takashi Yagi

Science Univ. of Tokyo, Japan

 

 

TMA.7 Characteristics of Slow, Average and Fast Speech and their Effects in Large Vocabulary Continuous Speech Recognition 469

Fernando Martinez, Daniel Tapias, Jorge Alvarez, Paloma Leon

Telefonica I+D, Spain

 

TMA.8 Analysis of Children's Speech: Duration, Pitch, and Formants 473

Sungbok Lee, Alexandros Potamianos, Shrikanth Narayanan

AT&T Labs-Research, USA

 

TMA.9 A Method of Measuring Formant Frequencies at High Fundamental Frequencies 477

Hartmut Traunmuller, *Anders Eriksson

Stockholm Univ., Sweden

*Univ. of Umeå, Sweden

 

TMA.10 Analysis of Speaking Rate Variation in Stress-Timed Languages 481

Tom Brondsted, Jens Printz Madsen

Aalborg Univ., Denmark

 

TMA.11 Automatic Identification of the Phoneme Boundaries Using a Mixed Parameter Model 485

Paul Micallef, *Ted Chilton

Univ. of Malta, Malta

*Univ. of Surrey, UK

 

TMA.12 Pitch Detection Reliability Assessment for Forensic Applications 489

Sergey Koval, Veronika Bekasova, Michael Khitrov, Andrey Raev

St Petersburg State Univ., Russia

 

TMA.13 Efficient Estimation of Perceptual Features for Speech Recognition 493

Zhihong Hu, Etienne Barnard

Oregon Graduate Institute, USA

 

TMA.14 Towards Decomposing the Sources of Variability in Speech 497

Narendranath Malayath, Hynek Hermansky, Alexander Kain

Oregon Graduate Institute of Science and Technology, USA

 

 

TMA.15 Use of Vector-Valued Dynamic Weighting Coefficients for Speech Recognition: Maximum Likelihood Approach 501

Rathinavelu Chengalvarayan

Bell Labs-Lucent Technologies, USA

 

TMA.16 Automatic Segmentation: Data-Driven Units of Speech 505

S.W. Beet, L. Baghai-Ravary

Aculab plc, UK

 

TMA.17 On Robust Time-Varying AR Speech Analysis Based on T-Distribution 509

Dejan Bajic

Institute of Applied Mathematics and Electronics, Yugoslavia

 

TMA.18 A Simple Phoneme Energy Model for the Greek Language and its Application to Speech Recognition 513

Dimitris Tambakas, Iliana Tzima, Nikos Fakotakis, George Kokkinakis

Univ. of Patras, Greece

 

TMA.19 A Macroscopic Analysis of an Emotional Speech Corpus 517

James E.H. Noad, Sandra P. Whiteside, Phil Green

Univ. of Sheffield, UK

 

TMA.20 Restoration of Pitch Pattern of Speech Based on a Pitch Generation Model 521

Hiroshi Shimodaira, Mitsuru Nakai, Akihiro Kumata

Japan Advanced Institute of Science and Technology, Japan

 

TMA.21 The Research of Correlation Between Pitch and Skin Galvanic Reaction at Change of Human Emotional State 525

A.V. Agranovski, O.Y. Berg, D.A. Lednov

Spetsvuzavtomatika Design Bureau, Russia

 

TMA.22 K-NN Versus Gaussian in HMM-Based Recognition System 529

Claude Montacie, Marie-Jose Caraty, Fabrice Lefèvre

Univ. Pierre et Marie Curie - CNRS, France

 

TMA.23 Spectral Methods for Voice Source Parameters Estimation 533

Boris Doval, Christophe D'Alessandro, Benoit Diard

LIMSI-CNRS, France

 
VOLUME 2
 
 

SESSION: TMB

Speech Synthesis Techniques

Chair: Rolf Carlson, KTH, Sweden

 

TMB.1 Optimising Unit Selection with Voice Source and Formants in the CHATR Speech Synthesis System 537

Wen Ding, Nick Campbell

ATR Interpreting Telecommunications Res. Labs., Japan

 

TMB.2 A New Framework to Provide High-Controllability Speech Signal and the Development of a Workbench for it 541

Masanobu Abe, Hideyuki Mizuno, Satoshi Takahashi, Shin'ya Nakajima

NTT, Japan

 

TMB.3 Shape-Invariant Prosodic Modification Algorithm for Concatenative Text-to-Speech Synthesis 545

Eduardo R. Banga, Carmen Garcia-Mateo, Xavier Fernandez-Salgado

Univ. of Vigo, Spain

 

TMB.4 An RNN-Based Spectral Information Generation for Mandarin Text-to-Speech 549

Shaw-Hwa Hwang, *Sin-Horng Chen, Saga Chang

Industrial Technology Research Institute (ITRI), Taiwan

*NCTU, Taiwan

 

TMB.5 Methods for Optimal Text Selection 553

Jan P.H. van Santen, *Adam L. Buchsbaum

Bell Labs-Lucent Technologies, USA

*AT&T Labs, USA

 

TMB.6 High Resolution Prosody Modification for Speech Synthesis 557

Francisco M. Gimenez de los Galanes, David Talkin

Entropic Research Lab, USA

 

TMB.7 Text-To-Speech Conversion with Neural Networks: A Recurrent TDNN Approach 561

Orhan Karaali, Gerald Corrigan, Ira Gerson, Noel Massey

Motorola, USA

 

TMB.8 Data Driven Formant Synthesis 565

Jesper Hogberg

KTH, Sweden

 

TMB.9 Speech Synthesis Using Non-Uniform Units in the Verbmobil Project 569

Simon King, *Thomas Portele, *Florian Hofer

Univ. of Edinburgh, UK

*Univ. of Bonn, Germany

 

TMB.10 On the Pronunciation Mode of Acronyms in Several European Languages 573

Isabel Trancoso, *M. Ceu Vianna

INESC, Portugal

*CLUL, Portugal

 

 

TMB.11 Evaluation of Speech Synthesis Systems for Dutch in Telecommunication Applications in GSM and PSTN Networks 577

Toni Rietveld, Joop Kerkhoff, *M.J.W.M. Emons, *E.J. Meijer, *Angelien A. Sanderman, *A.M.C. Sluijter

Univ. of Nijmegen, The Netherlands

*KPN Research, The Netherlands

 

TMB.12 Automatic Diphone Extraction for an Italian Text-to-Speech Synthesis System 581

Bianca Angelini, *Claudia Barolo, Daniele Falavigna, Maurizio Omologo, †Stefano Sandri

Istituto per la Ricerca Scientifica e Tecnologica, Italy

*Eikon Informatica, Italy

†Centro Studi e Laboratori Telecomunicazioni S.p.a, Italy

 

TMB.13 Simplification of TTS Architecture vs. Operational Quality 585

Eric Keller

Univ. of Lausanne, Switzerland

 

TMB.14 Felix - A TTS System with Improved Pre-Processing and Source Signal Generation 589

Georg Fries, Antje Wirth

Deutsche Telekom Berkom GmbH, Germany

 

TMB.15 Investigating the Limitations of Concatenative Synthesis 593

Mike Edgington

BT Laboratories, UK

 

TMB.16 Speech Coding and Synthesis Using Parametric Curves 597

Luis Miguel Teixeira de Jesus, Gavin C. Cawley

Univ. of East Anglia, UK

 

TMB.17 Automatically Clustering Similar Units for Unit Selection in Speech Synthesis 601

Alan W Black, Paul Taylor

Univ. of Edinburgh, UK

 

TMB.18 Improvements on a Trainable Letter-to-Sound Converter 605

Li Jiang, Hsiao-Wuen Hon, Xuedong Huang

Microsoft Corporation, USA

 

TMB.19 On a Cepstral Pitch Alteration Technique for Prosody Control in the Speech Synthesis System with High Quality 609

MyungJin Bae, KyuHong Kim, WonCheol Lee

Soongsil Univ., Korea

 

TMB.20 Diphone Concatenation Using a Harmonic Plus Noise Model of Speech 613

Yannis Stylianou, Thierry Dutoit, Juergen Schroeter

AT&T Labs-Research, USA

 

 

SESSION: TMC

Technology for S&L Acquisition, Speech Processing Tools

Chair: Petros Maragos, ILSP, Greece

 

TMC.1 The "Sketchboard": A Dynamic Interpretative Memory and its Use for Spoken Language Understanding 617

Gerard Sabah

LIMSI-CNRS, France

 

TMC.2 Speech Technology Integration and Research Platform: A System Study 621

Qiru Zhou, Chin-Hui Lee, Wu Chou, Andrew Pargellis

Bell Labs, USA

 

TMC.3 Speech Recognition on SPHERIC - An IC for Command & Control Applications 625

Dieter Geller, Markus Lieb, Wolfgang Budde, Oliver Muelhens, Manfred Zinke

Philips GmbH, Germany

 

TMC.4 Muse: A Scripting Language for the Development of Interactive Speech Analysis and Recognition Tools 629

Michael K. McCandless, James R. Glass

SLS/LCS/MIT, USA

 

TMC.5 Language Learning Based on Non-Native Speech Recognition 633

Silke Witt, Steve Young

Cambridge Univ., UK

 

TMC.6 Task Modelling by Sentence Templates 637

Ute Kilian, Klaus Bader

Daimler Benz AG, Germany

 

TMC.7 Extraction And Representation Rhythmic Components of Spontaneous Speech 641

Shigeyoshi Kitazawa, Hideya Ichikawa, Satoshi Kobayashi, *Yukihiro Nishinuma

Shizuoka Univ., Japan

*Univ. de Provence, France

 

TMC.8 Automatic Pronunciation Scoring of Specific Phone Segments for Language Instruction 645

Yoon Kim, Horacio Franco, Leonardo Neumeyer

SRI International, USA

CCRMA, Stanford University, USA

 

TMC.9 Automatic Detection of Mispronunciation for Language Instruction 649

Orith Ronen, Leonardo Neumeyer, Horacio Franco

SRI International, USA

 

TMC.10 Continuous Formant-Tracking Applied to Visual Representations of the Speech and Speech Recognition 653

Agustin Alvarez, Rafael Martinez, Victor Nieto, Victoria Rodellar, Pedro Gomez

Universidad Politecnica de Madrid, Spain

 

TMC.11 A Call System Using Speech Recognition to Train the Pronunciation of Japanese Long Vowels, the Mora Nasal and Mora Obstruents 657

Goh Kawai, Keikichi Hirose

Univ. of Tokyo, Japan

 

 

TMC.12 An Educational and Experimental Workbench for Visual Processing of Speech Data 661

Jan Nouza, Miroslav Holada, Daniel Hajek

TU of Liberec, Czech Republic

 

TMC.13 A 3 Channel Digital CVSD Bit-Rate Conversion System Using a General Purpose DSP 665

Yong-Soo Choi, *Hong-Goo Kang, †Sung-Youn Kim, ‡Young-Cheol Park, †Dae-Hee Youn

Yonsei Univ., Korea

*AT&T Labs-Research, USA

†ASSP Lab., Dept. of Elec. Engin., Korea

‡Samsung Biomedical Research Institute, Korea

 

TMC.14 SLIM Prosodic Module for Learning Activities in a Foreign Language 669

Rodolfo Delmonte, Mirela Petrea, Ciprian Bacalu

Universita Cá Garzoni-Moro Foscari, Italy

 

TMC.15 Barge-in Revised 673

Bernhard Kaspar, Karlheinz Schuhmacher, Stefan Feldes

Deutsche Telekom Berkom GmbH, Germany

TMC.16 WaveEdit, An Interactive Speech Processing Environment for Microsoft Windows Platform 677

Mohammad Akbar

Univ. Joseph Fourier, France

 

TMC.17 Subarashii: Japanese Interactive Spoken Language Education 681

Farzad Ehsani, Jared Bernstein, Amir Najmi, Ognjen Todic

Entropic Research Labs, USA

 

TMC.18 Deploying Speech Applications Over the Web 685

David Goddeau, *William Goldenthal, *Chris Weikat

Digital Equipment Corp., USA

*Cambridge Research Laboratory,

 

TMC.19 CSLUsh: An Extendible Research Environment 689

Johan Schalkwyk, Jacques de Viller, Sarel van Vuuren, Pieter Vermuelen

Oregon Graduate Institute of Science and Technology, USA

 

TMC.20 A Flexible Client-Server Model for Multilingual CTS/TTS Development 693

Tibor Ferenczi, Géza Németh, *Gábor Olaszy, Zoltan Gaspar

Technical Univ. of Budapest, Hungary

*Linguistics Institute of Hungarian Academy of Sciences, Hungary

 

TMC.21 Critically Sampled PR Filterbanks of Nonuniform Resolution Based on Block Recursive Famlet Transform 697

Unto K. Laine

Helsinki Univ. of Technology, Finland

 

TMC.22 Automatic Detection of Accent in English Words Spoken by Japanese Students 701

Nobuaki Minematsu, Nariaki Ohashi, Seiichi Nakagawa

Toyohashi Univ. of Technology, Japan

 

TMC.23 An English Conversation and Pronunciation CAI System Using Speech Recognition Technology 705

Yasuhiro Taniguchi, Allan A. Reyes, Hideyuki Suzuki, Seiichi Nakagawa

Toyohashi Univ. of Technology, Japan

 

 

TMC.24 Bringing Spoken Language Systems to the Classroom 709

Stephen Sutton, Ed Kaiser, A. Cronk, Ron Cole

Oregon Graduate Institute, USA

 

TMC.25 Automatic Assessment of Foreign Speakers' Pronunciation of Dutch 713

Catia Cucchiarini, Lou Boves

Univ. of Nijmegen, The Netherlands

 

TMC.26 Use of Low Power EM Radar Sensors for Speech Articulator Measurements 717

John F. Holzrichter, Greg C Burnett

Lawrence Livermore National Laboratory, USA

 

TMC.27 Real Time Measurements of the Vocal Tract Resonances During Speech 721

Julien Epps, Annette Dowd, John Smith, Joe Wolfe

Univ. of New South Wales, Australia

 

SESSION: TMD

Phonetics and Phonology

Chair: Thomas Portele, Univ. of Bonn, Germany

 

TMD.1 Linguistic Criteria for Building and Recording Units for Concatenative Speech Synthesis in Brazilian Portuguese 725

Eleonora Cavalcante Albano, Patricia Aparecida Aquino

UNICAMP, Brazil

 

TMD.2 "Four-and-Twenty, Twenty-Four". What's in a Number? 729

Knut Kvale, *Arne Kjell Foldvik

Telenor R&D, Norway

*Norwegian Univ. of Science and Technology, Norway

 

TMD.3 Vowel Nasalization in Brazilian Portuguese: An Articulatory Investigation 733

Antonio de Moraes Joao

Universidade Federal do Rio de Janeiro, Brazil

 

TMD.4 Rhythmic Organization Peculiarities of the Spoken Text 737

Elena Steriopolo

Kiev State Linguistic Univ., Ukraine

 

TMD.5 Obtaining Confidence Measures from Sentence Probabilities 739

Bernhard Rueber

Philips GmbH, Germany

 

TMD.6 Sentence Design for Speech Synthesis and Speech Recognition Database by Phonetic Rules 743

Yiqing Zu

Chinese Academy of Social Sciences, China

 

TMD.7 Identification of Regional Variants of High German from Digit Sequences in German Telephone Speech 747

Christoph Draxler, Susanne Burger

Ludwig-Maximilians-Univ. Muenchen, Germany

 

TMD.8 Aerodynamic Constraints on the Production of Palatalized Trills: The Case of the Slavic Trilled [r] 751

Darya Kavitskaya

UC Berkeley, USA

 

 

TMD.9 An Experimental Phonetic Study of the Interrelationship Between Prosodic Phrase and Syntactic Structure 755

Cheol-jae Seong, *Sanghun Kim

Chungnam National Univ., Korea

*Electronics and Telecommunications Research Institute, Korea

 

TMD.10 Individual Differences Between Vowel Systems of German Speakers 759

Sebastian J.G.G. Heid

Institute fuer Phonetik und Sprachliche Kommunikation, Germany

 

TMD.11 Tempo and Its Change in Spontaneous Speech 763

Anton Batliner, *Andreas Kießling, †Ralf Kompe, Heinrich Niemann, Elmar Noeth

Univ. of Erlangen-Nürnberg, Germany

*Ericsson Eurolab,

†Sony Stuttgart Technology Center,

 

TMD.12 A Corpus-Based Approach to Diphthong Analysis of Standard Slovenian 767

Bojan Petek, Rastislav Sustarsic

Univ. of Ljubljana, Slovenia

 

TMD.13 Catalan Vowel Duration 771

Lourdes Aguilar, Julia A. Gimenez, Maria Machuca, Rafael Marin, Montse Riera

Universitat Autonoma de Barcelona, Spain

 

TMD.14 The Intonation of Vocatives in Spoken Neapolitan Italian 775

Maria Rosaria Caputo

Napoli, Italy

 

TMD.15 A Comparative Acoustic Study of Spontaneous and Read Italian Speech 779

Caldognetto Emanuela Magno, Claudio Zmarich, Franco Ferrero

CNR, Italy

 

TMD.16 A Contribution to the Estimation of Naturalness in the Intonation of Italian Spontaneous Speech 783

Mario Refice, *Michelina Savino, *Martine Grice

DEE, Politecnico di Bari, Italy

*Univ. of the Saarland, Germany

 

TMD.17 Diphthongs and the Process of Monophthongization in Austrian German: A First Approach 787

Sylvia Moosmüller

Austrian Academy of Sciences, Austria

 

TMD.18 The Prosody of Broad and Narrow Focus in English: Two Experiments 791

Steve Hoskins

duPont Hospital for Children/Univ. of Delaware, USA

 

TMD.19 The Domain of Accentual Lengthening in Scottish English 795

Alice Turk, Laurence White

Edinburgh Univ., UK

 

 

TMD.20 Spontaneous Dialogue: Some Results About the F0 Predictions of a Pragmatic Model of Information Processing 799

Mariette Bessac, Geneviève Caelen-Haumont

Univ. Joseph Fourier, France

 

TMD.21 Phonetic Characteristics of Double Articulations in Some Mangbutu-Efe Languages 803

Didier Demolin, Bernard Teston

Univ. Libre de Bruxelles, Belgium,

Univ. de Lyon 2, France and

Univ. d'Aix en Provence, France

 

TMD.22 Intonation Modeling for the Southern Dialects of the Basque Language 807

Inmaculada Hernaez, Inaki Gaminde, Borja Etxebarria, Pilartxo Etxebarria

Univ. of the Basque Country, Spain

 

SESSION: T3A

Confidence Measures in ASR

Chair: Jose M. Pardo, UPM, Spain

 

T3A.1 A Low-Cost Phonetic Transcription Method 811

Pablo Fetter, Udo Haiber, Peter Regel-Brietzmann

Daimler Benz AG, Germany

 

T3A.2 Word and Acoustic Confidence Annotation for Large Vocabulary Speech Recognition 815

Lin Chase

Carnegie Mellon Univ., USA

 

T3A.3 A Senone Based Confidence Measure for Speech Recognition 819

Zachary Bergen, *Wayne Ward

Berdy Medical Systems, USA

*Carnegie Mellon Univ., USA

 

T3A.4 OOV Utterance Detection Based on the Recognizer Response Function 823

Erica Bernstein, Ward R. Evans

The MITRE Corporation, USA

 

T3A.5 Estimating Confidence Using Word Lattices 827

Thomas Kemp, Thomas Schaaf

Univ. of Karlsruhe, Germany

 

T3A.6 Improved Estimation, Evaluation and Applications of Confidence Measures for Speech Recognition 831

Man-Hung Siu, Herbert Gish, Fred Richardson

BBN Inc, USA

 

SESSION: T3B

Speaker and Language Identification

Chair: Sadaoki Furui, NTT, Japan

 

T3B.1 Improved Speaker Verification System With Limited Training Data On Telephone Quality Speech 835

Salleh Hussain, Fergus R. McInnes, Mervyn A. Jack

Univ. of Edinburgh, UK

 

T3B.2 Verbal Information Verification 839

Qi Li, Biing-Hwang Juang, Qiru Zhou, Chin-Hui Lee

Bell Labs, USA

 

 

T3B.3 A Segment-Based Speaker Verification System Using SUMMIT 843

Sridevi V. Sarma, Victor Zue

MIT, USA

 

T3B.4 Speaker Verification on the World Wide Web 847

Michael Sokolov

Digital Equipment Corp., USA

 

T3B.5 Text-Prompted Versus Sound-Prompted Passwords in Speaker Verification Systems 851

Johan Lindberg, Hakan Melin

KTH, Sweden

 

T3B.6 GMM Sample Statistic Log-Likelihoods for Text-Independent Speaker Recognition 855

Michael Schmidt, John Golden, Herbert Gish

BBN Systems and Technologies, USA

 

SESSION: T3C

Perception of Prosody

Chair: Joseph Olive, Bell Labs, USA

 

T3C.1 The Influence of Phrase Boundaries on Perceived Prominence in Two-Peak Intonation Contours 859

Toni Rietveld, Carlos Gussenhoven

Univ. of Nijmegen, The Netherlands

 

T3C.2 Testing the Meaning of Four Dutch Pitch Accent Types 863

Johanneke Caspers

Univ. of Leiden, The Netherlands

 

T3C.3 A Perceptual Study for Modelling Speaker-Dependent Intonation in TTS and Dialog Systems 867

Joachim J. Mersdorf, Thomas Domhover

Ruhr Univ., Germany

 

T3C.4 Can we Perceive Attitudes Before the End of Sentences? The Gating Paradigm for Prosodic Contours 871

Veronique Auberge, Tuulikki Grepillat, A. Rilliard

Univ. Stendhal, France

 

T3C.5 To What Extent is Perceived Focus Determined by F0-Cues? 875

Mattias Heldner, Eva Strangert

Umeå Univ., Sweden

 

T3C.6 Temporal-Alignment Categories of Accent-Lending Rises and Falls 879

David House, *Dik Hermes, †Frederic Beaugendre

Lund Univ., Sweden

*IPO, The Netherlands

†Lernout & Hauspie Speech Products, The Netherlands

 

SESSION: T3D

Applications of Speech Technology I

Chair: Klaus Fellbaum, Univ. of Cottbus, Germany

 

T3D.1 WebGalaxy -- Integrating Spoken Language And Hypertext Navigation 883

Raymond Lau, Giovanni Flammia, Christine Pao, Victor Zue

MIT, USA

 

 

T3D.2 Pitch Estimation of Singing for Re-Synthesis and Musical Transcription 887

Michael Carey, Eluned S. Parris, *Graham D. Tattersall

Ensigma Ltd, UK

*Snape Signals Research, UK

 

T3D.3 Automated Lip Synchronisation for Human-Computer Interaction and Special Effect Animation 891

Christian Martyn Jones, Satnam Singh Dlay

Univ. of Newcastle, UK

 

T3D.4 Developing Web-Based Speech Applications 895

Charles T. Hemphill, Yeshwant Muthusamy

Texas Instruments, USA

 

T3D.5 Automatic Post-Synchronization of Speech Utterances 899

Werner Verhelst

Vrije Univ. of Brussel, Belgium

 

T3D.6 Automatic Generation of Hyperlinks Between Audio and Transcript 903

Jordi Robert-Ribes, *Rami G. Mukhtar

Advanced Computational Systems CRC, Australia

*CSIRO, Australia

 

SESSION: T4A

Spontaneous Speech Recognition

Chair: Roberto Billi, Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy

 

T4A.1 Transcription of Broadcast News 907

Jean-Luc Gauvain, Lori Lamel, Gilles Adda, Martine Adda-Decker

LIMSI, France

 

T4A.2 Can Continuous Speech Recognizers Handle Isolated Speech? 911

Fil Alleva, Xuedong Huang, Mei-Yuh Hwang, Li Jiang

Microsoft Corporation, USA

 

T4A.3 Toward Automatic Transcription of Japanese Broadcast News 915

Tatsuo Matsuoka, *Yuichi Taguchi, Katsutoshi Ohtsuki, †Sadaoki Furui, *Katsuhiko Shirai

NTT, Japan

*Waseda Univ., Japan

†Tokyo Institute of Technology, Japan

 

T4A.4 Automatic Detection of Semantic Boundaries 919

Mauro Cettolo, Anna Corazza

Istituto per la Ricerca Scientifica e Tecnologica (IRST), Italy

 

T4A.5 Connected Digit Recognition in Spontaneous Speech 923

Etienne Bauche, Bojana Gajic, Yasuhiro Minami, Tatsuo Matsuoka, *Sadaoki Furui

NTT, Japan

*Tokyo Institute of Technology, Japan

 

T4A.6 Advances in Transcription of Broadcast News 927

Francis Kubala, Hubert Jin, *Spyros Matsoukas, Long Nguyen, Richard Schwartz, John Makhoul

BBN Systems and Technologies, USA

*Northeastern Univ., USA

 

 

SESSION: T4B

Language Specific Segmental Features

Chair: Gunnar Fant, KTH, Sweden

 

T4B.1 The Domain of Final Lengthening in Production and Perception in Dutch 931

Tina Cambier-Langeveld, Marina Nespor, *Vincent J. van Heuven

Univ. of Amsterdam/HIL, The Netherlands

*Leiden Univ./HIL, The Netherlands

 

T4B.2 Voicing Assimilation as a Cue for Cluster Identification 935

Christine Meunier

Univ. de Genève, Switzerland

 

T4B.3 On the Perceptual Relevance of Degemination in Dutch 939

Saskia M.M. te Riele, Manon Loef, van O. Herwijen

Utrecht Univ., The Netherlands

 

T4B.4 Does Deletion of French SCHWA Lead to Neutralization of Lexical Distinctions? 943

Cecile Fougeron, *Donca Steriade

UCLA & Paris III, USA & France

*UCLA, USA

 

T4B.5 An Approach of the Catalan Palatals Discrimination Based on Durational Patterns of Spectral Evolution 947

Marielle Bruyninckx, Bernard Harmegnies

Univ. de Mons-Hainaut, Belgium

 

T4B.6 Syllable and Segment Duration at Different Speaking Rates in the Slovenian Language 951

Jerneja Gros, Nikola Pavesic, France Mihelic

Univ. of Ljubljana, Slovenia

 

SESSION: T4C

Speaker Recognition I

Chair: George Doddington, SRI, USA

 

T4C.1 Hybrid Networks Based on RBFN and GMM for Speaker Recognition 955

Wei-Ying Li, Douglas O'Shaughnessy

Univ. du Quebec, Canada

 

T4C.2 A Discriminative Training Algorithm for Gaussian Mixture Speaker Models 959

Jialong He, Li Liu, Günther Palm

Univ. of Ulm, Germany

 

T4C.3 Comparison of Background Normalization Methods for Text-Independent Speaker Verification 963

Douglas A. Reynolds

MIT Lincoln Laboratory, USA

 

T4C.4 Speaker Verification with Limited Enrollment Data 967

Owen Kimball, Michael Schmidt, Herbert Gish, Jason Waterman

BBN Systems and Technologies, USA

 

 

T4C.5 Speaker Verification in the Telephone Network: Research Activities in the Cave Project 971

Frederic Bimbot, *Hans-Peter Hutter, †Cedric Jaboulet, ‡Johan W. Koolwaaij, ¤Johan Lindberg, Jean Benoit Pierrot

ENST/CNRS, France

*Ubilab-UBS, Switzerland

†IDIAP, Switzerland

‡KUN, The Netherlands

¤KTH, Sweden

 

T4C.6 Speaker Verification with GSM Coded Telephone Speech 975

Mark Kuitert, Lou Boves

Univ. of Nijmegen, The Netherlands

 

SESSION: T4D

Speech Synthesis: Linguistic Analysis

Chair: Björn Granström, KTH, Sweden

 

T4D.1 Parsers, Prominence, and Pauses 979

Nick Campbell, Tony Hebert, Ezra Black

ATR Interpreting Telecommunications Res. Labs., Japan

 

T4D.2 Automatic Assignment of Part-of-Speech to Out-of-Vocabulary Words for Text-To-Speech Processing 983

Frederic Bechet, Marc El-Beze

Univ. of Avignon, France

 

T4D.3 Text-to-Prosody Parsing in an Italian Speech Synthesizer. Recent Improvements 987

Barbara Gili Fivela, *Silvia Quazza

Scuola Normale Superiore, Italy

*Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy

 

T4D.4 Tagging Syllables 991

Brigitte Krenn

Univ. of the Saarland, Germany

 

T4D.5 Assigning Phrase Breaks from Part-of-Speech Sequences 995

Alan W. Black, Paul Taylor

Univ. of Edinburgh, UK

T4D.6 Prediction of Word Prominence 999

Christina Widera, Thomas Portele, Maria Wolters

Univ. of Bonn, Germany

 

SESSION: TAA

Speech Analysis & Modelling

Chair: Pierre Badin, ICP, INPG, France

 

TAA.1 Acoustic and Perceptual Properties of Phonemes in Continuous Speech as a Function of Speaking Rate 1003

Hisao Kuwabara

Tokyo Univ. of Science & Technology, Japan

 

TAA.2 New Results in Vowel Production: MRI, EPG, and Acoustic Data 1007

Shrikanth Narayanan, *Abeer Alwan, *Yong Song

AT&T Labs-Research, USA

*Univ. of California, USA

 

TAA.3 The Temporal Properties of Spoken Japanese are Similar to those of English 1011

Takayuki Arai, Steven Greenberg

International Computer Science Institute and

Univ. of California at Berkeley, USA

 

TAA.4 The Amplitudes of the Peaks in the Spectrum: Data from /a/ Context 1015

Anna Esposito

IIASS, Italy

 

TAA.5 Acoustical Characteristics of Speech and Voice in Speech Pathology 1019

Natalija Bolfan-Stosic, Mladen Hedjever

Univ. of Zagreb, Croatia

 

TAA.6 Pronunciation Modeling Applied to Automatic Segmentation of Spontaneous Speech 1023

Andreas Kipp, *Maria-Barbara Wesenick, *Florian Schiel

IPSK Univ. of Munich, Germany

*Ludwig-Maximilians-Univ. Muenchen, Germany

 

TAA.7 Dynamic and Static Improvements to Lexical Baseforms 1027

Simon Downey, Richard Wiseman

BT Laboratories, UK

 

TAA.8 Signal Driven Generation of Word Baseforms from Few Examples 1031

Andreas Hauenstein

pc-plus GmbH, Germany

 

TAA.9 Modeling the Acoustic Differences Between L1 and L2 Speech: The Short Vowels of Afrikaans and South-African English 1035

Elizabeth Botha, *Louis C.W. Pols

Univ. of Pretoria, South Africa

*Univ. of Amsterdam, The Netherlands

 

TAA.10 Laryngeal Movements and Speech Rate: An X-ray investigation 1039

Beatrice Vaxelaire, Rudolph Sock

Institut de Phonetique de Strasbourg, France

 

TAA.11 How Flexible is the Human Voice?-A Case Study of Mimicry 1043

Anders Eriksson, Par Wretling

Umeå Univ., Sweden

 

TAA.12 The Effect of Low-Pass Filtering on Estimated Voice Source Parameters 1047

Helmer Strik

Univ. of Nijmegen, The Netherlands

 

TAA.13 Vowel Development of /i/ and /u/ in 15-36 Month Old Children at Risk and Not at Risk to Stutter 1051

Susan M. Fosnot

UCLA, USA

 

TAA.14 Optopalatograph: Development of a Device for Measuring Tongue Movement in 3D 1055

Alan Wrench, Alan McIntosh, William Hardcastle

Queen Margaret College, UK

 

TAA.15 Speech Synthesis and Prosody Modification Using Segmentation and Modelling of the Excitation Signal 1059

Juana M. Gutierrez-Arriola, Francisco M. Gimenez de los Galanes, Mohammed H. Savoji, Jose M. Pardo

Universidad Politecnica de Madrid, Spain

 

 

TAA.16 How Can the Control of the Vocal Tract Limit the Speaker's Capability to Produce the Ultimate Perceptive Objectives of Speech? 1063

Christophe Savariaux, Louis-Jean Boe, Pascal Perrier

ICP, INPG, France

 

TAA.17 A Step Toward General Model for Symbolic Description of the Speech Signal 1067

Goran S. Jovanovic

Institute for Applied Mathematics and Electronics, Yugoslavia

 

 

TAA.18 Referring in Long Term Speech by Using Orientation Patterns Obtained from Vector Field of Spectrum Pattern 1071

Kiyoshi Furukawa, Masayuki Nakazawa, Takashi Endo, Oka Ryuichi

Real World Computing Partnership, Japan

 

 
VOLUME 3

 

 

SESSION: TAB

Robustness in Recognition and Signal Processing II

Chair: Alex Waibel, Carnegie Mellon Univ., USA

 

TAB.1 Adaptation of Time Differentiated Cepstrum for Noisy Speech Recognition 1075

Tai-Hwei Hwang, *Lee-Min Lee, Hsiao-Chuan Wang

National Tsing-Hua Univ., ROChina

*Mingchi Institute of Technology, ROChina

 

TAB.2 On The Importance of Various Modulation Frequencies for Speech Recognition 1079

†Noboru Kanedera, *Takayuki Arai, Hynek Hermansky, Misha Pavel

Oregon Graduate Institute of Science and Technology, USA

*International Computer Science Institute, USA

†Ishikawa National College of Technology,Japan

 

TAB.3 A Robust RNN-Based Pre-Classification for Noisy Mandarin Speech Recognition 1083

Wei-Tyng Hong, Sin-Horng Chen

National Chiao Tung Univ., ROChina

 

TAB.4 A Parallel Environment Model (PEM) for Speech Recognition and Adaptation 1087

Mazin Rahim

AT&T Labs, USA

 

TAB.5 Adaptive Model Combination for Robust Speech Recognition in Car Environments 1091

Volker Schless, Fritz Class

Daimler Benz AG, Germany

 

TAB.6 A Comparative Study of Speech Detection Methods 1095

Stefaan Gerven Van, Fei Xie

KULEUVEN-ESAT, Belgium

 

TAB.7 Voice Activity Detection Using Source Separation Techniques 1099

Nikos Doukas, Patrick Naylor, Tania Stathaki

Imperial College, UK

 

TAB.8 Applying Blind Signal Separation to the Recognition of Overlapped Speech 1103

Tomohiko Taniguchi, Shoji Kajita, Kazuya Takeda, Fumitada Itakura

Nagoya Univ., Japan

 

TAB.9 Multiresolution Channel Normalization for ASR in Reverberant Environments 1107

Carlos Avendano, Sangita Tibrewala, Hynek Hermansky

Oregon Graduate Institute of Science and Technology, USA

 

TAB.10 A Speech Pre-Processing Technique for End-Point Detection in Highly Non-Stationary Environments 1111

Rafael Martinez, Agustin Alvarez, Vilda Pedro Gomez, Mercedes Perez, Victor Nieto, Victoria Rodellar

Universidad Politecnica de Madrid, Spain

 

TAB.11 Application of Several Channel and Noise Compensation Techniques for Robust Speaker Recognition 1115

Laura Docio-Fernandez, Carmen Garcia-Mateo

Univ. of Vigo, Spain

 

 

TAB.12 Knowing the Wheat from the Weeds in Noisy Speech 1119

Hany Agaiby, Thomas J. Moir

Univ. of Paisley, UK

 

TAB.13 Model-Based Approach for Robust Speech Recognition in Noisy Environments with Multiple Noise Sources 1123

Do Yeong Kim, *Nam Soo Kim, Chong Kwan Un

KAIST, Korea

*SAIT, Korea

 

TAB.14 Normalization of Speaker Variability by Spectrum Warping for Robust Speech Recognition 1127

Y.C. Chu, Charlie Jie, Vincent Tung, Ben Lin, Richard Lee

Philips Taiwan, ROChina

 

TAB.15 LPC Poles Tracker for Music/Speech/Noise Segmentation and Music Cancellation 1131

Stephane H. Maes

IBM, USA

 

TAB.16 Comparative Evaluations of Several Front-Ends for Robust Speech Recognition 1135

Doh-Suk Kim, Jae-Hoon Jeong, Soo-Young Lee, Rhee M. Kil

Korean Advanced Institute of Science and Technology, Korea

 

TAB.17 Speaker Normalization Through Formant-Based Warping of the Frequency Scale 1139

Evandro B. Gouvea, Richard M. Stern

Carnegie Mellon Univ., USA

 

TAB.18 The Use of Cepstral Means in Conversational Speech Recognition 1143

Martin Westphal

Univ. Karlsruhe, Germany

 

TAB.19 Compensation for Environmental and Speaker Variability by Normalization of Pole Locations 1147

Juan M. Huerta, Richard M. Stern

Carnegie Mellon Univ., USA

 

TAB.20 Cellular Phone Speech Recognition: Noise Compensation vs. Robust Architectures 1151

Jean-Baptiste Puel, Regine André-Obrecht

IRIT - Université Paul Sabatier, France

 

TAB.21 Speech Recognition in Noise Using On-Line HMM Adaptation 1155

TungHui Chiang

Advanced Technology Center (ATC), Computer and Communication Labs (CCL), Industrial Technology Research Institute (ITRI), ROChina

 

 

SESSION: TAC

Acoustic Modelling II

Chair: Vassilios Digalakis, Technical Univ. of Crete, Greece

 

TAC.1 Incorporating Linguistic Knowledge and Automatic Baseform Generation in Acoustic Subword Unit Based Speech Recognition 1159

Trym Holter, Torbjörn Svendsen

The Norwegian Univ. of Science and Technology (NTNU), Norway

 

TAC.2 Modeling and Decoding of Crossword Context Dependent Phones in the Philips Large Vocabulary Continuous Speech Recognition System 1163

Peter Beyerlein, Meinhard Ullrich, Patricia Wilcox

Philips GmbH, Germany

 

TAC.3 Modelling Inter-Frame Dependence with Preceding and Succeeding Frames 1167

Philip Hanna, Ji Ming, Peter O'Boyle, F. Jack Smith

Queen's Univ. of Belfast, N. Ireland

 

TAC.4 Continuous Speech Recognition Using Syllables 1171

Rhys James Jones, *Simon Downey, John S. Mason

Univ. of Wales Swansea, UK

*BT Laboratories, UK

 

TAC.5 A New Approach to Generalized Mixture Tying for Continuous HMM-Based Speech Recognition 1175

Daniel Willett, Gerhard Rigoll

Gerhard-Mercator-Univ. Duisburg, Germany

 

TAC.6 State Tying for Context Dependent Phoneme Models 1179

Klaus Beulen, Elmar Bransch, Hermann Ney

RWTH Aachen Univ. of Technology, Germany

 

TAC.7 A Novel Node Splitting Criterion in Decision Tree Construction for Semi-Continuous HMMs 1183

Jacques Duchateau, Kris Demuynck, Dirk Van Compernolle

Katholieke Univ. Leuven-E.S.A.T, Belgium

 

TAC.8 Creating Unseen Triphones by Phone Concatenation in the Spectral, Cepstral and Formant Domains 1187

Mats Blomberg

KTH, Sweden

 

TAC.9 Creating Large Subword Units for Speech Recognition 1191

Thilo Pfau, Manfred Beham, *W. Reichl, Günther Ruske

Muenchen Univ. of Technology, Germany

*Bell Laboratories, USA

 

TAC.10 Segmental Modeling Using a Continuous Mixture of Non-Parametric Models 1195

Jacob Goldberger, David Burshtein, *Horacio Franco

Tel Aviv Univ., Israel

*SRI International, USA

 

TAC.11 Segmentation and Modeling in Segment-Based Recognition 1199

Jane W. Chang, James R. Glass

MIT, USA

 

 

TAC.12 Using Syllables in a Hybrid HMM-ANN Recognition System 1203

Alfred Hauenstein

Siemens AG, Germany

 

TAC.13 Noise Robust Segment-Based Word Recognition Using Vector Quantisation 1207

Ramalingam Hariharan, Juha Hakkinen, Kari Laurila, Janne Suontausta

Nokia Research Center, Finland

 

TAC.14 Viterbi Based Splitting of Phoneme HMM's 1211

Luis Javier Rodriguez, *Ines M. Torres

Universidad del Pais Vasco, Spain

*UPV/EHU, Spain

 

TAC.15 The Demiphone: An Efficient Subword Unit for Continuous Speech Recognition 1215

Jose B. Marino, A. Nogueiras, Antonio Bonafonte

Universitat Politecnica de Catalunya, Spain

 

TAC.16 Organizing Phone Models Based on Piecewise Linear Segment Lattices of Speech Samples 1219

Hiroaki Kojima, Kazuyo Tanaka

Electrotechnical Lab, Japan

 

TAC.17 Automatic Architecture Design by Likelihood-Based Context Clustering with Crossvalidation 1223

Ivica Rogina

Univ. Karlsruhe, Germany

 

TAC.18 Towards Articulatory Speech Recognition: Learning Smooth Maps to Recover Articulator Information 1227

Sam Roweis, *Abeer Alwan

California Institute of Technology, USA

*UCLA, USA

 

TAC.19 Selection of the Most Effective Set of Subword Units for an HMM-Based Speech Recognition System 1231

Anastasios Tsopanoglou, *Nikos Fakotakis

KNOWLEDGE SA, Greece

*Univ. of Patras, Greece

 

TAC.20 Multi-Band Continuous Speech Recognition 1235

Christophe Cerisara, Jean-Paul Haton, Jean-Francois Mari, Dominique Fohr

CRIN-CNRS & INRIA Lorraine, France

 

TAC.21 The Design of Acoustic Parameters for Speaker-Independent Speech Recognition 1239

Nabil N. Bitar, Carol Y. Espy-Wilson

Boston Univ., USA

 

SESSION: TAD

Speech Coding II

Chair: John Mourjopoulos, Univ. of Patras, Greece

 

TAD.1 High Quality Split-Band LPC Vocoder and its Fixed Point Real Time Implementation 1243

Stephane Villette, Milos Stefanovic, Ian Atkinson, Ahmet Kondoz

Univ. of Surrey, UK

 

 

TAD.2 Missing Packet Recovery Techniques for DM Coded Speech 1247

Wen-Whei Chang, *Hwai-Tsu Chang, *Wan-Yu Meng

National Chiao-Tung Univ., ROChina

*Industrial Technology Research Institute, ROChina

 

TAD.3 Spectral Sensitivity of LSP Parameters and Their Transformed Coefficients 1251

Vu Hai Le, Laszlo Lois

Technical Univ. of Budapest, Hungary

 

TAD.4 Reducing the Complexity of the LPC Vector Quantizer Using the K-D Tree Search Algorithm 1255

V. Ramasubramanian, K.K. Paliwal

ATR Interpreting Telecommunications Res. Labs., Japan

 

TAD.5 LPC Quantization Using Wavelet Based Temporal Decomposition of the LSF 1259

Aweke N. Lemma, *W. Bastiaan Kleijn, Ed F. Deprettere

Delft Univ. of Technology, The Netherlands

*KTH, Sweden

 

TAD.6 A Novel 1.7/2.4 kb/s DCT Based Prototype Interpolation Speech Coding System 1263

Costas S. Xydeas, Gokhan H. Ilk

Univ. of Manchester, UK

 

TAD.7 Improved Regular Pulse VSELP Coding of Speech at Low Bit-Rates 1267

Yong-Soo Choi, *Hong-Goo Kang, Sang-Wook Park, †Jae-Ha Yoo, Dae-Hee Youn

Yonsei University, Korea

*AT&T Labs, USA

†LG Electronic Inc., Korea

 

TAD.8 Joint Estimation of Pitch, Band Magnitudes, and V/UV Decisions for MBE Vocoder 1271

Yong Duk Cho, Hong Kook Kim, Moo Young Kim, Sang Ryong Kim

Samsung Advanced Institute of Technology, South Korea

 

TAD.9 A New Distance Measure in LPC Coding: Application for Real Time Situations 1275

Balazs Kovesi, Samir Saoudi, Jean Marc Boucher, *Gabor Horvath

ENST-Br, France

*Technological Univ. of Budapest, Hungary

 

TAD.10 Consideration of Processing Strategies for Very-Low-Rate Compression of Wideband Speech Signals with Known Text Transcription 1279

Peter Vepyek, Alan B. Bradley

RMIT, Australia

 

TAD.11 Zero-Redundancy Error Protection for CELP Speech Codecs 1283

Norbert Gortz

Univ. of Kiel, Germany

 

TAD.12 Low Bit Rate Speech Coding Using an Improved HSX Model 1287

Ridha Matmti, Milan Jelinek, Jean-Pierre Adoul

Univ. of Sherbrooke, Canada

 

TAD.13 Phonetic Vocoding with Speaker Adaptation 1291

Carlos M. Ribeiro, Isabel Trancoso

INESC, Portugal

 

 

TAD.14 Quantization of Spectral Sequences Using Variable Length Spectral Segments for Speech Coding at Very Low Bit Rate 1295

Geneviève Baudoin, *Jan Cernocky, †Gerard Chollet

ESIEE, France

*FEIVUT, France

†ENST, France

 

TAD.15 On Modeling Event Functions in Temporal Decomposition Based Speech Coding 1299

Shahrokh Ghaemmaghami, Mohamed Deriche, Boualem Boashash

Queensland Univ. of Technology, Australia

 

TAD.16 Phase Quantization by Pitch-Cycle Waveform Coding in Low Bit Rate Sinusoidal Coders 1303

Soledad Torres, *Javier F Casajus-Quiros

Universidad de Valladolid, Spain

*Universidad Politecnica de Madrid, Spain

 

TAD.17 A Perceptual Study of the Greek Vowel Space Using Synthetic Stimuli 1307

Antonis Botinis, *Marios Fourakis, †John W. Hawks

Athens Univ., Greece

*The Ohio State Univ., USA

†Kent State Univ., USA

 

TAD.18 Mixed Multi-Band Excitation Coder Using Frequency Domain Mixture Function (FDMF) for a Low-Bit Rate Speech Coding 1311

Woo-Jin Han, Sung-Joo Kim, Yung-Hwan Oh

KAIST, Korea

 

TAD.19 Robust GSM Speech Decoding Using the Channel Decoder's Soft Output 1315

Tim Fingscheidt, Olaf Scheufen

Aachen Univ. of Technology, Germany

 

TAD.20 A Low-Bit-Rate Speech Coder Using Adaptive Line Spectral Frequency Prediction 1319

Carl W. Seymour, Tony A. Robinson

Cambridge Univ., UK

 

SESSION: W1A

Dialogue Systems: Applications

Chair: Norman Fraser, Univ. of Surrey, UK

 

W1A.1 Experiments in Spoken Queries for Document Retrieval 1323

James Barnett, Steve Anderson, *John Broglio, Mona Singh, †R. Hudson, †S.W. Kuo

Dragon Systems, USA

*Univ. of Massachusetts, USA

†Intermetrics Inc., USA

 

W1A.2 Towards an Automated Directory Information System 1327

Frank Seide, *Andreas Kellner

Philips Research Laboratories Taipei, Taiwan

*Philips GmbH Aachen, Germany

 

W1A.3 A Strategy for Mixed-Initiative Dialogue Control 1331

Lars Bo Larsen

Aalborg Univ., Denmark

 

 

W1A.4 On the Design of Effective Speech-Based Interfaces for Desktop Applications 1335

Jim Hugunin, Victor Zue

MIT, USA

 

W1A.5 Dialogue Strategies Guiding Users to their Communicative Goals 1339

Matthias Denecke, Alex Waibel

Carnegie Mellon Univ., USA

 

W1A.6 A Speech Interface for Forms on WWW 1343

Sunil Issar

Carnegie Mellon Univ., USA

 

SESSION: W1B

Speech Production Modelling

Chair: Michael D. Riley, AT&T Labs, USA

 

W1B.1 Voice Conversion by Codebook Mapping of Line Spectral Frequencies and Excitation Spectrum 1347

Levent M Arslan, David Talkin

Entropic Research Laboratory, USA

 

W1B.2 Optimal State Dependent Spectral Representation for HMM Modeling: A New Theoretical Framework 1351

Chafic Mokbel, *Guillaume Gravier, *Gerard Chollet

France Telecom, France

*ENST, France

 

W1B.3 Speech Analysis and Systems Using an AM-FM Modulation Model 1355

Alexandros Potamianos, *Petros Maragos

AT&T Labs-Research, USA

*ILSP & Georgia Tech, Greece & USA

 

W1B.4 Synthesis of Fricative Consonants by Audiovisual-to-Articulatory Inversion 1359

Khaled Mawass, Pierre Badin, Gerard Bailly

ICP, INPG, France

 

W1B.5 New Transformations of Cepstral Parameters for Automatic Vocal Tract Length Normalization in Speech Recognition 1363

Tom Claes, *Ioannis Dologlou, Louis ten Bosch, Dirk Van Compernolle

Lernout & Hauspie Speech Products, Belgium

*K.U Leuven-E.S.A.T, Belgium

 

W1B.6 A Multiresolutionally Oriented Approach for Determination of Cepstral Features in Speech Recognition 1367

Simon Dobrisek, France Mihelic, Nikola Pavesic

Univ. of Ljubljana, Slovenia

 

SESSION: W1C

Speaker Recognition II

Chair: Aaron Rosenberg, AT&T Labs, USA

 

W1C.1 Speaker Identification with User-Selected Password Phrases 1371

Aaron E. Rosenberg, S. Parthasarathy

AT&T Labs, USA

 

W1C.2 Speaker Verification Based on Phonetic Decision Making 1375

Jesper Olsen

Aalborg Univ., Denmark

 

W1C.3 Analysis and Comparison of Score Normalisation Methods for Text-Dependent Speaker Verification 1379

A.M. Ariyaeeinia, P. Sivakumaran

Univ. of Hertfordshire, UK

 

W1C.4 Automatic Speaker Recognition on a Vocoder Link 1383

Frederic Jauquet, Patrick Verlinde, Claude Vloeberghs

Royal Military Academy, Belgium

 

W1C.5 Likelihood Ratio Adjustment for the Compensation of Model Mismatch in Speaker Verification 1387

Frederic Bimbot, *Dominique Genoud

ENST/CNRS, France

*IDIAP, Switzerland

 

W1C.6 A Lognormal Tied Mixture Model of Pitch for Prosody-Based Speaker Recognition 1391

Kemal M. Sonmez, Larry Heck, Mitchel Weintraub, Elizabeth Shriberg

SRI International, USA

 

SESSION: W1D

Speech Enhancement I

Chair: Hynek Hermansky, Oregon Graduate Inst. of Science and Tech., USA

 

W1D.1 Residual Noise Suppression Using Psychoacoustic Criteria 1395

Tim Haulick, Klaus Linhard, Peter Schrogmeier

Daimler Benz AG, Germany

 

W1D.2 Processing Linear Prediction Residual for Speech Enhancement 1399

*B. Yegnanarayana, Carlos Avendano, Hynek Hermansky, *P. Satyanarayana Murthy

Oregon Graduate Institute of Science and Technology, USA

*IIT Madras, India

 

W1D.3 Combined Acoustic Echo Control and Noise Reduction for Mobile Communications 1403

Stefan Gustafsson, Rainer Martin

Aachen Univ. of Technology, Germany

 

W1D.4 A Nonstationary Autoregressive HMM and its Application to Speech Enhancement 1407

Ki Yong Lee, Yeol Rheem

Changwon National Univ., Korea

 

W1D.5 Spectral Subtraction and Mean Normalization in the Context of Weighted Matching Algorithms 1411

Nestor Becerra Yoma, Fergus R. McInnes, Mervyn A. Jack

Univ. of Edinburgh, UK

 

W1D.6 Improving the Intelligibility of Noisy Speech Using an Audible Noise Suppression Technique 1415

Dionysios Tsoukalas, John Mourjopoulos, George Kokkinakis

Univ. of Patras, Greece

 

 

SESSION: W2A

Spoken Language Understanding

Chair: Ioannis Dologlou, ESAT-MI2, KU-Leuven, Belgium

 

W2A.1 Automatic Acquisition of Salient Grammar Fragments for Call-Type Classification 1419

Jerry H. Wright, Allen L. Gorin, Giuseppe Riccardi

AT&T Labs-Research, USA

 

W2A.2 Stochastically-Based Natural Language Understanding Across Tasks and Languages 1423

Wolfgang Minker

LIMSI, France

 

W2A.3 Transducer Composition for Context-Dependent Network Expansion 1427

Michael Riley, Fernando Pereira, Mehryar Mohri

AT&T Labs, USA

 

W2A.4 Giving Prosody a Meaning 1431

Christian Lieske, *Johan Bos, †Martin Emele, ‡Björn Gamback, *C.J. Rupp

Swiss Federal Institute of Technology Lausanne, Switzerland

*Univ. of Saarland, Germany

†Univ. of Stuttgart, Germany

‡Royal Institute of Technology, Sweden

 

W2A.5 Feature-Based Language Understanding 1435

Kishore A. Papineni, Salim Roukos, Todd R. Ward

IBM, USA

 

W2A.6 Speech Translation Based on Automatically Trainable Finite-State Models 1439

Juan Carlos Amengual, *Jose Miguel Benedi, †Klaus Beulen, *Francisco Casacuberta, Asuncion Castano, Antonio Castellanos, *Victor M. Jimenez, *David Llorens, Andres Marzal, †Hermann Ney, Federico Prat, *Enrique Vidal, Juan Miguel Vilar

Universitat Jaume I, Spain

*Universidad Politecnica de Valencia, Spain

†RWTH, Germany

 

SESSION: W2B

Language Model Adaptation

Chair: Hermann Ney, RWTH, Germany

 

W2B.1 Document Space Models Using Latent Semantic Analysis 1443

Yoshihiko Gotoh, Steve Renals

Univ. of Sheffield, UK

 

W2B.2 Adaptive Topic-Dependent Language Modelling Using Word-Based Varigrams 1447

Sven C. Martin, Jörg Liermann, Hermann Ney

RWTH Aachen, Germany

 

W2B.3 A Latent Semantic Analysis Framework for Large-Span Language Modeling 1451

Jerome R. Bellegarda

Apple Computer Inc, USA

 

 

W2B.4 A Maximum Likelihood Model for Topic Classification of Broadcast News 1455

Richard Schwartz, *Toru Imai, Francis Kubala, Long Nguyen, John Makhoul

BBN Systems and Technologies, USA

*NHK, Japan

 

W2B.5 Language Modelling for Task-Oriented Domains 1459

Cosmin Popovici, *Paolo Baggia

ICI-Instituto de Cercetari in Informatica, Romania

*Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy

 

W2B.6 Chinese Language Model Adaptation Based on Document Classification and Multiple Domain-Specific Language Models 1463

Sung-Chien Lin, Chi-Lung Tsai, *Lee-Feng Chien, *Ker-Jiann Chen, *Lin-Shan Lee

National Taiwan University, ROChina

*Academia Sinica, ROChina

 

SESSION: W2C

Prosody and Speech Recognition/Understanding

Chair: Jan van Santen, Bell Labs-Lucent Technologies, USA

 

W2C.1 Estimating Prosodic Weights in a Syntactic-Rhythmical Prediction System 1467

Philippe Langlais

CERI, France

 

W2C.2 Syntactic Information Contained in Prosodic Features of Japanese Utterances 1471

Kazuhiko Ozeki, Kazuyuki Kousaka, Yujie Zhang

The Univ. of Electro-Communications, Japan

 

W2C.3 Hierarchical Duration Modelling for Speech Recognition Using the ANGIE Framework 1475

Grace Chung, Stephanie Seneff

MIT Laboratory for Computer Science, USA

 

W2C.4 On the Use of Prosody in a Speech-to-Speech Translator 1479

Volker Strom, Anja Elsner, Wolfgang Hess, *Walter Kasper, †Alexandra Klein, *Hans Ulrich Krieger, ‡Jörg Spilker, ‡Hans Weber, ‡Günther Gorz

Univ. of Bonn, Germany

*German Research Center for AI, Germany

†Univ. of Wien, Austria

‡Univ. of Erlangen-Nurnberg, Germany

 

W2C.5 Automatic Recognition of Sentence Type from Prosody in Dutch 1483

Vincent J. van Heuven, *Judith Haan, Jos J.A. Pacilly

Leiden University, The Netherlands

*Nijmegen University, The Netherlands

 

W2C.6 Automatic Word Demarcation Based on Prosody 1487

Paul Munteanu, Bertrand Caillaud, Jean-Francois Serignat, Geneviève Caelen-Haumont

CLIPS/IMAG, France

 

 

SESSION: W2D

Wideband Speech Coding

Chair: Jean Pierre Martens, Univ. of Gent, Belgium

 

W2D.1 A 16-KBIT/S Wideband Speech Codec Scalable with G.729 1491

Akitoshi Kataoka, Sachiko Kurihara, Shigeaki Sasaki, Shinji Hayashi

NTT, Japan

 

W2D.2 Comparison of Auditory Masking Models for Speech Coding 1495

Michelle E. Lynch, Eliathamby Ambikairajah, *Andy Davis

RTC, Ireland

*BT Labs, UK

 

W2D.3 Wideband Speech Coding Based on the MBE Structure 1499

Anne Amodio, Gang Feng

Univ. Stendhal/INPG, France

 

W2D.4 Perceptual Filter Comparisons for Wideband and FM Bandwidth Audio Coders 1503

Marcos Perreau-Guimaraes, *Nicolas Moreau, Madeleine Bonnet

Univ. René Descartes - Paris 5, France

*ENST, France

 

W2D.5 Wideband Coding of Speech Using Neural Network Gain Adaptation 1507

Cheung-Fat Chan, Man-Tak Chu

City Univ. of Hong Kong, Hong Kong

 

W2D.6 Wideband-Speech APVQ Coding From 16 To 32 KBPS 1511

Josep M. Salavedra

Universitat Politecnica de Catalunya, Spain

 

SESSION: WMA

Speech Recognition in Adverse Environments, CSR and Error Analysis

Chair: Lori Lamel, LIMSI-CNRS, France

 

WMA.1 A Comparative Analysis of Blind Channel Equalization Methods for Telephone Speech Recognition 1515

Wei-Wen Hung, Hsiao-Chuan Wang

National Tsing Hua Univ., ROChina

 

WMA.2 HMM Retraining Based on State Duration Alignment for Noisy Speech Recognition 1519

Wei-Wen Hung, Hsiao-Chuan Wang

National Tsing Hua Univ., ROChina

 

WMA.3 Fast Parallel Model Combination Noise Adaptation Processing 1523

Yasuhiro Komori, Tetsuo Kosaka, Hiroki Yamamoto, Masayuki Yamada

Canon Inc., Japan

 

WMA.4 Speech Recognition Module for CSCW Using a Microphone Array 1527

Takashi Endo, Shigeki Nagaya, Masayuki Nakazawa, Kiyoshi Furukawa, Ryuuichi Oka

Real World Computing Partnership, Japan

 

 

WMA.5 Relative Mel-Frequency Cepstral Coefficients Compensation for Robust Telephone Speech Recognition 1531

*Jiqing Han, Munsung Han, Gyu-Bong Park, Jeongue Park, *Wen Gao

Systems Engineering Research Institute, ETRI, Korea

*Harbin Institute of Technology, ROChina

 

WMA.6 Robust Speech Detection Method for Speech Recognition System for Telecommunications Networks and Its Field Trial 1535

Seiichi Yamamoto, Masaki Naito, Shingo Kuroiwa

KDD R&D Labs, Japan

 

WMA.7 The Tuning of Speech Detection in the Context of a Global Evaluation of a Voice Response System 1539

Laurent Mauuary, Lamia Karray

France Telecom, France

 

WMA.8 New Methods in Continuous Mandarin Speech Recognition 1543

C. Julian Chen, Ramesh A. Gopinath, Michael D. Monkowski, Michael Picheny, Katherine Shen

IBM, USA

 

WMA.9 Automatic Transcription of General Audio Data: Effect of Environment Segmentation on Phonetic Recognition 1547

Michelle S. Spina, Victor Zue

MIT, USA

 

WMA.10 Automatic Recognition of Continuous Cantonese Speech with Very Large Vocabulary 1551

Alfred Ying Pang Ng, L.W. Chan, P.C. Ching

Chinese Univ. of Hong Kong, Hong Kong

 

WMA.11 Source Normalization Training for HMM Applied to Noisy Telephone Speech Recognition 1555

Yifan Gong

Texas Instruments, USA

 

WMA.12 The Development of a Speaker Independent Continuous Speech Recognizer for Portuguese 1559

Joao P. Neto, *Ciro A. Martins, *Luis B. Almeida

IST, Portugal

*INESC, Portugal

 

WMA.13 Blame Assignment for Errors Made by Large Vocabulary Speech Recognizers 1563

Lin Chase

Carnegie Mellon Univ., USA

 

WMA.14 Predicting Speech Recognition Performance 1567

Atsushi Nakamura

ATR ITL, Japan

 

WMA.15 A Voice Activity Detector for the ITU-T 8kbit/s Speech Coding Standard G.729 1571

Scott D. Watson, Barry M.G. Cheetham, *P.A. Barret, *W.T.K. Wong, *A.V. Lewis

The Univ. of Liverpool, UK

*BT Laboratories, UK

 

WMA.16 Vocabulary-Independent Recognition of American Spanish Phrases and Digit Strings 1575

Yeshwant K. Muthusamy, John J. Godfrey

Texas Instruments, USA

 

 

WMA.17 Recognition of Spoken and Spelled Proper Names 1579

Michael Meyer, Hermann Hild

Univ. Karlsruhe, Germany

 

WMA.18 HMM Compensation for Noisy Speech Recognition Based on Cepstral Parameter Generation 1583

Takao Kobayashi, Takashi Masuko, *Keiichi Tokuda

Tokyo Institute of Technology, Japan

*Nagoya Institute of Technology, Japan

 

WMA.19 On the Robustness of the Critical-Band Adaptive Filtering Method for Multi-Source Noisy Speech Recognition 1587

George Nokas, Evangelos Dermatas, George Kokkinakis

Univ. of Patras, Greece

 

WMA.20 A Space Transformation Approach for Robust Speech Recognition in Noisy Environments 1591

Cun-tai Guan, Shu-hung Leung, Wing-hong Lau

City Univ. of Hong Kong, Hong Kong

 

WMA.21 Robust Isolated Word Recognition Using the WSP-PMC Combination 1595

Tzur Vaich, Arnon Cohen

Ben Gurion Univ., Israel

 

SESSION: WMB

Multimodal Speech Processing, Emerging Techniques and Applications

Chair: Giorgio Micca, Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy

 

WMB.1 Fuzzy Logic for Rule-Based Formant Speech Synthesis 1599

Spyros Raptis, George Carayannis

ILSP, Greece

 

WMB.2 Integrating Acoustic and Labial Information for Speaker Identification and Verification 1603

Pierre Jourlin, *Juergen Luettin, *Dominique Genoud, *Hubert Wassner

LIA/IDIAP, France

*IDIAP, Switzerland

 

WMB.3 Subword Unit Representations for Spoken Document Retrieval 1607

Kenney Ng, Victor Zue

MIT, USA

 

WMB.4 Non-Linear Representations, Sensor Reliability Estimation and Context-Dependent Fusion in the Audiovisual Recognition of Speech in Noise 1611

Pascal Teissier, Jean-Luc Schwartz, *Anne Guerin-Dugue

ICP, INPG, France

*Laboratoire de Traitement d'Images et de Reconnaissance des Formes, France

 

WMB.5 Securized Flexible Vocabulary Voice Messaging System on Unix Workstation with ISDN Connection 1615

Philippe Renevey, Andrzej Drygajlo

Swiss Federal Institute of Technology Lausanne, Switzerland

 

WMB.6 Automatic Deriving of Multiple Variants of Phonetic Transcriptions from Acoustic Signals 1619

Houda Mokbel, Denis Jouvet

France Telecom, France

 

WMB.7 Improved Bimodal Speech Recognition Using Tied-Mixture HMMs and 5000 Word Audio-Visual Synchronous Database 1623

Satoshi Nakamura, Ron Nagai, Kiyohiro Shikano

Nara Institute of Science and Technology, Japan

 

WMB.8 On the Use of Phone Duration and Segmental Processing to Label Speech Signal 1627

Philippe Depambour, Regine André-Obrecht, *Bernard Delyon

IRIT - Equipe IHMPT, France

*IRISA, France

 

WMB.9 Automatic Detection of Disturbing Robot Voice- and Ping Pong-Effects in GSM Transmitted Speech 1631

Martin Paping, Thomas Fahnle

Ascom Systec AG, Switzerland

 

WMB.10 Speech Synthesis Using Phase Vocoder Techniques 1635

Joseph Di Martino

Univ. Henri Poincaré Nancy I, France

 

WMB.11 Integration of Eye Fixation Information with Speech Recognition Systems 1639

Ramesh R. Sarukkai, *Craig Hunter

Univ. of Rochester, USA

*Univ. of Rochester, USA

 

WMB.12 Generation of Broadband Speech from Narrowband Speech Using Piecewise Linear Mapping 1643

Yoshihisa Nakatoh, M. Tsushima, T. Norimatsu

Matsushita Electric Industrial Co, Ltd, Japan

 

WMB.13 An Assessment of the Benefits Active Noise Reduction Systems Provide to Speech Intelligibility in Aircraft Noise Environments 1647

Ian E.C. Rogers

Defence Evaluation and Research Agency, UK

 

WMB.14 OLGA - A Dialogue System with an Animated Talking Agent 1651

Jonas Beskow, Kjell Elenius, *Scott McGlashan

KTH, Sweden

*Swedish Institute for Computer Science, Sweden

 

WMB.15 Towards Usable Multimodal Command Languages: Definition and Ergonomic Assessment of Constraints on Users' Spontaneous Speech and Gestures 1655

Sandrine Robbe, Noelle Carbonell, *Claude Valot

CRIN, France

*IMASSA-CERMA, France

 

WMB.16 Exploiting Repair Context in Interactive Error Recovery 1659

Bernhard Suhm, Alex Waibel

Carnegie Mellon Univ., USA

 

WMB.17 An Hybrid Image Processing Approach to LipTracking Independent of Head Orientation 1663

Lionel Reveret, *Frederique Garcia, †Christian Benoit, *Eric Vatikiotis-Bateson

INPG/ENSERG, France

*ATR, Japan

†INPG, France

 

 

WMB.18 Automatic Modeling of Coarticulation in Text-to-Visual Speech Synthesis 1667

Bertrand Le Goff

Univ. of Stendhal, France

 

WMB.19 A Multimedia Platform for Audio-Visual Speech Processing 1671

Ali Adjoudani, Thierry Guiard-Marigny, Bertrand Le Goff, Lionel Reveret, Christian Benoit

Univ. of Stendhal, France

 

WMB.20 An Intelligent System for Information Retrieval Over the Internet Through Spoken Dialogue 1675

Hiroya Fujisaki, *Hiroyuki Kameda, Sumio Ohno, Takuya Ito, Ken Tajima, Kenji Abe

Science Univ. of Tokyo, Japan

*Tokyo Engineering Univ., Japan

 

 

WMB.21 Data Hiding in Speech Using Phase Coding 1679

Yasemin Yardimci, *Enis A Cetin, *Rashid Ansari

Bilkent University, Turkey

*Univ. of Illinois, USA

 

WMB.22 CAVE: An On-Line Procedure for Creating and Running Auditory-Visual Speech Perception Experiments-Hardware, Software, and Advantages 1683

Denis Burnham, John Fowler, Michelle Nicol

Univ. of NSW, Australia

 
VOLUME 4
 

SESSION: WMC

Databases, Tools and Evaluations

Chair: Khalid Choukri, ELRA

 

WMC.1 The Bavarian Archive for Speech Signals: Resources for the Speech Community 1687

Florian Schiel, Christoph Draxler, Hans G. Tillmann

Univ. of Muenchen, Germany

 

WMC.2 WWWTranscribe - A Modular Transcription System Based on the World Wide Web 1691

Christoph Draxler

Univ. of Munich, Germany

 

WMC.3 Design, Recording and Verification of a Danish Emotional Speech Database 1695

Inger S. Engberg, Anya Varnich Hansen, Ove Andersen, Paul Dalsgaard

Aalborg Univ., Denmark

 

WMC.4 Issues in Database Creation: Recording New Populations, Faster and Better Labeling 1699

Maxine Eskenazi, C. Hogan, J. Allen, R. Frederking

Carnegie Mellon Univ., USA

 

WMC.5 Design and Analysis of a German Telephone Speech Database for Phoneme Based Training 1703

Stefan Feldes, Bernhard Kaspar, *Denis Jouvet

Deutsche Telekom Berkom, Germany

*France Telecom CNET, France

 

WMC.6 The Design of a Large Vocabulary Speech Corpus for the Portuguese 1707

Joao P. Neto, Ciro A. Martins, Hugo Meinedo, Luis B. Almeida

INESC, Portugal

 

WMC.7 Continued Investigations of Laryngectomee Speech in Noise - Measurements and Intelligibility Tests 1711

Lennart Nord, Britta Hammarberg, Elisabet Lunström

KTH, Sweden

 

WMC.8 An Appreciation Study of an ASR Inquiry System 1715

L.J.M. Rothkrantz, W.A.Th. Manintveld, M.M.M. Rats, R.J. van Vark, J.P.M. de Vreught, H. Koppelaar

Delft Univ. of Technology, The Netherlands

 

WMC.9 Object-Oriented Modeling of Articulatory Data for Speech Research Information Systems 1719

Kamel Bensaber, Paul Munteanu, Jean-Francois Serignat, Pascal Perrier

Institut de la Communication Parlée (ICP), France

 

WMC.10 A Korean Speech Corpus for Train Ticket Reservation Aid System Based on Speech Recognition 1723

Kim Woosung, Koo Myoung-Wan

Korea Telecom, Korea

 

WMC.11 Recall Memory for Earcons 1727

Dawn Dutton, *Candace Kamm, Susan Boyce

AT&T, USA

*AT&T Labs-Research, USA

 

 

WMC.12 Semi-Automatic Phonetic Labelling of Large Corpora 1731

Odile Mella, Dominique Fohr

CRIN-CNRS & INRIA Lorraine, France

 

WMC.13 Corpora-Speech Database for Polish Diphones 1735

Stefan Grocholewski

Poznan Univ. of Technology, Poland

 

WMC.14 Multilingual Speech Interfaces (MSI) and Dialogue Design Environments for Computer Telephony Services 1739

Christel Muller, Thomas Ziem

Deutsche Telekom Berkom GmbH, Germany

 

WMC.15 Getting Started with SUSAS: A Speech Under Simulated and Actual Stress Database 1743

John H.L Hansen, Sahar E. Bou-Ghazale

Duke Univ., USA

 

WMC.16 A Markup Language for Text-To-Speech Synthesis 1747

Richard Sproat, *Paul Taylor, Michael Tanenblatt, *Amy Isard

Bell Labs, USA

*Univ. of Edinburgh, UK

 

WMC.17 Several Measures for Selecting Suitable Speech Corpora 1751

Shuichi Itahashi, Naoko Ueda, Mikio Yamamoto

Univ. of Tsukuba, Japan

 

WMC.18 Greek Speech Database for Creation of Voice Driven Teleservices 1755

Irene Chatzi, *Nikos Fakotakis, *George Kokkinakis

KNOWLEDGE SA, Greece

*Univ. of Patras, Greece

 

SESSION: WMD

Applications of Speech Technology

Chair: Richard Winski, Vocalis, UK

 

WMD.1 Analysis of Infant Cries for the Early Detection of Hearing Impairment 1759

Sebastian Möller, *Rainer Schönweiler

Ruhr-Universität Bochum, Germany

*Hannover Med. School, Germany

 

WMD.2 Optical Logo-Therapy (OLT): A Computer-Based Real Time Visual Feedback Application for Speech Training 1763

A. Hatzis, Phil Green, S.J. Howard

Univ. of Sheffield, UK

 

WMD.3 Intelligent Retrieval of Very Large Chinese Dictionaries with Speech Queries 1767

Sung-Chien Lin, *Lee-Feng Chien, *Ming-Chiuan Chen, *Lin-Shan Lee, *Ker-Jiann Chen

National Taiwan University, ROChina

*Academia Sinica, ROChina

 

 

WMD.4 Preliminary Results of a Multilingual Interactive Voice Activated Telephone Service for People-On-The-Move 1771

Fulvio Leonardi, Giorgio Micca, Sheyla Militello, Mario Nigra

Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy

 

WMD.5 Assessment of an Operational Dialogue System Used by a Blind Telephone Switchboard Operator 1775

Jean-Christophe Dubois, *Yolande Anglade, Dominique Fohr

CRIN-CNRS, France

*IRISA-LLI, France

 

WMD.6 STACC: An Automatic Service for Information Access Using Continuous Speech Recognition Through Telephone Line 1779

Antonio J. Rubio, Pedro Garcia, Angel de la Torre, Jose C. Segura, Jesus Diaz-Verdejo, Maria C. Benitez, Victoria Sanchez, Antonio M. Peinado, Juan M. Lopez-Soler, Jose L. Perez-Cordoba

Universidad de Granada, Spain

 

WMD.7 A Voiced Activated Dialog System for Fast-Food Restaurant Applications 1783

Ramon Lopez-Cozar, Pedro Garcia, J. Diaz, Antonio J. Rubio

Universidad de Granada, Spain

 

WMD.8 Multi-Microphone Sub-band Adaptive Signal Processing for Improvement of Hearing Aid Performance 1787

Paul W. Shields, Douglas R. Campbell

Univ. of Paisley, Scotland

 

WMD.9 Tactile Transmission of Intonation and Stress 1791

Hans Georg Piroth, Thomas Arnhold

Univ. of Munich, Germany

 

WMD.10 Hearing Impairment Simulation: An Interactive Multimedia Programme on the Internet for Students of Speech Therapy 1795

Kerttu Huttunen, Pentti Korkko, Martti Sorri

Univ. of Oulu, Finland

 

WMD.11 Analysis of Dysarthric Speech by Means of Formant-to-Area Mapping 1799

Sorin Ciocea, Jean Schoentgen, *Lisa Crevier-Buchman

Univ. Libre de Bruxelles, Belgium

*Laennec Hospital, France

 

WMD.12 An Intelligent Telephone Answering System Using Speech Recognition 1803

Boris M. Lobanov, Simon V. Brickle, Andrey V. Kubashin, Tatiana V. Levkovskaja

Academy of Sciences of Belarus, ROBelarus

 

WMD.13 Speedata: A Prototype for Multilingual Spoken Data-Entry 1807

Ulla Ackermann, *Bianca Angelini, *Fabio Brugnara, *Marcello Federico, *Diego Giuliani, *Roberto Gretter, Heinrich Niemann

FORWISS, Germany

*Istituto per la Ricerca Scientifica e Tecnologica (IRST), Italy

 

WMD.14 Applications for the Hearing-Impaired: Evaluation of Finnish Phoneme Recognition Methods 1811

Matti Karjalainen, *Peter Boda, Panu Somervuo, Toomas Altosaar

Helsinki Univ. of Technology, Finland

*Nokia Research Centre, Finland

 

 

WMD.15 Applications for the Hearing-Impaired: Comprehension of Finnish Text with Phoneme Errors 1815

Nina Alarotu, Mietta Lennes, *Toomas Altosaar, †Anja Malm, *Matti Karjalainen

Univ. of Helsinki, Finland

*Helsinki Univ. of Technology, Finland

†Finnish Association of the Deaf, Finland

 

WMD.16 ACCeSS - Automated Call Center Through Speech Understanding System 1819

Ute Ehrlich, *Gerhard Hanrieder, †Ludwig Hitzenberger, Paul Heisterkamp, Klaus Mecklenburg, Peter Regel-Brietzmann

Daimler Benz AG, Germany

*Daimler Benz Aerospace, Germany

†Univ. of Regensburg, Germany

 

WMD.17 Integrating a Radio Model with a Spoken Language Interface for Military Simulations 1823

E. Richard Anthony, Charles Bowen, Margot T. Peet, Susan Tammaro

The MITRE Corporation, USA

 

WMD.18 On Field Experiments of Continuous Digit Recognition Over the Telephone Network 1827

Daniele Falavigna, Roberto Gretter

Istituto per la Ricerca Scientifica e Tecnologica (IRST), Italy

 

WMD.19 An HMM-Based Phoneme Recognizer Applied to Assessment of Dysarthric Speech 1831

Xavier Menendez-Pidal, *James B. Polikoff, *H. Timothy Bunnell

SONY Electronics Inc., USA

*Applied Science & Engineering Laboratories, USA

 

WMD.20 Multiapplication Platform Based on Technology for Mobile Telephone Network Services 1835

Celinda de la Torre, Gonzalo Alonso

Telefonica I+D, Spain

 

WMD.21 Field Test of a Calling Card Service Based on Speaker Verification and Automatic Speech Recognition 1839

Els den Os, *Lou Boves, †David James, ‡Richard Winski, ¤Kurt Fridh

KPN Research, The Netherlands

*KUN, The Netherlands

†Ubilab, Switzerland

‡Vocalis, England

¤Telia, Sweden

 

WMD.22 Speech: A Privileged Modality 1843

Luc E. Julia, Adam J Cheyer

SRI International, USA

 

SESSION: Th1A

Speaker Adaptation I

Chair: Harald Hoege, Siemens AG, Germany

 

Th1A.1 Combined On-line Model Adaptation and Bayesian Predictive Classification for Robust Speech Recognition 1847

Qiang Huo, *Chin-Hui Lee

ATR Interpreting Telecommunications Res. Labs., Japan

*Multimedia Communications Research Lab., USA

 

 

Th1A.2 Speaker Adaptive Training Applied to Continuous Mixture Density Modeling 1851

Xavier Aubert, Eric Thelen

Philips GmbH, Germany

 

Th1A.3 Speaker Normalization Training for Mixture Stochastic Trajectory Model 1855

Irina Illina, *Yifan Gong

CRIN-CNRS, France

*Texas Instruments, USA

 

Th1A.4 On-line Adaptation of Hidden Markov Models Using Incremental Estimation Algorithms 1859

Vassilios Digalakis

Technical Univ. of Crete, Greece

 

Th1A.5 Modeling Dependency in Adaptation of Acoustic Models Using Multiscale Tree Processes 1863

Ashvin Kannan, Mari Ostendorf

Boston Univ., USA

 

Th1A.6 Acoustic Clustering and Adaptation for Robust Speech Recognition 1867

Larry Heck, Ananth Sankar

SRI International, USA

 

SESSION: Th1B

Dialogue Systems: Design and Applications

Chair: Lou Boves, Univ. of Nijmegen, The Netherlands

 

Th1B.1 Learning The Structure of Mixed Initiative Dialogues Using A Corpus of Annotated Conversations 1871

Giovanni Flammia, Victor Zue

MIT, USA

 

Th1B.2 AMICA: The AT&T Mixed Initiative Conversational Architecture 1875

Roberto Pieraccini, Esther Levin, Wieland Eckert

AT&T, USA

 

Th1B.3 Generating Semantically Consistent Inputs to a Dialog Manager 1879

Alicia Abella, Allen L. Gorin

AT&T Research Labs., USA

 

Th1B.4 A Stochastic Model of Computer-Human Interaction for Learning Dialogue Strategies 1883

Esther Levin, Roberto Pieraccini

AT&T Labs-Research, USA

 

Th1B.5 Semantic Processing of Out-of-Vocabulary Words in a Spoken Dialogue System 1887

Manuela Boros, *Maria Aretoulaki, *Florian Gallwitz, *Elmar Noeth, *Heinrich Niemann

FORWISS, Germany

*Univ. of Erlangen-Nürnberg, Germany

 

Th1B.6 Clarification Dialogues in VERBMOBIL 1891

Elisabeth Maier

DFKI GmbH, Germany

 

 

SESSION: Th1C

Assessment Methods

Chair: John Makhoul, BBN Systems and Techs, USA

 

Th1C.1 The DET Curve in Assessment of Detection Task Performance 1895

Alvin Martin, George Doddington, Terri Kamm, Mark Ordowski, Mark Przybocki

National Institute of Standards and Technology, SRI, Dept. of Defense, USA

 

Th1C.2 Speech Quality Evaluation of Hands-Free Terminals 1899

Harald Klaus, Ekkehard Diedrich, Astrid Dehnel, Jens Berger

Deutsche Telekom Berkom GmbH, Germany

 

Th1C.3 Use of Broadcast News Materials for Speech Recognition Benchmark Tests 1903

David S. Pallett, Jonathan G. Fiscus, William Fisher, John S. Garofolo,

National Institute of Standards and Technology (NIST), USA

 

Th1C.4 Spoken Dialogue Evaluation: A First Framework for Reporting Results 1907

Norman Fraser

Univ. of Surrey, UK

 

Th1C.5 Generality and Transferability. Two Issues in Putting a Dialogue Evaluation Tool into Practical Use 1911

Niels Ole Bernsen, Hans Dybkjær, Laila Dybkjær, Vytautas Zinkevicius

Odense Univ., Denmark

Prolog Development Center A/S, Denmark

 

Th1C.6 Within-Speaker Variability of the Word Error Rate for a Continuous Speech Recognition System 1915

David A. van Leeuwen, Herman J. M. Steeneken

TNO, The Netherlands

 

SESSION: Th1D (SPECIAL SESSION)

Education for Language and Speech Communication I

Chair: Gerrit Bloothooft, Utrecht Univ., The Netherlands

 

Th1D.1 Opportunities for Computer-Aided Instruction in Phonetics and Speech Communication Provided by the Internet 1919

Mark Huckvale, *C. Benoit, †C. Bowerman, ‡Anders Eriksson, ¤M. Rosner, **M. Tatham, ††Briony Williams

University College London, UK

*ICP, France

†Univ. of Sunderland, UK

‡Univ. of Umeå, Sweden

¤Univ. of Malta, Malta

**Univ. of Essex, UK

††Univ. of Edinburgh, UK

 

Th1D.2 The Landscape of Future Education in Speech Communication Sciences 1923

Gerrit Bloothooft

Utrecht Univ., The Netherlands

 

Th1D.3 An Integrated System for Teaching Spoken Dialogue Systems Technology 1927

Kare Sjolander, Joakim Gustafson

KTH, Sweden

 

 

Th1D.4 Communication Science within Education for Logopedics/Speech and Language Therapy in Europe: The State of the Art 1931

Janet Beck, *Bernard Camilleri, †Hilde Chantrain, ‡Anu Klippi, ¤Marianne Leterme, **Matti Lehtihalmes, ††Peter Schneider, ‡‡Wilhelm Vieregge, ¤¤Eva Wigforss

Queen Margaret College, UK

*Hogeschool Antwerpen,

†Univ. of Malta, Malta

‡Univ. of Helsinki, Finland

¤CPLOL,

**Univ. of Oulu, Finland

††RWTH Aachen,

‡‡Univ. of Nijmegen, The Netherlands

¤¤Lund University, Sweden

 

Th1D.5 Education in Spoken Language Engineering in Europe 1935

Phil Green, *Carlos Espain

Univ. of Sheffield, UK

*Univ. of Porto, Portugal

 

Th1D.6 A Survey of Phonetics Education in Europe 1939

Valerie Hazan, *Wim van Dommelen

Univ. College London, UK

*Norwegian Univ. of Science and Technology, Norway

 

SESSION: Th2A

Hybrid Systems for ASR

Chair: Shigeki Sagayama, NTT Human Interface Labs, Japan

 

Th2A.1 Matching Training and Testing Criteria in Hybrid Speech Recognition Systems 1943

Xin Tu, Yonghong Yan, Ron Cole

Oregon Graduate Institute of Science and Technology, USA

 

Th2A.2 Context Independent and Context Dependent Hybrid HMM/ANN Systems for Vocabulary Independent Tasks 1947

Stephane Dupont, Christophe Ris, Olivier Deroo, Vincent Fontaine, Jean-Marc Boite, L. Zanoni

FPMs-TCTS, Belgium

 

Th2A.3 Estimation of Global Posteriors and Forward-Backward Training of Hybrid HMM/ANN Systems 1951

J. Hennebert, *Christophe Ris, †Hervé Bourlard, ‡Steve Renals, Nelson Morgan

ICSI, USA

*TCTS, Belgium

†IDIAP, Switzerland

‡Univ. of Sheffield, UK

 

Th2A.4 Confidence Measures for Hybrid HMM/ANN Speech Recognition 1955

Gethin Williams, Steve Renals

Univ. of Sheffield, UK

 

Th2A.5 Ensemble Methods for Connectionist Acoustic Modelling 1959

Gary D. Cook, Steve R. Waterhouse, Tony A. Robinson

Cambridge Univ., UK

 

 

Th2A.6 Improving Performance on Switchboard by Combining Hybrid HME/HMM and Mixture of Gaussians Acoustic Models 1963

Jurgen Fritsch, *Michael Finke

Univ. of Karlsruhe, Germany

*Carnegie Mellon Univ., USA

 

SESSION: Th2B

Topic and Dialogue Dependent Language Modelling

Chair: Frederick Jelinek, Johns Hopkins Univ. Baltimore, MD, USA

 

Th2B.1 Experiments in Adaptation of Language Models for Commercial Applications 1967

Petra Witschel, Harald Höge

Siemens AG, Germany

 

Th2B.2 Language Model Adaptation Using Dynamic Marginals 1971

Reinhard Kneser, Jochen Peters, Dietrich Klakow

Philips GmbH, Germany

 

Th2B.3 Transforming Out-of-Domain Estimates to Improve In-Domain Language Models 1975

Rukmini Iyer, Mari Ostendorf

Boston Univ., USA

 

Th2B.4 MDI Adaptation of Language Models Across Corpora 1979

P. Srinivasa Rao, Satya Dharanipragada, Salim Roukos

IBM, USA

 

Th2B.5 A Class Based Approach to Domain Adaptation and Constraint Integration for Empirical M-Gram Models 1983

Klaus Ries

Carnegie Mellon Univ., USA

 

Th2B.6 Using Story Topics for Language Model Adaptation 1987

Kristie Seymore, Ronald Rosenfeld

Carnegie Mellon Univ., USA

 

SESSION: Th2C

Lipreading

Chair: Christian Benoit, ICP, Univ. Stendhal, France

 

Th2C.1 Towards Speaker Independent Continuous Speechreading 1991

Juergen Luettin

IDIAP, Switzerland

 

Th2C.2 Driving Synthetic Mouth Gestures: Phonetic Recognition for FaceMe! 1995

William Goldenthal, Keith Waters, Jean-Manuel Van Thong, Oren Glickman

Digital Equipment Corp., USA

 

Th2C.3 Continuous Visual Speech Recognition Using Geometric Lip-Shape Models and Neural Networks 1999

Alexandrina Rogozan, Paul Deleglise

Laboratoire d'Informatique de l'Univ. du Maine (LIUM), France

 

 

Th2C.4 The Teleface Project Multi-Modal Speech-Communication for the Hearing Impaired 2003

Jonas Beskow, Martin Dahlquist, Björn Granström, Magnus Lundeberg, Karl-Erik Spens, Tobias Ohman

KTH, Sweden

 

Th2C.5 Real-Time Lip-Tracking for Lipreading 2007

Rainer Stiefelhagen, Uwe Meier, *Jie Yang

Univ. Karlsruhe, Germany

*Carnegie Mellon Univ., USA

 

Th2C.6 From Raw Image of the Lips to Articulatory Parameters: A Viseme-Based Prediction 2011

Lionel Reveret

Univ. of Stendhal, France

 

SESSION: Th2D

Articulatory Modelling

Chair: Louis Pols, Univ. of Amsterdam, The Netherlands

 

Th2D.1 Adaptation of Maeda's Model for Acoustic to Articulatory Inversion 2015

Bruno Mathieu, Yves Laprie

CRIN-CNRS & INRIA, France

 

Th2D.2 Why Should Speech Control Studies Based on Kinematics be Considered with Caution? Insights from a 2D Biomechanical Model of the Tongue 2019

Yohan Payan, Pascal Perrier

Institut de la Communication Parlée, France

 

Th2D.3 An Integrated Model of the Biomechanics and Neural Control of the Tongue, Jaw, Hyoid and Larynx System 2023

Vittorio Sanguineti, *Rafael Laboissiere, †David J. Ostry

Universita' di Genova, Italy

*Institut de la Communication Parlée, France

†McGill Univ., Canada

 

Th2D.4 Using MRI to Image the Moving Vocal Tract During Speech 2027

M. Mohammad, *E. Moore, J.N. Carter, C.H. Shadle, S.J Gunn

Univ. of Southampton, UK

*Southampton General Hospital, UK

 

Th2D.5 Unified Physiological Model of Audible-Visible Speech Production 2031

Eric Vatikiotis-Bateson, Hani Yehia

ATR, Japan

 

Th2D.6 Motor Control Information Recovering from the Dynamics with the EP Hypothesis 2035

Helene Loevenbruck, Pascal Perrier

ICP, INPG, France

 

SESSION: ThMA

Front-Ends and Adaptation to Acoustics, Speaker Adaptation

Chair: Hervé Bourlard, IDIAP, Switzerland

 

ThMA.1 Speaker Adaptation for Context-Dependent HMM Using Spatial Relation of Both Phoneme Context Hierarchy and Speakers 2039

Yasuhiro Komori, Tetsuo Kosaka, Masayuki Yamada, Hiroki Yamamoto

Canon Inc., Japan

 

ThMA.2 Fast Algorithm for Speech Recognition Using Speaker Cluster HMM 2043

Masayuki Yamada, Yasuhiro Komori, Tetsuo Kosaka, Hiroki Yamamoto

Canon Inc., Japan

 

ThMA.3 A Comparison of Novel Techniques for Instantaneous Speaker Adaptation 2047

Timothy J. Hazen, James R. Glass

MIT, USA

 

ThMA.4 Fast Adaptation of Acoustic Models to Environmental Noise Using Jacobian Adaptation Algorithm 2051

Yoshikazu Yamaguchi, Satoshi Takahashi, Shigeki Sagayama

NTT, Japan

 

ThMA.5 Unsupervised HMM Adaptation Based on Speech-Silence Discrimination 2055

Ilija Zeljkovic, Shrikanth Narayanan, Alexandros Potamianos

AT&T Labs-Research, USA

 

ThMA.6 Correlation Based Predictive Adaptation of Hidden Markov Models 2059

Mohamed Afify, Yifan Gong, Jean-Paul Haton

CRIN-CNRS, France

 

ThMA.7 Adaptation of Hidden Markov Models Using Multiple Stochastic Transformations 2063

Vassilios Diakoloukas, Vassilios Digalakis

Technical Univ. of Crete, Greece

 

ThMA.8 Transformation Smoothing for Speaker and Environmental Adaptation 2067

M.J.F. Gales

Cambridge Univ., UK

 

ThMA.9 Nonlinear Discriminant Analysis for Improved Speech Recognition 2071

Vincent Fontaine, Christophe Ris, Jean-Marc Boite

FPMs-TCTS, Belgium

 

ThMA.10 On the Interplay Between Auditory-Based Features and Locally Recurrent Neural Networks for Robust Speech Recognition in Noise 2075

Jurgen Tchorz, *Klaus Kasper, *Herbert Reininger, Birger Kollmeier

Carl von Ossietzky-Universität, Germany

*Johann Wolfgang Goethe-Universität, Germany

 

ThMA.11 Speech Recognition Using On-Line Estimation of Speaking Rate 2079

Nelson Morgan, Eric Fosler, Nikki Mirghafori

International Computer Science Institute, USA

 

ThMA.12 Using Formant Frequencies in Speech Recognition 2083

John N. Holmes, *Wendy J. Holmes, *Philip N. Garner

Speech Technology Consultant, UK

*SRU/DRA, UK

 

ThMA.13 Speaker Normalization and Speaker Adaptation -- A Combination for Conversational Speech Recognition 2087

Puming Zhan, *Martin Westphal, *Michael Finke, Alex Waibel

Carnegie Mellon Univ., USA

*Univ. of Karlsruhe, Germany

 

ThMA.14 Speaker Adaptation Based on Pre-Clustering Training Speakers 2091

Gao Yuqing, Padmanabhan Mukund, Michael Picheny

IBM T.J, USA

 

ThMA.15 A Fast Method of Speaker Normalisation Using Formant Estimation 2095

Mike Lincoln, Stephen Cox, *Simon Ringland

Univ. of East Anglia, UK

*British Telecom, UK

 

ThMA.16 Acoustic Front-End Optimization for Large Vocabulary Speech Recognition 2099

Lutz Welling, N. Haberland, Hermann Ney

RWTH Aachen-Univ. of Technology, Germany

 

ThMA.17 Improving Autoregressive Hidden Markov Model Recognition Accuracy Using a Non-Linear Frequency Scale with Application to Speech Enhancement 2103

B.T. Logan, A.J. Robinson

Cambridge Univ., UK

 

ThMA.18 Designing a Reduced Feature-Vector Set for Speech Recognition by Using KL/GPD Competitive Training 2107

Tsuneo Nitta, Akinori Kawamura

Toshiba Multimedia Eng. Lab., Japan

 

ThMA.19 Speaker Adaptation by Correlation (ABC) 2111

Shaobing Chen Scott, Peter DeSouza

IBM, USA

 

SESSION: ThMB

Speech Perception

Chair: Paul Taylor, Univ. of Edinburgh, UK

 

ThMB.1 Preliminary Experiments on the Perception of Double Semivowels 2115

William A. Ainsworth, Georg F. Meyer

Keele Univ., UK

 

ThMB.2 Does Syllable Frequency Affect Production Time in a Delayed Naming Task? 2119

Niels Olaf Schiller

Max Planck Institute, The Netherlands

 

ThMB.3 Human and Machine Identification of Consonantal Place of Articulation from Vocalic Transition Segments 2123

Andrew C. Morris, *Gerrit Bloothooft, †William J. Barry, †Bistra Andreeva, †Jacques Koreman

Univ. of Sheffield, UK

*Utrecht Institute of Linguistics, The Netherlands

†Univ. of Saarbrucken, Germany

 

ThMB.4 Modelling the Recognition of Spectrally Reduced Speech 2127

Jon Barker, Martin Cooke

Univ. of Sheffield, UK

 

ThMB.5 Prosodic Structure and Phonetic Processing: A Cross-Linguistic Study 2131

*Christophe Pallier, Anne Cutler, *Nuria Sebastian-Galles

Max Planck Institute for Psycholinguistics, The Netherlands

*Univ. of Barcelona, Spain

 

 

ThMB.6 The Correlation Between Consonant Identification and the Amount of Acoustic Consonant Reduction 2135

R. J. J. H. van Son, Louis C.W. Pols

Univ. of Amsterdam, The Netherlands

 

ThMB.7 Relevant Spectral Information for the Identification of Vowel Features from Bursts 2139

Anne Bonneau

CRIN-CNRS, France

 

ThMB.8 Perceptual Study of Intersyllabic Formant Transitions in Synthesized V1-V2 in Standard Chinese 2143

Aijun Li

Chinese Academy of Social Sciences, China

 

ThMB.9 Role of Perception of Rhythmically Organized Speech in Consolidation Process of Long-Term Memory Traces (LTM-Traces) and in Speech Production Controlling 2147

Oleg P. Skljarov

Research Institute of Otolaryngology and Speech Pathology, Russia

 

ThMB.10 Sequential Probabilities as a Cue for Segmentation 2151

Arie H. van der Lugt

Max Planck Institute for Psycholinguistics, The Netherlands

 

ThMB.11 Perception and Acoustics of Emotions in Singing 2155

Susan Jansens, Gerrit Bloothooft, Guus de Krom

Utrecht Univ., The Netherlands

 

ThMB.12 Phonemes and Syllables in Speech Perception: Size of Attentional Focus in French 2159

Christophe Pallier

Max Planck Institute for Psycholinguistics, The Netherlands

 

ThMB.13 Quality of a Vowel with Formant Undershoot: a Preliminary Perceptual Study 2163

Shinichi Tokuma

Sophia University, Japan

 

ThMB.14 Segmental and Suprasegmental Contributions to Spoken-Word Recognition in Dutch 2167

Mariette Koster, Anne Cutler

Max Planck Institute for Psycholinguistics, The Netherlands

 

ThMB.15 Perception of Vowel Duration and Spectral Characteristics in Swedish 2171

Dawn M. Behne, *Peter E. Czigler, *Kirk P.H. Sullivan

Norwegian Univ. of Science and Technology, Norway

*Umeå Univ., Sweden

 

ThMB.16 Relative Contributions of Noise Burst and Vocalic Transitions to the Perceptual Identification of Stop Consonants 2175

Adrien Neagu, Gerard Bailly

Institut de la Communication Parlée, France

 

ThMB.17 Effect of Speaker Familiarity and Background Noise on Acoustic Features Used in Speaker Identification 2179

Satoshi Kitagawa, Makoto Hashimoto, Norio Higuchi

ATR ITL, Japan

 

ThMB.18 Dynamic Versus Static Specification for the Perceptual Identity of a Coarticulated Vowel 2183

Michel Piterman

Univ. de Provence, France

 

ThMB.19 Asymmetries in Consonant Confusion 2187

Madelaine C. Plauche, *Cristina Delogu, John J. Ohala

Univ. of California at Berkeley, USA

*Fondazione Ugo Bordoni, Italy

 

ThMB.20 Rime and Syllabic Effects in Phonological Priming Between French Spoken Words 2191

Nicolas Dumay, *Monique Radeau

Univ. de Genève, Switzerland

*Univ. Libre de Bruxelles, Belgium

 

ThMB.21 Roles of Static and Dynamic Features of Formant Trajectories in the Perception of Talk Individuality 2195

Weizhong Zhu, Hideki Kasuya

Utsunomiya Univ., Japan

 

SESSION: ThMC

Dialogue Systems: Linguistic Structures, Modelling and Evaluation

Chair: Niels Ole Bernsen, Roskilde Univ., Denmark

 

ThMC.1 Database Management and Analysis for Spoken Dialog Systems: Methodology and Tools 2199

Chih-mei Lin, Shrikanth Narayanan, Russell Ritenour

AT&T Labs-Research, USA

 

ThMC.2 Evaluating Spoken Dialog Systems for Telecommunication Services 2203

Candace Kamm, Shrikanth Narayanan, Dawn Dutton, Russell Ritenour

AT&T Labs-Research, USA

 

ThMC.3 Robust Spoken Dialogue Management for Driver Information Systems 2207

Xavier Pouteau, Emiel Krahmer, Jan Landsbergen

IPO/TUE, The Netherlands

 

ThMC.4 Using Acoustic and Prosodic Cues to Correct Chinese Speech Repairs 2211

Yue-Shi Lee, Hsin-Hsi Chen

National Taiwan University, ROChina

 

ThMC.5 Integrating Domain Specific Focusing In Dialogue Models 2215

Arne Jonsson, Nils Dahlback

Linkoping Univ., Sweden

 

ThMC.6 Evaluating Competing Agent Strategies for a Voice Email Agent 2219

Marilyn Walker, Donald Hindle, Jeanne Fromer, Giuseppe Di Fabbrizio, Craig Mestel

AT&T Labs-Research, USA

 

ThMC.7 Discourse Marker Use in Task-Oriented Spoken Dialog 2223

Donna K. Byron, *Peter A. Heeman

Univ. of Rochester, USA

*France Telecom, France

 

ThMC.8 From Interface to Content: Translingual Access and Delivery of On-line Information 2227

Victor Zue, Stephanie Seneff, James R. Glass, Lee Hetherington, Edward Hurley, Helen Meng, Christine Pao, Joseph Polifroni, Rafael Schloming, Philipp Schmid

MIT, USA

 

ThMC.9 Learning Dialogue Structures From a Corpus 2231

Jan Alexandersson, Norbert Reithinger

DFKI GmbH, Germany

 

ThMC.10 Dialogue Act Classification Using Language Models 2235

Norbert Reithinger, Martin Klesen

DFKI GmbH, Germany

 

ThMC.11 User's Multiple Goals in Spoken Dialogue 2239

Didier Pernel

Thomson-CSF / LCR, France

 

ThMC.12 Chatting with Interactive Agent 2243

Noriko Suzuki, Seiji Inokuchi, K. Ishii, Michio Okada

ATR Media Integration & Communications Research Laboratories, Japan

 

ThMC.13 Generic Template for the Evaluation of Dialogue Management Systems 2247

Gavin E. Churcher, Eric S. Atwell, Clive Souter

Centre for Computer Analysis of Language and Speech, UK

 

ThMC.14 Analysis of Interactive Strategy to Recover from Misrecognition of Utterances Including Multiple Information Items 2251

Yasuhisa Niimi, Takuya Nishimoto, Yutaka Kobayashi

Kyoto Institute of Technology, Japan

 

ThMC.15 A Referential Approach to Reduce Perplexity in the Vocal Command System Comppa 2255

Francois-Arnould Mathieu, Bertrand Gaiffe, Jean-Marie Pierrel

CRIN-CNRS&INRIA-Lorraine, France

 

ThMC.16 Linguistic Processor for a Spoken Dialogue System Based on Island Parsing Techniques 2259

Aristomenis Thanopoulos, Nikos Fakotakis, George Kokkinakis

Univ. of Patras, Greece

 

ThMC.17 Modelling of Speech-Based User Interfaces 2263

Brian Mellor, *Chris Baber

Speech Research Unit, UK

*Univ. of Birmingham, UK

 

ThMC.18 Can You Predict Responses to Yes/No Questions? Yes, No, and Stuff 2267

Beth Ann Hockey, *Deborah Rossen-Knill, Beverly Spejewski, Matthew Stone, †Stephen Isard

Univ. of Pennsylvania, USA

*Philadelphia College of Textiles and Science, USA

†Univ. of Edinburgh, UK

 

ThMC.19 DIA-MOLE: An Unsupervised Learning Approach to Adaptive Dialogue Models for Spoken Dialogue Systems 2271

Jens-Uwe Moeller

Univ. of Hamburg, Germany

 

ThMC.20 How Do System Questions Influence Lexical Choices in User Answers? 2275

Joakim Gustafson, Anette Larsson, Rolf Carlson, *K. Hellman

KTH, Sweden

*Stockholm Univ., Sweden

VOLUME 5
 

 

SESSION: ThMD

Speaker Recognition and Language Identification

Chair: Douglas Reynolds, MIT, USA

 

ThMD.1 Gaussian Mixture Models with Common Principal Axes and Their Application in Text-Independent Speaker Identification 2279

Kuo-Hwei Yuo, Hsiao-Chuan Wang

National Tsing Hua Univ., ROChina

 

ThMD.2 Speaker Models Designed from Complete Data Sets: A New Approach to Text-Independent Speaker Verification 2283

Dominik R. Dersch, *Robin W. King

Univ. of Sydney, Australia

*Univ. of South Australia, Australia

 

ThMD.3 A Double Gaussian Mixture Modeling Approach To Speaker Recognition 2287

Vergin Rivarol, Douglas O'Shaughnessy

INRS Telecommunications, Canada

 

ThMD.4 An Acoustic Subword Unit Approach to Non-Linguistic Speech Feature Identification 2291

Mohamed Afify, Yifan Gong, Jean-Paul Haton

CRIN-CNRS, France

 

ThMD.5 N-Best GMM's for Speaker Identification 2295

Chakib Tadj, *Pierre Dumouchel, †Yu Fang

Ecole de Technologie Superieure, Canada

*Centre de Recherche Informatique, Canada

†Institut Universitaire de Technologie, Canada

 

ThMD.6 Model Dependent Spectral Representations for Speaker Recognition 2299

Guillaume Gravier, *Chafic Mokbel, Gerard Chollet

ENST/SIG, France

*CNET-DIH/RCP, France

 

ThMD.7 Equalizing Sub-Band Error Rates in Speaker Recognition 2303

Roland Auckenthaler, *John S. Mason

Technical Univ. Graz, Austria

*Univ. of Wales Swansea, UK

 

ThMD.8 Automatic Gender Identification Under Adverse Conditions 2307

Stefan Slomka, Sridha Sridharan

Queensland Univ. of Technology, Australia

 

ThMD.9 Acoustic Features and Perceptive Processes in the Identification of Familiar Voices 2311

Yizhar Lavner, Isak Gath, Judith Rosenhouse

Israel Institute of Technology, Israel

 

ThMD.10 On the Use of Acoustic Segmentation in Speaker Identification 2315

Leandro Rodriguez-Linares, Carmen Garcia-Mateo

Univ. of Vigo, Spain

 

ThMD.11 Speaker Recognition by Humans and Machines 2319

Herman J.M. Steeneken, David A. Van Leeuwen

TNO-HFRI, The Netherlands

 

 

ThMD.12 Foreign Speaker Accent Classification Using Phoneme-Dependent Accent Discrimination Models and Comparisons with Human Perception Benchmarks 2323

Karsten Kumpf, *Robin W. King

Univ. of Sydney, Australia

*Univ. of South Australia, Australia

 

ThMD.13 A Comparison of Human and Machine In Speaker Recognition 2327

Li Liu, Jialong He, Günther Palm

Univ. of Ulm, Germany

 

ThMD.14 Evaluation of Second Language Learners' Pronunciation Using Hidden Markov Models 2331

Simo M.A. Goddijn, *Guus de Krom

Forensic Science Laboratory, Rijswijk, The Netherlands

*Univ. of Utrecht, The Netherlands

 

ThMD.15 Delta Vector Taylor Series Environment Compensation for Speaker Recognition 2335

Brian Eberman, Pedro J. Moreno

Digital Equipment Corp., USA

 

ThMD.16 Wavelet-Like Regression Features in the Cepstral Domain for Speaker Recognition 2339

Jonathan Hume

Univ. of Wales Swansea, UK

 

ThMD.17 Minimum Classification Error Linear Regression (MCELR) for Speaker Adaptation Using HMM with Trend Functions 2343

Rathinavelu Chengalvarayan

Bell Labs-Lucent Technologies, USA

 

ThMD.18 A Continuous HMM Text Independent Speaker Recognition System Based on Vowel Spotting 2347

Nikos Fakotakis, *Anastasios Tsopanoglou, Kallirroi Georgila

Univ. of Patras, Greece

*KNOWLEDGE SA, Greece

 

ThMD.19 On the Independence of Digits in Connected Digit Strings 2351

Johan W. Koolwaaij, Lou Boves

Nijmegen University, The Netherlands

 

ThMD.20 A New Procedure for Classifying Speakers in Speaker Verification Systems 2355

Johan W. Koolwaaij, Lou Boves

Nijmegen University, The Netherlands

 

ThMD.21 Sound Channel Video Indexing 2359

Claude Montacie, Marie-Jose Caraty

Univ. Pierre et Marie Curie - CNRS, France

 

ThMD.22 CDHMM Speaker Recognition By Means of Frequency Filtering of Filter-Bank Energies 2363

Javier Hernando, Climent Nadeu

Universitat Politecnica de Catalunya, Spain

 

 

SESSION: Th3A

Style and Accent Recognition

Chair: Gerard Chollet, ENST/SIG, France

 

Th3A.1 Using Accent-Specific Pronunciation Modelling for Improved Large Vocabulary Continuous Speech Recognition 2367

J. J. Humphries, P. C. Woodland

Cambridge Univ., UK

 

Th3A.2 Automatic Speech Recognition for Children 2371

Alexandros Potamianos, Shrikanth Narayanan, Sungbok Lee

AT&T Labs-Research, USA

 

Th3A.3 Recognition of Non-Native Accents 2375

Carlos Teixeira, Isabel Trancoso, Antonio Serralheiro

INESC, Portugal

 

Th3A.4 Speaking Mode Dependent Pronunciation Modeling in Large Vocabulary Conversational Speech Recognition 2379

Michael Finke, Alex Waibel

Carnegie Mellon Univ., USA

 

Th3A.5 A Prosody-Only Decision-Tree Model for Disfluency Detection 2383

Elizabeth Shriberg, *Rebecca Bates, Andreas Stolcke

SRI International, USA

*Boston Univ., USA

 

Th3A.6 A Novel Training Approach for Improving Speech Recognition Under Adverse Stressful Conditions 2387

Sahar E. Bou-Ghazale, John H.L. Hansen

Duke Univ., USA

 

SESSION: Th3B

Phonetics

Chair: Joaquim Llisterri, Univ. of Barcelona, Spain

 

Th3B.1 From Phone Identification to Phone Clustering Using Mutual Information 2391

Peter O'Boyle, Ji Ming, Marie Owens, F. Jack Smith

Queen's Univ. of Belfast, N. Ireland

 

Th3B.2 Phonetic Code Emergence in a Society of Speech Robots: Explaining Vowel Systems and the MUAF Principle 2395

Ahmed-Reda Berrah, Rafael Laboissiere

Institut de la Communication Parlée, France

 

Th3B.3 Effects of Voicing on /t,d/ Tongue/Palate Contact in English and Norwegian 2399

Inger Moen, Hanne Gram Simonsen

Univ. of Oslo, Norway

 

Th3B.4 Fieldwork Techniques for Relating Formant Frequency, Amplitude and Bandwidth 2403

Peter Ladefoged, *Gunnar Fant

UCLA, USA

*KTH, Sweden

 

Th3B.5 Word Juncture Modelling Based on the TIMIT Database 2407

Xue Wang, Louis C.W. Pols

Univ. of Amsterdam, The Netherlands

 

 

Th3B.6 The Phonology and Phonetics of Second Language Intonation: The Case of "Japanese English" 2411

Motoko Ueyama

UCLA, USA

 

SESSION: Th3C (SPECIAL SESSION)

Towards Robust ASR for Car and Telephone Applications

Chair: Jean-Claude Junqua, Panasonic Technologies Inc., California, USA

 

Th3C.1 Methods for Microphone Equalization in Speech Recognition 2415

L. Fissore, Giorgio Micca, C. Vair

Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy

 

Th3C.2 Room Acoustics and Reverberation: Impact on Hands-Free Recognition 2419

Satoshi Nakamura, Kiyohiro Shikano

Nara Institute of Science and Technology, Japan

 

Th3C.3 Echo and Noise Reduction for Hands-Free Terminals -State of the Art- 2423

Gerard Faucon, Regine Le Bouquin-Jeannes

Univ. de Rennes I, France

 

Th3C.4 Robust Speech Recognition for Wireless Networks and Mobile Telephony 2427

Reinhold Haeb-Umbach

Philips GmbH, Germany

 

Th3C.5 Robust ASR for the Cellular Environment -

Jay Naik

Nynex, USA

(Not arrived in time to be included in the Proceedings)

 

Th3C.6 Speech Recognition in the Car From Phone Dialing to Car Navigation 2431

Dirk Van Compernolle

Lernout & Hauspie Speech Products NV, Belgium

 

SESSION: Th3D

Language Specific Systems

Chair: Christel Sorin, CNET, Lannion, France

 

Th3D.1 A Keyvowel Approach to the Synthesis of Regional Accents of English 2435

Briony Williams, Stephen Isard

Univ. of Edinburgh, UK

 

Th3D.2 Experimental Implementation of Pitch-Synchronous Synthesis Methods for the ROMVOX Text-to-Speech System 2439

Attila Ferencz, Radu Arsinte, *Istvan Nagy, Teodora Ratiu, Maria Ferencz, †Gavril Toderean, †Diana Zaiu, Tunde-Csilla Kovacs, Lajos Simon

Software ITC SA, Romania

*Music Academy Gh.Dima, Romania

†Technical Univ. of Cluj-Napoca, Romania

 

Th3D.3 The Bell Labs German Text-to-Speech System: An Overview 2443

Bernd Mobius, Richard Sproat, Jan P.H. van Santen, Joseph P. Olive

Bell Labs-Lucent Technologies, USA

 

 

Th3D.4 The Generation of Regional Pronunciations of English for Speech Synthesis 2447

Susan Fitt

Univ. of Edinburgh, UK

 

Th3D.5 Bell Laboratories Russian Text-To-Speech System 2451

Elena Pavlova, Yuri Pavlov, Richard Sproat, Chilin Shih, Jan P.H. van Santen

Bell Labs-Lucent Technologies, USA

 

Th3D.6 A Bilingual Text-To-Speech System in Spanish and Catalan 2455

Antonio Bonafonte, Ignasi Esquerra, Albert Febrer, Francesc Vallverdu

Universitat Politecnica de Catalunya, Spain

 

SESSION: Th4A

Pronunciation Models

Chair: Jean-Paul Haton, CRIN/CNRS-INRIA, France

 

Th4A.1 Automatic Rule-based Generation of Word Pronunciation Networks 2459

Nick Cremelie, Jean-Pierre Martens

Univ. of Gent, Belgium

 

Th4A.2 Creating User Defined New Vocabularies for Voice Dialing 2463

Jose Maria Elvira, Juan Carlos Torrecilla, Javier Caminero

Telefonica I+D, Spain

 

Th4A.3 Automatic Generation of Context-Dependent Pronunciations 2467

Mosur Ravishankar, Maxine Eskenazi

Carnegie Mellon Univ., USA

 

Th4A.4 Automatic Generation of a Pronunciation Dictionary Based on a Pronunciation Network 2471

Toshiaki Fukada, Yoshinori Sagisaka

ATR ITL, Japan

 

Th4A.5 What is Wrong with the Lexicon -- An Attempt to Model Pronunciations Probabilistically 2475

Uwe Jost, Henrik Heine, Gunnar Evermann

Hamburg Univ., Germany

 

Th4A.6 Lexical Tuning Based on Triphone Confidence Estimation 2479

Kevin L. Markey, *Wayne Ward

Berdy Medical Systems, USA

*Carnegie Mellon Univ., USA

 

SESSION: Th4B

Auditory Modelling and Psychoacoustics

Chair: William Ainsworth, Keele Univ., UK

 

Th4B.1 Improving of Amplitude Modulation Maps for F0-Dependent Segregation of Harmonic Sounds 2483

Frederic Berthommier, *Georg Meyer

ICP, INPG, France

*Univ. of Keele, UK

 

Th4B.2 Psychophysical Evaluation of PSOLA: Natural Versus Synthetic Speech 2487

Reinier Kortekaas, Armin Kohlrausch

IPO, The Netherlands

 

 

Th4B.3 Perception of Noised Words by Normal Children and Children with Speech and Language Impairments 2491

Valentina V. Lublinskaja, *Inna V. Koroleva, A.N. Kornev, Elena V. Iagounova

Pavlov Institute of Physiology, Russia

*Institute of Ear, Throat, Nose and Speech Pathology, Russia

 

Th4B.4 Modeling the Perception of Simultaneous Semi-Vowels 2495

Georg F. Meyer, William A. Ainsworth

Keele Univ., UK

 

Th4B.5 Properties of Auditory Model Representations 2499

Fernando Santos Perdigao, Luis V. Sá

Universidade de Coimbra, Portugal

 

Th4B.6 Impact of "Ascending Sequence" AI (Auditory Primary Cortex) Cells on Stop Consonant Perception 2503

Marta Eduardo Sa, de Sa Luis Vieira

Universidade de Coimbra, Portugal

 

SESSION: Th4C

Voice Conversion and Data Driven F0-Models

Chair: Yoshinori Sagisaka, ATR Interpret. Telecom. Res. Labs., Japan

 

Th4C.1 Application-Dependent Prosodic Models for Text-To-Speech Synthesis and Automatic Design of Learning Database Corpus Using Genetic Algorithm 2507

Olivier Boeffard, F. Emerard

France Telecom-CNET, France

 

Th4C.2 Combinatorial Issues in Text-To-Speech Synthesis 2511

Jan P.H. van Santen

Bell Labs-Lucent Technologies, USA

 

Th4C.3 Automatic Corpus-Based Training of Rules for Prosodic Generation in Text-To-Speech 2515

Eduardo Lopez-Gonzalo, Jose M. Rodriguez-Garcia, Luis Hernandez-Gomez, Juan M. Villar

ETSIT-UPM, Spain

 

Th4C.4 Hidden Markov Model Based Voice Conversion Using Dynamic Characteristics of Speaker 2519

Eun-Kyoung Kim, Sangho Lee, Yung-Hwan Oh

KAIST, Korea

 

Th4C.5 Speaker Interpolation in HMM-Based Speech Synthesis System 2523

Takayoshi Yoshimura, *Takashi Masuko, Keiichi Tokuda, *Takao Kobayashi, Tadashi Kitamura

Nagoya Institute of Technology, Japan

*Tokyo Institute of Technology, Japan

 

Th4C.6 Designing a Speaker Adaptable Formant-Based Text-To-Speech System 2527

Vassilios Darsinos, Dimitrios Galanis, George Kokkinakis

Univ. of Patras, Greece

 

 

SESSION: Th4D

Vocal Tract Analysis

Chair: Andrea Paoloni, Fondazione Ugo Bordoni, Italy

 

Th4D.1 On Using Fractal Features of Speech Sounds in Automatic Speech Recognition 2531

Petros Maragos, *Alexandros Potamianos

ILSP & Georgia Tech, Greece & USA

*AT&T Labs-Research, USA

 

Th4D.2 Dynamic Constraint Weighting in the Context of Articulatory Parameter Estimation 2535

Hywel B. Richards, John S. Bridle, Melvyn J. Hunt, *John S. Mason

Dragon Systems, UK

*Univ. of Wales Swansea, UK

 

Th4D.3 Estimation of Vocal Tract Front Cavity Resonance in Unvoiced Fricative Speech 2539

Minkyu Lee, Donald G. Childers

Univ. of Florida, USA

 

Th4D.4 A Software Tool to Study Portuguese Vowels 2543

Antonio Teixeira, Francisco Vaz, *Jose Carlos Principe

INESC, Portugal

*Univ. of Florida, USA

 

Th4D.5 Post-Synchronization Via Formant-to-Area Mapping of Asynchronously Recorded Speech Signals and Area Functions 2547

Jean Schoentgen, Sorin Ciocea

Univ. Libre de Bruxelles, Belgium

 

Th4D.6 Geometrically and Acoustically Optimized Codebook for Unique Mapping from Formants to Vocal-Tract Shape 2551

Zhenli L. Yu, P.C. Ching

The Chinese Univ. of Hong Kong, Hong Kong

 

SESSION: ThAA

Noise Mitigation, Speech Enhancement II

Chair: Bayya Yegnanarayana, IIT Madras, India

 

ThAA.1 Noisy Speech Enhancement by Fusion of Auditory and Visual Information: A Study of Vowel Transitions 2555

Laurent Girin, Gang Feng, Jean-Luc Schwartz

Univ. of Stendhal, France

 

ThAA.2 Spectral Subtraction Using a Non-Critically Decimated Discrete Wavelet Transform 2559

Andreas Engelberg, Thomas Gulzow

Univ. of Kiel, Germany

 

ThAA.3 Bayesian Affine Transformation of HMM Parameters for Instantaneous and Supervised Adaptation in Telephone Speech Recognition 2563

Jen-Tzung Chien, Hsiao-Chuan Wang, *Chin-Hui Lee

National Tsing Hua Univ., ROChina

*Bell Labs, USA

 

ThAA.4 Integrated Bias Removal Techniques for Robust Speech Recognition 2567

Craig Lawrence, Mazin Rahim

Univ. of Maryland, USA

 

 

ThAA.5 Acoustic Front Ends for Speaker-Independent Digit Recognition in Car Environments 2571

Detlev Langmann, Alexander Fischer, Friedhelm Wuppermann, Reinhold Haeb-Umbach, Thomas Eisele

Philips GmbH, Germany

 

ThAA.6 Signal Bias Removal Using the Multi-Path Stochastic Equalization Technique 2575

Lionel Delphin-Poulat, Chafic Mokbel

France Telecom, France

 

ThAA.7 Subband Echo Cancellation in Automatic Speech Dialog Systems 2579

Andrej Miksic, Bogomir Horvat

Univ. of Maribor, Slovenia

 

ThAA.8 Speech Enhancement Via Energy Separation 2583

Hesham Tolba, Douglas O'Shaughnessy

Univ. du Quebec, Canada

 

ThAA.9 A Method of Signal Extraction from Noisy Signal 2587

Masashi Unoki, Masato Akagi

Japan Advanced Institute of Science and Technology, Japan

 

ThAA.10 Multi-Channel Noise Reduction Using Wavelet Filter Bank 2591

Jiri Sika, Vratislav Davidek

Czech Technical Univ., Czech Republic

 

ThAA.11 Speech Signal Detection in Noisy Environment Using a Local Entropic Criterion 2595

Imad Abdallah, Silvio Montresor, Marc Baudry

Laboratoire d'Informatique de l'Univ. du Maine, France

 

ThAA.12 A New Algorithm for Robust Speech Recognition: The Delta Vector Taylor Series Approach 2599

Pedro J. Moreno, Brian Eberman

Digital Equipment Corp., USA

 

ThAA.13 Robust Enhancement of Reverberant Speech Using Iterative Noise Removal 2603

David Cole, Miles Moody, Sridha Sridharan

Queensland Univ. of Technology, Australia

 

ThAA.14 A Network Speech Echo Canceller with Comfort Noise 2607

David J. Jones, Scott D. Watson, Kenneth G. Evans, Barry M.G. Cheetham, *Robert A. Reeves

Univ. of Liverpool, UK

*BT Laboratories, UK

 

ThAA.15 A New Metric for Selecting Sub-Band Processing in Adaptive Speech Enhancement Systems 2611

Amir Hussain, Douglas R. Campbell, Thomas J. Moir

Univ. of Paisley, UK

 

ThAA.16 Estimation of LPC Cepstrum Vector of Speech Contaminated by Additive Noise and its Application to Speech Enhancement 2615

Hidefumi Kobatake, Hideta Suzuki

Tokyo Univ. of Agriculture & Technology, Japan

 

ThAA.17 Multi-Band and Adaptation Approaches to Robust Speech Recognition 2619

Sangita Tibrewala, Hynek Hermansky

Oregon Graduate Institute of Science and Technology, USA

 

 

ThAA.18 Non-Quadratic Criterion Algorithms for Speech Enhancement 2623

Enrique Masgrau, Eduardo Lleida, Luis Vicente

Universidad de Zaragoza, Spain

 

SESSION: ThAB

F0 and Duration Modelling, Spoken Language Processing

Chair: Richard Schwartz, BBN Systems and Techs, USA

 

ThAB.1 Modeling Segmental Duration with Multivariate Adaptive Regression Splines 2627

Marcel Riedi

ETH Zentrum TIK, Switzerland

 

ThAB.2 High Quality Speech Synthesis for Phonetic Speech Segmentation 2631

Fabrice Malfrere, Thierry Dutoit

Circuits Theory and Signal Processing Lab, Belgium

 

ThAB.3 Factors Affecting Perceived Quality and Intelligibility in the CHATR Concatenative Speech Synthesiser 2635

Nick Campbell, Itoh Yoshiharu, Wen Ding, Norio Higuchi

ATR Interpreting Telecommunications Res. Labs., Japan

 

ThAB.4 Reduced Lexicon Trees for Decoding in a MMI-Connectionist/HMM Speech Recognition System 2639

Christoph Neukirchen, Daniel Willett, Gerhard Rigoll

Gerhard-Mercator-Univ. Duisburg, Germany

 

ThAB.5 A Stochastic Model of Intonation for French Text-to-Speech Synthesis 2643

Jean Veronis, Philippe Di Cristo, Fabienne Courtois, Benoit Lagrue

Univ. de Provence & CNRS, France

 

ThAB.6 Phonetic Rules for a Phonetic-to-Speech System 2647

Angelien A. Sanderman, *René Collier

KPN Research, The Netherlands

*Institute for Perception Research, The Netherlands

 

ThAB.7 Multi-Lingual Duration Modeling 2651

Jan P.H. van Santen, Chilin Shih, Bernd Mobius, Evelyne Tzoukermann, Michael Tanenblatt

Bell Labs-Lucent Technologies, USA

 

ThAB.8 A Model of Segment (and Pause) Duration Generation for Brazilian Portuguese Text-to-Speech Synthesis 2655

Plinio A. Barbosa

State Univ. of Campinas, Brazil

 

ThAB.9 Parsing Strategy for Spoken Language Interfaces with a Lexicalized Tree Grammar 2659

Ariane Halber, David Roussel

Thomson-CSF, France

 

ThAB.10 What's in a Word Graph -- Evaluation and Enhancement of Word Lattices 2663

Jan W. Amtrup, Henrik Heine, Uwe Jost

Hamburg Univ., Germany

 

 

ThAB.11 Accelerated DP Based Search for Statistical Translation 2667

Christoph Tillmann, Stefan Vogel, Hermann Ney, A. Zubiaga, H. Sawaf

RWTH Aachen, Germany

 

ThAB.12 Use of Pitch Pattern Improvement in the CHATR Speech Synthesis System 2671

Ken Fujisawa, Toshio Hirai, Norio Higuchi

ATR ITL, Japan

 

ThAB.13 Generating Segment Durations in a Text-to-Speech System: A Hybrid Rule-Based/Neural Network Approach 2675

Gerald Corrigan, Noel Massey, Orhan Karaali

Motorola, USA

 

ThAB.14 On the Global F0 Shape Model Using a Transition Network for Japanese Text-To-Speech Systems 2679

Yasushi Ishikawa, Takashi Ebihara

Mitsubishi Electric Corporation, Japan

 

ThAB.15 An Alternative and Flexible Approach in Robust Information Retrieval Systems 2683

Jose Colas, Juan M. Montero, Javier Ferreiros, Jose M. Pardo

Universidad Politecnica de Madrid, Spain

 

ThAB.16 A Probabilistic Approach to Analogical Speech Translation 2687

Keiko Horiguchi, Alexander Franz

Sony, Japan

 

ThAB.17 Dynamic Lexicon for a Very Large Vocabulary Vocal Dictation 2691

Marie-Jose Caraty, Claude Montacie, Fabrice Lefèvre

Univ. Pierre et Marie Curie - CNRS, France

 

SESSION: ThAC

Language Modelling

Chair: Ronald Rosenfeld, Carnegie Mellon Univ., USA

 

ThAC.1 Construction of Language Models Using the Morphic Generator Grammatical Inference (MGGI) Methodology 2695

Encarna Segarra, Luis Hurtado

Universidad Politecnica de Valencia, Spain

 

ThAC.2 An Integrated Language Modeling with N-Gram Model and WA Model for Speech Recognition 2699

Shuwu Zhang, Taiyi Huang

Chinese Academy of Sciences, China

 

ThAC.3 Statistical Analysis of Dialogue Structure 2703

Ye-Yi Wang, Alex Waibel

Carnegie Mellon Univ., USA

 

ThAC.4 Statistical Language Modeling Using the CMU-Cambridge Toolkit 2707

Philip Clarkson, *Ronald Rosenfeld

Cambridge Univ., UK

*Carnegie Mellon Univ., USA

 

ThAC.5 Text Normalization and Speech Recognition in French 2711

Gilles Adda, Martine Adda-Decker, Jean-Luc Gauvain, Lori Lamel

LIMSI, France

 

ThAC.6 A Novel Tree-Based Clustering Algorithm for Statistical Language Modeling 2715

Geraldine Damnati, Jacques Simonin

France Telecom, France

 

ThAC.7 Variable-Length Language Modeling Integrating Global Constraints 2719

Shoichi Matsunaga, Shigeki Sagayama

NTT, Japan

 

ThAC.8 An Hybrid Language Model for Continuous Dictation Prototype 2723

Kamel Smaili, Imed Zitouni, Francois Charpillet, Jean-Paul Haton

CRIN-CNRS & INRIA, Lorraine, France

 

ThAC.9 Dealing with Pronunciation Variants at the Language Model Level for the Continuous Automatic Speech Recognition of French 2727

Laure Pousse, Guy Perennou

IRIT - Equipe IHMPT, France

 

ThAC.10 Rational Interpolation of Maximum Likelihood Predictors in Stochastic Language Modeling 2731

Ernst Günter Schukat-Talamazzini, *Florian Gallwitz, *Stefan Harbeck, *Volker Warnke

Univ. of Jena, Germany

*Univ. of Erlangen, Germany

 

ThAC.11 N-Gram Language Model Adaptation Using Small Corpus for Spoken Dialog Recognition 2735

Akinori Ito, Hideyuki Saitoh, Masaharu Katoh, Masaki Kohda

Yamagata Univ., Japan

 

ThAC.12 Variable N-Gram Language Modeling and Extensions for Conversational Speech 2739

Man-Hung Siu, *Mari Ostendorf

BBN Inc, USA

*Boston Univ., USA

ThAC.13 Fuzzy Class Rescoring: A Part-of-Speech Language Model 2743

Petra Geutner

Univ. of Karlsruhe, Germany

 

ThAC.14 Speech Understanding Based on Integrating Concepts By Conceptual Dependency 2747

Akito Nagai, Yasushi Ishikawa

Mitsubishi Electric Corporation, Japan

 

ThAC.15 Dynamic Language Models for Interactive Speech Applications 2751

Fabio Brugnara, Marcello Federico

Istituto per la Ricerca Scientifica e Tecnologica (IRST), Italy

 

ThAC.16 Large-Scale Lexical Semantics for Speech Recognition Support 2755

George Demetriou, Eric Atwell, Clive Souter

Univ. of Leeds, UK

 

ThAC.17 Integration of Grammar and Statistical Language Constraints for Partial Word-Sequence Recognition 2759

Hajime Tsukada, Hirofumi Yamamoto, Yoshinori Sagisaka

ATR Interpreting Telecommunications Res. Labs., Japan

 

 

ThAC.18 Using Intonation to Constrain Language Models in Speech Recognition 2763

Paul Taylor, Simon King, Stephen Isard, Helen Wright, Jacqueline Kowtko

Univ. of Edinburgh, UK

 

ThAC.19 Incorporating POS Tagging into Language Modeling 2767

Peter A. Heeman, *James F. Allen

France Telecom, France

*Univ. of Rochester, USA

 

ThAC.20 Confidence Metrics Based on N-Gram Language Model Backoff Behaviors 2771

Carl Uhrik, *Wayne Ward

Berdy Medical Systems, USA

*Carnegie Mellon Univ., USA

 

ThAC.21 Structure and Performance of a Dependency Language Model 2775

Ciprian Chelba, *David Engle, Frederick Jelinek, †Victor M. Jimenez, Sanjeev Khudanpur, Lidia Mangu, ¤Harry Printz, **Eric Ristad, ††Ronald Rosenfeld, ‡‡Andreas Stolcke, ¤¤Dekai Wu

Johns Hopkins Univ., USA

*Dept. of Defense, Fort Meade, MD, USA

†Universidad Politécnica de Valencia, Spain

¤IBM, USA

**Princeton Univ., USA

††Carnegie Mellon Univ., Pittsburgh, PA, USA

‡‡SRI International, USA

¤¤Hong Kong Tech University, Hong Kong

 

ThAC.22 Modeling Linguistic Segment and Turn Boundaries for N-Best Rescoring of Spontaneous Speech 2779

Andreas Stolcke

SRI International, USA

 

ThAC.23 Hybrid Language Models: Is Simpler Better? 2783

Peter E. Kenne, Mary O'Kane

Univ. of Adelaide, Australia

 

ThAC.24 Internal and External Tagsets in Part-of-Speech Tagging 2787

Thorsten Brants

Univ. of the Saarland, Germany

 

SESSION: ThAD

Auditory Modelling and Psychoacoustics, Neural Networks for Speech Processing and Recognition

Chair: Phil D. Green, Univ. of Sheffield, UK

 

ThAD.1 A Probabilistic Model of Double-Vowel Segregation 2791

Laurent Varin, Frederic Berthommier

ICP, INPG, France

 

ThAD.2 Stimulus Signal Estimation From Auditory-Neural Transduction Inverse Processing 2795

Houshang Habibzadeh Vaneghi, Shigeyoshi Kitazawa

Shizuoka Univ., Japan

 

 

ThAD.3 FDVQ Based Keyword Spotter Which Incorporates A Semi-Supervised Learning for Primary Processing 2799

Chakib Tadj, Pierre Dumouchel, *Franck Poirier

Ecole de Technologie Superieure, Canada

*Institut Universitaire Professionnalise, France

 

ThAD.4 The Initial Time Span of Auditory Processing Used for Speaker Attribution of the Speech Signal 2803

Valentina V. Lublinskaja, *Christian Sappok

Pavlov Institute of Physiology, Russia

*Ruhr Universität, Germany

 

ThAD.5 Sparse Connection and Pruning in Large Dynamic Artificial Neural Networks 2807

Nikko Strom

KTH, Sweden

 

ThAD.6 A Modular Initialization Scheme for Better Speech Recognition Performance Using Hybrid Systems of MLPs/HMMs 2811

Roxana Teodorescu, Dirk Van Compernolle, Ioannis Dologlou

K.U Leuven-ESAT, Belgium

 

ThAD.7 Lateralization for Auditory Perception of Foreign Words 2815

Tatiana Chernigovskaya

Russian Academy of Sciences, Russia

 

ThAD.8 The Structural Weighted Sets Method for Continuous Speech and Text Recognition 2819

Yuri Kosarev, Pavel Jarov, Alexander Osipov

Russian Academy of Sciences, Russia

 

ThAD.9 Lateral Inhibitory Networks for Auditory Processing 2823

Christian J. Sumner, Duncan F. Gillies

Imperial College, UK

 

ThAD.10 Missing Fundamentals: A Problem of Auditory or Mental Processing? 2827

Henning Reetz

Univ. of Konstanz, Germany

 

 

ThAD.11 Predictive Neural Networks Applied to Phoneme Recognition 2831

Felix Freitag, Enric Monte, Josep M. Salavedra

Polytechnic University of Catalunya, Spain

 

ThAD.12 Empirical Comparison of Two Multilayer Perceptron-Based Keyword Speech Recognition Algorithms 2835

Suhardi, *Klaus Fellbaum

Technical Univ. of Berlin, Germany

*Brandenburg Technical Univ. of Cottbus, Germany

 

ThAD.13 Segment Boundary Estimation Using Recurrent Neural Networks 2839

Toshiaki Fukada, *Sophie Aveline, Mike Schuster, Yoshinori Sagisaka

ATR Interpreting Telecommunications Res. Labs., Japan

*ENST, France

 

ThAD.14 Incorporation of HMM Output Constraints in Hybrid NN/HMM Systems During Training 2843

Mike Schuster

ATR ITL, Japan

 

ThAD.15 Principles of the Hearing Periphery Functioning in New Methods of Pitch Detection and Speech Enhancement 2847

Ludmila Babkina, *Sergey Koval, Alexander Molchanov

Research Institute of Ear, Nose, Throat and Speech Disorders, Russia

*Speech Technology Centre, Russia

 

ThAD.16 The Locus of the Syllable Effect: Prelexical or Lexical? 2851

Christine Meunier, *Alain Content, Uli H. Frauenfelder, †Ruth Kearns

Univ. of Geneva, Switzerland

*Univ. Libre de Bruxelles, Belgium

†Medical Research Council, UK

 

ThAD.17 On Not Remembering Disfluencies 2855

Ellen Gurman Bard, Robin J. Lickley

Univ. of Edinburgh, UK

 

ThAD.18 Using an Auditory Model and Leaky Autocorrelators to Tune In to Speech 2859

Tjeerd Andringa

Univ. of Groningen, The Netherlands