Keynote Speech 1: Is Syntactic Structure Prosodically Recoverable? KN-1
Speaker: Mario Rossi, Institut de Phonetique d'Aix-en-Provence, Laboratoire Parole et Langage, France
Chair: Louis Pols, ESCA, Univ. of Amsterdam, The Netherlands
Keynote Speech 2: Conversational Interfaces: Advances and Challenges KN-9
Speaker: Victor Zue, MIT, USA
Chair: Paul Dalsgaard, Aalborg University, Denmark
Keynote Speech 3: Prosodic Modelling in Text-to-Speech Synthesis KN-19
Speaker: Jan P. H. van Santen, Bell Labs-Lucent Technologies, USA
Chair: Hiroya Fujisaki, Science Univ. of Tokyo, Japan
Keynote Speech 4: Robust Speech Recognition: Review and Perspectives
Chair: Joseph Mariani, LIMSI-CNRS, France
Speech 4A (8:35-9:00): Impact of the Unknown Communication Channel on Automatic Speech Recognition: A Review KN-29
Speaker: Jean-Claude Junqua, Panasonic Technologies Inc., USA
Speech 4B (9:00-9:25): Statistical Techniques for Robust ASR: Review and Perspectives KN-33
Speaker: Jerome Bellegarda, Apple Computer, USA
Speech 4C (9:25-9:50): Using Missing Feature Theory to Actively Select Features for Robust Speech Recognition with Interruptions, Filtering and Noise KN-37
Speaker: Richard Lippmann, Lincoln Laboratory MIT, USA
Keynote Speech 5: Perspectives of Speech Technology Research Highlighted in Eurospeech '97
Speaker: Wolfgang Hess, Univ. of Bonn, Germany
Chair: George Kokkinakis, WCL, Univ. of Patras, Greece
SESSION: M4A
Acoustic Modelling I
Chair: Roger Moore, DRA, UK
M4A.1 Using Multiple Time Scales in a Multi-Stream Speech Recognition System 3
Stephane Dupont, *Hervé Bourlard
FPMs-TCTS, Belgium
*IDIAP, Switzerland
M4A.2 Speech Recognition Using HMM-State Confusion Characteristics 7
Yumi Wakita, Harald Singer, Yoshinori Sagisaka
ATR Interpreting Telecommunications Res. Labs., Japan
M4A.3 Bottom-up and Top-down State Clustering for Robust Acoustic Modeling 11
Cristina Chesta, Pietro Laface, *Franco Ravera
Politecnico di Torino, Italy
*Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy
M4A.4 Comparison of Optimization Methods for Discriminative Training Criteria 15
Ralf Schluter, W. Macherey, S. Kanthak, Hermann Ney, Lutz Welling
RWTH Aachen, Germany
M4A.5 Clustering Beyond Phoneme Contexts for Speech Recognition 19
Clark Z. Lee, Douglas O'Shaughnessy
INRS Telecommunications, Canada
M4A.6 Influence of Outliers in Training the Parametric Trajectory Models for Speech Recognition 23
Rathinavelu Chengalvarayan
Bell Labs-Lucent Technologies, USA
SESSION: M4B
Dynamic Articulatory Measurements
Chair: René Carre, ENST, France
M4B.1 Adaptation of Natural Articulatory Movements to the Control of the Command Parameters of a Production Model 27
Laurence Candille, Henri Meloni
CERI, France
M4B.2 Three-Dimensional Coarticulatory Strategies of Tongue Movement 31
Maureen Stone, *Andrew Lundberg, *Edward Davis, Rao Gullapalli, Moriel NessAiver
Univ. of Maryland, USA
*Johns Hopkins Univ., USA
M4B.3 From Laryngographic and Acoustic Signals to Voicing Gestures 35
Nathalie Parlangeau, Regine André-Obrecht
IRIT - Equipe IHMPT, France
M4B.4 Ultrasonographic Measurement of Cricothyroid Space in Speech 39
Erkki Vilkman, Raija Takalo, Taisto Maatta, *Anne-Maria Laukkanen, Jaana Nummenranta, Tero Lipponen
Univ. of Oulu, Finland
*Univ. of Tampere, Finland
M4B.5 Coarticulation and Articulatory Compensations Studied by Dynamic MRI 43
Didier Demolin, M. George, V. Lecuit, *T. Metens, ‡A. Soquet, †H. Raeymaekers
Univ. Libre de Bruxelles, Belgium
*Magnetic Resonance Unit, Hopital Erasme
†Philips Medical Systems, Belgium
‡Collaborateur Scientifique-FNRS
M4B.6 Determining Tongue Articulation: From Discrete Fleshpoints to Continuous Shadow 47
Pierre Badin, Enrico Baricchi, Anne Vilain
Institut de la Communication Parlée, France
SESSION: M4C
Language Identification
Chair: Marc Zissman, MIT Lincoln Laboratory, USA
M4C.1 Predicting, Diagnosing and Improving Automatic Language Identification Performance 51
Marc A. Zissman
MIT Lincoln Laboratory, USA
M4C.2 Language Identification with Language-Independent Acoustic Models 55
Cristobal Corredor-Ardoy, Jean Luc Gauvain, Martine Adda-Decker, Lori Lamel
LIMSI, France
M4C.3 Bayesian Methods for Language Verification 59
Eluned S. Parris, Harvey Lloyd-Thomas, Michael Carey, *Jerry H. Wright
Ensigma Ltd, UK
*Bristol Univ., UK
M4C.4 Use of Recurrent Network for Unknown Language Rejection in Language Identification System 63
Hingkeung Kwan, Keikichi Hirose
Univ. of Tokyo, Japan
M4C.5 Language-Identification Based on Cross-Language Acoustic Models and Optimized Information Combination 67
Ove Andersen, Paul Dalsgaard
Aalborg Univ., Denmark
M4C.6 Phonetic-Context Mapping in Language Identification 71
Jiri Navratil, Werner Zuehlke
Technische Univ. Ilmenau, Germany
SESSION: M4D
Neural Networks for Speech and Language Processing
Chair: Wolfgang Hess, Univ. of Bonn, Germany
M4D.1 Discriminative Feature and Model Design for Automatic Speech Recognition 75
Mazin Rahim, Yoshua Bengio, Yann LeCun
AT&T Labs-Research, USA
M4D.2 Large Vocabulary Speech Recognition with Context Dependent MMI-Connectionist/HMM Systems Using the WSJ Database 79
Jörg Rottland, Christoph Neukirchen, Daniel Willett, Gerhard Rigoll
Gerhard-Mercator-Univ. Duisburg, Germany
M4D.3 Automatic Selection of Segmental Acoustic Parameters by Means of Neural-Fuzzy Networks for Reordering in N-best HMM Hypotheses 83
Thierry Moudenc, Guy Mercier
France Telecom, France
M4D.4 Comparison Results for Segmental Training Algorithms for Mixture Density HMMS 87
Mikko Kurimo
Helsinki Univ. of Technology, Finland
M4D.5 A Connectionist Approach to Machine Translation 91
Asuncion Castano, *Francisco Casacuberta
Universitat Jaume I, Spain
*Univ. Politecnica de Valencia, Spain
M4D.6 Continuous Speech Recognition Using a Context Sensitive ANN and HMM2s 95
Nicolas Pican, Jean-Francois Mari, Dominique Fohr
CRIN-CNRS & INRIA Lorraine, France
SESSION: MAA
Training Techniques, Efficient Decoding in ASR
Chair: Jerome Bellegarda, Apple Computer, USA
MAA.1 Acoustic Modeling Based on the MDL Principle for Speech Recognition 99
Koichi Shinoda, Takao Watanabe
NEC Corporation, Japan
MAA.2 Discriminative Utterance Verification Using Multiple Confidence Measures 103
Piyush Modi, Mazin Rahim
AT&T, USA
MAA.3 Subspace Distribution Clustering for Continuous Observation Density Hidden Markov Models 107
Enrico Bocchieri, Brian Mak
AT&T Labs-Research, USA
MAA.4 A Comparative Study of Methods for Phonetic Decision-Tree State Clustering 111
H.J. Nock, M.J.F. Gales, Steve Young
Cambridge Univ., UK
MAA.5 Comparing Gaussian and Polynomial Classification in SCHMM-Based Recognition Systems 115
Alfred Kaltenmeier, Jurgen Franke
Daimler Benz AG, Germany
MAA.6 Maximum Likelihood Successive State Splitting Algorithm for Tied-Mixture HMNET 119
Alexandre Girardi, Harald Singer, Kiyohiro Shikano, Satoshi Nakamura
Nara Institute of Science and Technology, Japan
MAA.7 String-Level MCE for Continuous Phoneme Recognition 123
Erik McDermott, Shigeru Katagiri
ATR Interpreting Telecommunications Res. Labs., Japan
MAA.8 HMM State Clustering Across Allophone Class Boundaries 127
Ze'ev Rivlin, Ananth Sankar, Harry Bratt
SRI International, USA
MAA.9 Weighted Determinization and Minimization for Large Vocabulary Speech Recognition 131
Mehryar Mohri, Michael Riley
AT&T Labs-Research, USA
MAA.10 Parallel Speech Recognition 135
Steven Phillips, Anne Rogers
AT&T Labs-Research, USA
MAA.11 Fast Likelihood Computation Methods for Continuous Mixture Densities in Large Vocabulary Speech Recognition 139
Stefan Ortmanns, Thorsten Firzlaff, Hermann Ney
RWTH Aachen Univ. of Technology, Germany
MAA.12 A Static Lexicon Network Representation for Cross-Word Context Dependent Phones 143
Kris Demuynck, Jacques Duchateau, Dirk Van Compernolle
KU Leuven-ESAT, Belgium
MAA.13 Decision-Tree Based Quantization of the Feature Space of a Speech Recognizer 147
Mukund Padmanabhan, L.R. Bahl, D. Nahamoo, Peter de Souza
IBM, USA
MAA.14 Sub-Vector Clustering to Improve the Memory and Speed Performance of the Acoustic Likelihood Computation 151
Mosur Ravishankar, *R. Bisiani, E. Thayer
Carnegie Mellon Univ., USA
*Univ. of Milan, Italy
MAA.15 The Incorporation of Path Merging in a Dynamic Network Recogniser 155
Simon Hovell
BT Laboratories, UK
MAA.16 Improvement on Connected Digits Recognition Using Duration Constraints in the Asynchronous Decoding Scheme 159
Miroslav Novak
IBM, USA
MAA.17 Explicit Word Error Minimization in N-Best List Rescoring 163
Andreas Stolcke, Yochai Konig, Mitchel Weintraub
SRI International, USA
MAA.18 Efficient 2-Pass N-Best Decoder 167
Long Nguyen, Richard Schwartz
BBN Systems and Technologies, USA
MAA.19 A Memory Management Method for a Large Word Network 171
Tomohiro Iwasaki, Yoshiharu Abe
Mitsubishi Electric Corporation, Japan
SESSION: MAB
Prosody
Chair: Nick Campbell, ATR, Japan
MAB.1 Persistence of Prosodic Features Between Dialectal and Standard Italian Utterances in Six Sub-Varieties of a Region of Southern Italy (Salento): First Assessments of the Results of a Recognition Test and an Instrumental Analysis 175
Antonio Romano
Univ. Stendhal, France
MAB.2 Improving the Phonetic Annotation by Means of Prosodic Phrasing 179
Halewijn Vereecken, *Annemie Vorstermans, Jean-Pierre Martens, *Bert Van Coile
Univ. of Gent, Belgium
*L&H, Belgium
MAB.3 A Descriptive Study of Prosodic Phenomena in MPUR (West Papuan Phylum) 183
Cecilia Ode
Leiden Univ., The Netherlands
MAB.4 Automated Quantitative Analysis of F0 Contours of Utterances From a German ToBI-Labeled Speech Database 187
Hansjörg Mixdorff, *Hiroya Fujisaki
TU Dresden, Germany
*Science Univ. of Tokyo, Japan
MAB.5 Identification and Automatic Generation of Prosodic Contours for Text-to-Speech Synthesis System in French 191
Stephanie de Tournemire
France Telecom, France
MAB.6 Quantitative Analysis and Formulation of Tone Concatenation in Chinese F0 Contours 195
Jin-Fu Ni, Ren-Hua Wang, *Keikichi Hirose
Univ. of Science and Technology of China, China
*Univ. of Tokyo, Japan
MAB.7 An Environment for the Labelling and Testing of Melodic Aspects of Speech 199
Christel Brindopke, Arno Pahde, Franz Kummert, Gerhard Sagerer
Univ. of Bielefeld, Germany
MAB.8 PROPAUSE: A Syntactico-Prosodic System Designed to Assign Pauses 203
David Casacuberta, Lourdes Aguilar, Rafael Marin
Universitat Autonoma de Barcelona, Spain
MAB.9 Integrated Dialog Act Segmentation and Classification Using Prosodic Features and Language Models 207
Volker Warnke, *Ralf Kompe, Heinrich Niemann, Elmar Noeth
Univ. of Erlangen-Nürnberg, Germany
*Sony International (Europe), Germany
MAB.10 Evaluation of Prosodic Characteristics in Retold Stories in Dutch by Means of Semantic Scales 211
Monique E. van Donzel, Florien J. Koopmans-van Beinum
Univ. of Amsterdam, The Netherlands
MAB.11 Text-to-Intonation in Spontaneous Swedish 215
Gosta Bruce, Marcus Filipsson, Johan Frid, *Björn Granström, *Kjell Gustafson, Merle Horne, David House
Lund Univ., Sweden
*KTH, Sweden
MAB.12 Synthesising Attitudes with Global Rhythmic and Intonation Contours 219
Yann Morlec, Gerard Bailly, Veronique Auberge
Institut de la Communication Parlée, France
MAB.13 Prosody-Particle Pairs as Discourse Control Signs 223
Dafydd Gibbon, Claudia Sassen
Univ. of Bielefeld, Germany
MAB.14 Focus Detection with Additional Information of Phrase Boundaries and Sentence Mode 227
Anja Elsner
Univ. of Bonn, Germany
MAB.15 The Role of Prosody in Infants' Native-Language Discrimination Abilities: The Case of Two Phonologically Close Languages 231
Laura Bosch, Nuria Sebastian-Galles
Universitat de Barcelona, Spain
MAB.16 Prosodic Cycles and Interpersonal Synchrony in American English and Swedish 235
Eugene H. Buder, *Anders Eriksson
Univ. of Memphis, USA
*Umeå Univ., Sweden
MAB.17 Relating Prosody to Syntax: Boundary Signalling in Swedish 239
Eva Strangert
Umeå Univ., Sweden
MAB.18 On Representation of Fundamental Frequency of Speech for Prosody Analysis Using Reliability Function 243
Mitsuru Nakai, Hiroshi Shimodaira
Japan Advanced Institute of Science and Technology, Japan
MAB.19 Efficient Method of Establishing Words Tone Dictionary for Korean TTS System 247
Seong-hwan Kim, Jin-young Kim
Chonnam National Univ., South Korea
MAB.20 Perception of Questions and Statements in Neapolitan Italian 251
Mariapaola D’Imperio, *David House
Ohio State University, USA
*Lund Univ., Sweden
SESSION: T1A
Keyword and Topic Spotting
Chair: Joseph Mariani, LIMSI-CNRS, France
T1A.1 Key-Phrase Spotting Using An Integrated Language Model of N-Grams and Finite-State Grammar 255
Qiguang Lin, Dave Lubensky, Michael Picheny, P. Srinivasa Rao
IBM, USA
T1A.2 Efficient Methods for Detecting Keywords in Continuous Speech 259
Jochen Junkawitsch, *Günther Ruske, Harald Höge
Siemens AG, Germany
*Munich Univ. of Technology, Germany
T1A.3 Providing Sublexical Constraints for Word Spotting Within the ANGIE Framework 263
Raymond Lau, Stephanie Seneff
MIT Laboratory for Computer Science, USA
T1A.4 Usefulness of Phonetic Parameters in a Rejection Procedure of an HMM Based Speech Recognition System 267
Katarina Bartkova, Denis Jouvet
France Telecom, France
T1A.5 Keyword Spotting Using F0 Contour Matching 271
Yoichi Yamashita, *Riichiro Mizoguchi
Ritsumeikan Univ., Japan
*Osaka Univ., Japan
T1A.6 A Frame and Segment Based Approach for Topic Spotting 275
Elmar Noeth, Stefan Harbeck, Heinrich Niemann, Volker Warnke
Univ. of Erlangen-Nürnberg, Germany
SESSION: T1B
Robustness in Recognition and Signal Processing I
Chair: Helmut Mangold, Daimler Benz, Germany
T1B.1 Cyclic Autocorrelation-Based Linear Prediction Analysis of Speech 279
K.K. Paliwal, Yoshinori Sagisaka
ATR Interpreting Telecommunications Res. Labs., Japan
T1B.2 Novel Filler Acoustic Models for Connected Digit Recognition 283
Ilija Zeljkovic, Shrikanth Narayanan
AT&T Labs-Research, USA
T1B.3 A Non-Iterative Model-Adaptive E-CMN/PMC Approach for Speech Recognition in Car Environments 287
Makoto Shozakai, Satoshi Nakamura, Kiyohiro Shikano
Nara Institute of Science and Technology, Japan
T1B.4 Discriminative Feature Extraction for Speech Recognition in Noise 291
Angel de la Torre, Antonio M. Peinado, Antonio J. Rubio, Pedro Garcia
Universidad de Granada, Spain
T1B.5 Noise Robust Recognition Using Feature Selective Modeling 295
Michael K. Brendborg, Borge Lindberg
Aalborg Univ., Denmark
T1B.6 Mixture Input Transformations for Adaptation of Hybrid Connectionist Speech Recognizers 299
Victor Abrash
SRI International, USA
SESSION: T1C
Modelling of Prosody
Chair: Hiroya Fujisaki, Science Univ. of Tokyo, Japan
T1C.1 Metrical Representation of Demarcation and Constituency in Long Noun Phrases 303
Christos Malliopoulos, *George Mikros
National Technical Univ. of Athens, Greece
*ILSP, Greece
T1C.2 A System of Stylized Intonation Contours for German 307
Hannes Pirker, *Kai Alter, Erhard Rank, John Matiasek, Harald Trost, †Gernot Kubin
Austrian Research Institute for Artificial Intelligence (OFAI), Austria
*Max-Planck-Institute of Cognitive Neuroscience, Germany
†Vienna Univ. of Technology, Austria
T1C.3 A Method of Representing Fundamental Frequency Contours of Japanese Using Statistical Models of Moraic Transition 311
Keikichi Hirose, Kouji Iwano
Univ. of Tokyo, Japan
T1C.4 Modeling Arbitrarily Long Sentence-Spanning F0 Contours by Parametric Concatenation of Word-Spanning Patterns 315
Stavroula-Evita Fotinea, *Michael Vlahakis, *George Carayannis
National Technical Univ. of Athens, Greece
*ILSP, Greece
T1C.5 Strong Interaction Between Factors Influencing Consonant Duration 319
R. J. J. H. van Son, *Jan P.H. van Santen
Univ. of Amsterdam, The Netherlands
*Bell Labs-Lucent Technologies, USA
T1C.6 Speech Timing in Slovenian TTS 323
Jerneja Gros, Nikola Pavesic, France Mihelic
Univ. of Ljubljana, Slovenia
SESSION: T1D
Microphone Arrays for Speech Enhancement
Chair: Alan Bradley, RMIT, Australia
T1D.1 Small Microphone Arrays with Optimized Directivity for Speech Enhancement 327
Matthias Dorbecker
Aachen Univ. of Technology, Germany
T1D.2 Microphone Array Design Measures for Hands-Free Speech Recognition 331
Masaaki Inoue, Satoshi Nakamura, Takeshi Yamada, Kiyohiro Shikano
Nara Institute of Science and Technology, Japan
T1D.3 Noise Reduction by Paired Microphones 335
Masato Akagi, Mitsunori Mizumachi
Japan Advanced Institute of Science and Technology, Japan
T1D.4 A Microphone Array for Speech Enhancement Using Multiresolution Wavelet Transform 339
Djamila Mahmoudi
LTS-DE, EPFL, Switzerland
T1D.5 A Two-Channel Adaptive Microphone Array with Target Tracking 343
Yoshifumi Nagata, Hiroyuki Tsuboi
Toshiba Corporation, Japan
T1D.6 Use of Different Microphone Array Configurations for Hands-Free Speech Recognition in Noisy and Reverberant Environment 347
Diego Giuliani, Marco Matassoni, Maurizio Omologo, Piergiorgio Svaizer
Istituto per la Ricerca Scientifica e Tecnologica (IRST), Italy
SESSION: T2A
Multilingual Recognition
Chair: Richard Lippmann, MIT Lincoln Lab., USA
T2A.1 YINHE: A Mandarin Chinese Version of the Galaxy System 351
Chao Wang, James R. Glass, Helen Meng, Joe Polifroni, Stephanie Seneff, Victor Zue
MIT Laboratory for Computer Science, USA
T2A.2 Multilingual Speech Recognition for Flexible Vocabularies 355
Patrizia Bonaventura, *Filippo Gallocchio, †Giorgio Micca
Centro Studi e Laboratori Telecomunicazioni (CSELT), Consultant, Italy
†Centro Studi e Laboratori Telecomunicazioni (CSELT), Turin, Italy
*Univ. di Padova, Italy
T2A.3 A Study of Multilingual Speech Recognition 359
Fuliang Weng, Harry Bratt, Leonardo Neumeyer, Andreas Stolcke
SRI International, USA
T2A.4 Multilingual Speech Recognition: The 1996 Byblos Callhome System 363
Jayadev Billa, Kristine Ma, John W. McDonough, George Zavaliagkos, David R. Miller, Kenneth N. Ross, Amro El-Jaroudi
BBN Systems and Technologies, USA
T2A.5 Japanese LVCSR on the Spontaneous Scheduling Task with JANUS-3 367
Tanja Schultz, Detlef Koll, Alex Waibel
Univ. Karlsruhe, Germany
T2A.6 Fast Bootstrapping of LVCSR Systems with Multilingual Phoneme Sets 371
Tanja Schultz, Alex Waibel
Univ. Karlsruhe, Germany
SESSION: T2B
Language Specific Speech Analysis
Chair: John J. Ohala, Univ. of California, USA
T2B.1 Factors of Variation in the Production of the German Dorsal Fricative 375
Bernd Pompino-Marschall, Christine Mooshammer
Center for General Linguistics, Berlin, Germany
T2B.2 EPG and Aerodynamic Evidence for the Coproduction and Coarticulation of Clicks in ISIZULU 379
Kimberly Thomas
Univ. of California, USA
T2B.3 Formant Trajectory Dynamics in Swabian Diphthongs 383
Anja Geumann
Muenchen Univ., Germany
T2B.4 The Gestural Organization of Vowels and Consonants: A Cinefluorographic Study of Articulator Gestures in Greenlandic 387
Sidney A.J. Wood
Univ. of Lund, Sweden
T2B.5 The Perception of Coronals in Western Arrernte 389
Victoria B. Anderson
Univ. of California, USA
T2B.6 Acoustic Modelling of American English /r/ 393
Carol Y. Espy-Wilson, *Shrikanth Narayanan, Suzanne E. Boyce, †Abeer Alwan
Boston Univ., USA
*AT&T Labs-Research, USA
†Univ. of California, USA
SESSION: T2C
Feature Estimation I
Chair: Paul Dalsgaard, Aalborg University, Denmark
T2C.1 Acoustic Parameters Optimised for Recognition of Phonetic Features 397
Anya Varnich Hansen
Aalborg Univ., Denmark
T2C.2 Heterogeneous Acoustic Measurements for Phonetic Classification 401
Andrew K. Halberstadt, James R. Glass
MIT, USA
T2C.3 Cepstral-Time Matrices and LDA for Improved Connected Digit and Sub-Word Recognition Accuracy 405
Ben Milner
BT Laboratories, UK
T2C.4 Data-Driven Design of Rasta-Like Filters 409
Sarel van Vuuren, *Hynek Hermansky
Oregon Graduate Institute of Science and Technology, USA
*International Computer Science Institute, USA
T2C.5 Evaluating Feature Set Performance Using the F-Ratio and J-Measures 413
Simon Nicholson, *Ben Milner, Stephen Cox
UEA, UK
*BT Laboratories, UK
T2C.6 Robust Speech Parameters Located in the Frequency Domain 417
Javier Hernando, Climent Nadeu
Universitat Politecnica de Catalunya, Spain
SESSION: T2D
Speech Coding I
Chair: Isabel Trancoso, INESC, Portugal
T2D.1 A Simple and Efficient Algorithm for the Compression of MBROLA Segment Databases 421
Olivier van der Vrecken, Nicolas Pierret, Thierry Dutoit, Vincent Pagel, Fabrice Malfrere
TCTS Lab, Belgium
T2D.2 A Segmental Formant Vocoder Based On Linearly Varying Mixtures of Gaussians 425
Parham Zolfaghari, Tony A. Robinson
Cambridge Univ., UK
T2D.3 Voice Mimic System Using Articulatory Codebook for Estimation of Vocal Tract Shape 429
Samir Chennoukh, Daniel Sinder, Gael Richard, James Flanagan
CAIP Center, Rutgers Univ., USA
T2D.4 Adaptive Transform Coding for Linear Predictive Residual 433
Damith J. Mudugamuwa, Alan B. Bradley
RMIT, Australia
T2D.5 Performance Evaluation of Objective Quality Measures for Coded Speech 437
Akira Takahashi, Nobuhiko Kitawaki, *Paolino Usai, †David Atkinson
NTT, Human Interface Labs, Japan
*Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy
†NTIA, USA
T2D.6 Between Recognition and Synthesis - 300 Bits/Second Speech Coding 441
Mohamed Ismail, Keith Ponting
Speech Research Unit, UK
SESSION: TMA
Feature Estimation II, Pitch and Prosody
Chair: Egidio Giachin, Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy
TMA.1 A Modified Zero-Crossing Method for Pitch Detection in Presence of Interfering Sources 445
Francois Gaillard, Frederic Berthommier, *Gang Feng, *Jean-Luc Schwartz
ICP/INPG, France
*Institut de la Communication Parlée, France
TMA.2 Using Simulated Annealing Expectation Maximization Algorithm for Hidden Markov Model Parameters Estimation 449
Jacques Simonin, Chafic Mokbel
France Telecom, France
TMA.3 Covariation of Subglottal Pressure, F0 and Glottal Parameters 453
Gunnar Fant, *Stellan Hertegard, Anita Kruckenberg, Johan Liljencrants
KTH, Sweden
*Karolinska Hospital and Huddinge Univ. Hospital, Sweden
TMA.4 The Fractal Behaviour of Unvoiced Plosives: A Means for Classification 457
Anastasios Delopoulos, Maria Rangoussi
NTUA, Greece
TMA.5 A Method for Analysis of the Local Speech Rate Using an Inventory of Reference Units 461
Sumio Ohno, Hiroya Fujisaki, Hideyuki Taguchi
Science Univ. of Tokyo, Japan
TMA.6 Analysis and Modeling of Fundamental Frequency Contours of Greek Utterances 465
Hiroya Fujisaki, Sumio Ohno, Takashi Yagi
Science Univ. of Tokyo, Japan
TMA.7 Characteristics of Slow, Average and Fast Speech and their Effects in Large Vocabulary Continuous Speech Recognition 469
Fernando Martinez, Daniel Tapias, Jorge Alvarez, Paloma Leon
Telefonica I+D, Spain
TMA.8 Analysis of Children's Speech: Duration, Pitch, and Formants 473
Sungbok Lee, Alexandros Potamianos, Shrikanth Narayanan
AT&T Labs-Research, USA
TMA.9 A Method of Measuring Formant Frequencies at High Fundamental Frequencies 477
Hartmut Traunmuller, *Anders Eriksson
Stockholm Univ., Sweden
*Univ. of Umeå, Sweden
TMA.10 Analysis of Speaking Rate Variation in Stress-Timed Languages 481
Tom Brondsted, Jens Printz Madsen
Aalborg Univ., Denmark
TMA.11 Automatic Identification of the Phoneme Boundaries Using a Mixed Parameter Model 485
Paul Micallef, *Ted Chilton
Univ. of Malta, Malta
*Univ. of Surrey, UK
TMA.12 Pitch Detection Reliability Assessment for Forensic Applications 489
Sergey Koval, Veronika Bekasova, Michael Khitrov, Andrey Raev
St Petersburg State Univ., Russia
TMA.13 Efficient Estimation of Perceptual Features for Speech Recognition 493
Zhihong Hu, Etienne Barnard
Oregon Graduate Institute, USA
TMA.14 Towards Decomposing the Sources of Variability in Speech 497
Narendranath Malayath, Hynek Hermansky, Alexander Kain
Oregon Graduate Institute of Science and Technology, USA
TMA.15 Use of Vector-Valued Dynamic Weighting Coefficients for Speech Recognition: Maximum Likelihood Approach 501
Rathinavelu Chengalvarayan
Bell Labs-Lucent Technologies, USA
TMA.16 Automatic Segmentation: Data-Driven Units of Speech 505
S.W. Beet, L. Baghai-Ravary
Aculab plc, UK
TMA.17 On Robust Time-Varying AR Speech Analysis Based on T-Distribution 509
Dejan Bajic
Institute of Applied Mathematics and Electronics, Yugoslavia
TMA.18 A Simple Phoneme Energy Model for the Greek Language and its Application to Speech Recognition 513
Dimitris Tambakas, Iliana Tzima, Nikos Fakotakis, George Kokkinakis
Univ. of Patras, Greece
TMA.19 A Macroscopic Analysis of an Emotional Speech Corpus 517
James E.H. Noad, Sandra P. Whiteside, Phil Green
Univ. of Sheffield, UK
TMA.20 Restoration of Pitch Pattern of Speech Based on a Pitch Generation Model 521
Hiroshi Shimodaira, Mitsuru Nakai, Akihiro Kumata
Japan Advanced Institute of Science and Technology, Japan
TMA.21 The Research of Correlation Between Pitch and Skin Galvanic Reaction at Change of Human Emotional State 525
A.V. Agranovski, O.Y. Berg, D.A. Lednov
Spetsvuzavtomatika Design Bureau, Russia
TMA.22 K-NN Versus Gaussian in HMM-Based Recognition System 529
Claude Montacie, Marie-Jose Caraty, Fabrice Lefèvre
Univ. Pierre et Marie Curie - CNRS, France
TMA.23 Spectral Methods for Voice Source Parameters Estimation 533
Boris Doval, Christophe D'Alessandro, Benoit Diard
LIMSI-CNRS, France
SESSION: TMB
Speech Synthesis Techniques
Chair: Rolf Carlson, KTH, Sweden
TMB.1 Optimising Unit Selection with Voice Source and Formants in the CHATR Speech Synthesis System 537
Wen Ding, Nick Campbell
ATR Interpreting Telecommunications Res. Labs., Japan
TMB.2 A New Framework to Provide High-Controllability Speech Signal and the Development of a Workbench for it 541
Masanobu Abe, Hideyuki Mizuno, Satoshi Takahashi, Shin'ya Nakajima
NTT, Japan
TMB.3 Shape-Invariant Prosodic Modification Algorithm for Concatenative Text-to-Speech Synthesis 545
Eduardo R. Banga, Carmen Garcia-Mateo, Xavier Fernandez-Salgado
Univ. of Vigo, Spain
TMB.4 An RNN-Based Spectral Information Generation for Mandarin Text-to-Speech 549
Shaw-Hwa Hwang, *Sin-Horng Chen, Saga Chang
Industrial Technology Research Institute (ITRI), Taiwan
*NCTU, Taiwan
TMB.5 Methods for Optimal Text Selection 553
Jan P.H. van Santen, *Adam L. Buchsbaum
Bell Labs-Lucent Technologies, USA
*AT&T Labs, USA
TMB.6 High Resolution Prosody Modification for Speech Synthesis 557
Francisco M. Gimenez de los Galanes, David Talkin
Entropic Research Lab, USA
TMB.7 Text-To-Speech Conversion with Neural Networks: A Recurrent TDNN Approach 561
Orhan Karaali, Gerald Corrigan, Ira Gerson, Noel Massey
Motorola, USA
TMB.8 Data Driven Formant Synthesis 565
Jesper Hogberg
KTH, Sweden
TMB.9 Speech Synthesis Using Non-Uniform Units in the Verbmobil Project 569
Simon King, *Thomas Portele, *Florian Hofer
Univ. of Edinburgh, UK
*Univ. of Bonn, Germany
TMB.10 On the Pronunciation Mode of Acronyms in Several European Languages 573
Isabel Trancoso, *M. Ceu Vianna
INESC, Portugal
*CLUL, Portugal
TMB.11 Evaluation of Speech Synthesis Systems for Dutch in Telecommunication Applications in GSM and PSTN Networks 577
Toni Rietveld, Joop Kerkhoff, *M.J.W.M. Emons, *E.J. Meijer, *Angelien A. Sanderman, *A.M.C. Sluijter
Univ. of Nijmegen, The Netherlands
*KPN Research, The Netherlands
TMB.12 Automatic Diphone Extraction for an Italian Text-to-Speech Synthesis System 581
Bianca Angelini, *Claudia Barolo, Daniele Falavigna, Maurizio Omologo, †Stefano Sandri
Istituto per la Ricerca Scientifica e Tecnologica, Italy
*Eikon Informatica, Italy
†Centro Studi e Laboratori Telecomunicazioni S.p.A., Italy
TMB.13 Simplification of TTS Architecture vs. Operational Quality 585
Eric Keller
Univ. of Lausanne, Switzerland
TMB.14 Felix - A TTS System with Improved Pre-Processing and Source Signal Generation 589
Georg Fries, Antje Wirth
Deutsche Telekom Berkom GmbH, Germany
TMB.15 Investigating the Limitations of Concatenative Synthesis 593
Mike Edgington
BT Laboratories, UK
TMB.16 Speech Coding and Synthesis Using Parametric Curves 597
Luis Miguel Teixeira de Jesus, Gavin C. Cawley
Univ. of East Anglia, UK
TMB.17 Automatically Clustering Similar Units for Unit Selection in Speech Synthesis 601
Alan W Black, Paul Taylor
Univ. of Edinburgh, UK
TMB.18 Improvements on a Trainable Letter-to-Sound Converter 605
Li Jiang, Hsiao-Wuen Hon, Xuedong Huang
Microsoft Corporation, USA
TMB.19 On a Cepstral Pitch Alteration Technique for Prosody Control in the Speech Synthesis System with High Quality 609
MyungJin Bae, KyuHong Kim, WonCheol Lee
Soongsil Univ., Korea
TMB.20 Diphone Concatenation Using a Harmonic Plus Noise Model of Speech 613
Yannis Stylianou, Thierry Dutoit, Juergen Schroeter
AT&T Labs-Research, USA
SESSION: TMC
Technology for S&L Acquisition, Speech Processing Tools
Chair: Petros Maragos, ILSP, Greece
TMC.1 The "Sketchboard": A Dynamic Interpretative Memory and its Use for Spoken Language Understanding 617
Gerard Sabah
LIMSI-CNRS, France
TMC.2 Speech Technology Integration and Research Platform: A System Study 621
Qiru Zhou, Chin-Hui Lee, Wu Chou, Andrew Pargellis
Bell Labs, USA
TMC.3 Speech Recognition on SPHERIC - An IC for Command & Control Applications 625
Dieter Geller, Markus Lieb, Wolfgang Budde, Oliver Muelhens, Manfred Zinke
Philips GmbH, Germany
TMC.4 Muse: A Scripting Language for the Development of Interactive Speech Analysis and Recognition Tools 629
Michael K. McCandless, James R. Glass
SLS/LCS/MIT, USA
TMC.5 Language Learning Based on Non-Native Speech Recognition 633
Silke Witt, Steve Young
Cambridge Univ., UK
TMC.6 Task Modelling by Sentence Templates 637
Ute Kilian, Klaus Bader
Daimler Benz AG, Germany
TMC.7 Extraction And Representation Rhythmic Components of Spontaneous Speech 641
Shigeyoshi Kitazawa, Hideya Ichikawa, Satoshi Kobayashi, *Yukihiro Nishinuma
Shizuoka Univ., Japan
*Univ. de Provence, France
TMC.8 Automatic Pronunciation Scoring of Specific Phone Segments for Language Instruction 645
Yoon Kim, Horacio Franco, Leonardo Neumeyer
SRI International, USA
CCRMA, Stanford University, USA
TMC.9 Automatic Detection of Mispronunciation for Language Instruction 649
Orith Ronen, Leonardo Neumeyer, Horacio Franco
SRI International, USA
TMC.10 Continuous Formant-Tracking Applied to Visual Representations of the Speech and Speech Recognition 653
Agustin Alvarez, Rafael Martinez, Victor Nieto, Victoria Rodellar, Pedro Gomez
Universidad Politecnica de Madrid, Spain
TMC.11 A Call System Using Speech Recognition to Train the Pronunciation of Japanese Long Vowels, the Mora Nasal and Mora Obstruents 657
Goh Kawai, Keikichi Hirose
Univ. of Tokyo, Japan
TMC.12 An Educational and Experimental Workbench for Visual Processing of Speech Data 661
Jan Nouza, Miroslav Holada, Daniel Hajek
TU of Liberec, Czech Republic
TMC.13 A 3 Channel Digital CVSD Bit-Rate Conversion System Using a General Purpose DSP 665
Yong-Soo Choi, *Hong-Goo Kang, †Sung-Youn Kim, ‡Young-Cheol Park, †Dae-Hee Youn
Yonsei Univ., Korea
*AT&T Labs-Research, USA
†ASSP Lab., Dept. of Elec. Engin., Korea
‡Samsung Biomedical Research Institute, Korea
TMC.14 SLIM Prosodic Module for Learning Activities in a Foreign Language 669
Rodolfo Delmonte, Mirela Petrea, Ciprian Bacalu
Universita Cá Garzoni-Moro Foscari, Italy
TMC.15 Barge-in Revised 673
Bernhard Kaspar, Karlheinz Schuhmacher, Stefan Feldes
Deutsche Telekom Berkom GmbH, Germany
TMC.16 WaveEdit, An Interactive Speech Processing Environment for Microsoft Windows Platform 677
Mohammad Akbar
Univ. Joseph Fourier, France
TMC.17 Subarashii: Japanese Interactive Spoken Language Education 681
Farzad Ehsani, Jared Bernstein, Amir Najmi, Ognjen Todic
Entropic Research Labs, USA
TMC.18 Deploying Speech Applications Over the Web 685
David Goddeau, *William Goldenthal, *Chris Weikat
Digital Equipment Corp., USA
*Cambridge Research Laboratory,
TMC.19 CSLUsh: An Extendible Research Environment 689
Johan Schalkwyk, Jacques de Villiers, Sarel van Vuuren, Pieter Vermeulen
Oregon Graduate Institute of Science and Technology, USA
TMC.20 A Flexible Client-Server Model for Multilingual CTS/TTS Development 693
Tibor Ferenczi, Géza Németh, *Gábor Olaszy, Zoltan Gaspar
Technical Univ. of Budapest, Hungary
*Linguistics Institute of Hungarian Academy of Sciences, Hungary
TMC.21 Critically Sampled PR Filterbanks of Nonuniform Resolution Based on Block Recursive Famlet Transform 697
Unto K. Laine
Helsinki Univ. of Technology, Finland
TMC.22 Automatic Detection of Accent in English Words Spoken by Japanese Students 701
Nobuaki Minematsu, Nariaki Ohashi, Seiichi Nakagawa
Toyohashi Univ. of Technology, Japan
TMC.23 An English Conversation and Pronunciation CAI System Using Speech Recognition Technology 705
Yasuhiro Taniguchi, Allan A. Reyes, Hideyuki Suzuki, Seiichi Nakagawa
Toyohashi Univ. of Technology, Japan
TMC.24 Bringing Spoken Language Systems to the Classroom 709
Stephen Sutton, Ed Kaiser, A. Cronk, Ron Cole
Oregon Graduate Institute, USA
TMC.25 Automatic Assessment of Foreign Speakers' Pronunciation of Dutch 713
Catia Cucchiarini, Lou Boves
Univ. of Nijmegen, The Netherlands
TMC.26 Use of Low Power EM Radar Sensors for Speech Articulator Measurements 717
John F. Holzrichter, Greg C Burnett
Lawrence Livermore National Laboratory, USA
TMC.27 Real Time Measurements of the Vocal Tract Resonances During Speech 721
Julien Epps, Annette Dowd, John Smith, Joe Wolfe
Univ. of New South Wales, Australia
SESSION: TMD
Phonetics and Phonology
Chair: Thomas Portele, Univ. of Bonn, Germany
TMD.1 Linguistic Criteria for Building and Recording Units for Concatenative Speech Synthesis in Brazilian Portuguese 725
Eleonora Cavalcante Albano, Patricia Aparecida Aquino
UNICAMP, Brazil
TMD.2 "Four-and-Twenty, Twenty-Four". What's in a Number? 729
Knut Kvale, *Arne Kjell Foldvik
Telenor R&D, Norway
*Norwegian Univ. of Science and Technology, Norway
TMD.3 Vowel Nasalization in Brazilian Portuguese: An Articulatory Investigation 733
Antonio de Moraes Joao
Universidade Federal do Rio de Janeiro, Brazil
TMD.4 Rhythmic Organization Peculiarities of the Spoken Text 737
Elena Steriopolo
Kiev State Linguistic Univ., Ukraine
TMD.5 Obtaining Confidence Measures from Sentence Probabilities 739
Bernhard Rueber
Philips GmbH, Germany
TMD.6 Sentence Design for Speech Synthesis and Speech Recognition Database by Phonetic Rules 743
Yiqing Zu
Chinese Academy of Social Sciences, China
TMD.7 Identification of Regional Variants of High German from Digit Sequences in German Telephone Speech 747
Christoph Draxler, Susanne Burger
Ludwig-Maximilians-Univ. Muenchen, Germany
TMD.8 Aerodynamic Constraints on the Production of Palatalized Trills: The Case of the Slavic Trilled [r] 751
Darya Kavitskaya
UC Berkeley, USA
TMD.9 An Experimental Phonetic Study of the Interrelationship Between Prosodic Phrase and Syntactic Structure 755
Cheol-jae Seong, *Sanghun Kim
Chungnam National Univ., Korea
*Electronics and Telecommunications Research Institute, Korea
TMD.10 Individual Differences Between Vowel Systems of German Speakers 759
Sebastian J.G.G. Heid
Institut fuer Phonetik und Sprachliche Kommunikation, Germany
TMD.11 Tempo and Its Change in Spontaneous Speech 763
Anton Batliner, *Andreas Kießling, †Ralf Kompe, Heinrich Niemann, Elmar Noeth
Univ. of Erlangen-Nürnberg, Germany
*Ericsson Eurolab,
†Sony Stuttgart Technology Center,
TMD.12 A Corpus-Based Approach to Diphthong Analysis of Standard Slovenian 767
Bojan Petek, Rastislav Sustarsic
Univ. of Ljubljana, Slovenia
TMD.13 Catalan Vowel Duration 771
Lourdes Aguilar, Julia A. Gimenez, Maria Machuca, Rafael Marin, Montse Riera
Universitat Autonoma de Barcelona, Spain
TMD.14 The Intonation of Vocatives in Spoken Neapolitan Italian 775
Maria Rosaria Caputo
Napoli, Italy
TMD.15 A Comparative Acoustic Study of Spontaneous and Read Italian Speech 779
Caldognetto Emanuela Magno, Claudio Zmarich, Franco Ferrero
CNR, Italy
TMD.16 A Contribution to the Estimation of Naturalness in the Intonation of Italian Spontaneous Speech 783
Mario Refice, *Michelina Savino, *Martine Grice
DEE, Politecnico di Bari, Italy
*Univ. of the Saarland, Germany
TMD.17 Diphthongs and the Process of Monophthongization in Austrian German: A First Approach 787
Sylvia Moosmüller
Austrian Academy of Sciences, Austria
TMD.18 The Prosody of Broad and Narrow Focus in English: Two Experiments 791
Steve Hoskins
duPont Hospital for Children/Univ. of Delaware, USA
TMD.19 The Domain of Accentual Lengthening in Scottish English 795
Alice Turk, Laurence White
Edinburgh Univ., UK
TMD.20 Spontaneous Dialogue: Some Results About the F0 Predictions of a Pragmatic Model of Information Processing 799
Mariette Bessac, Geneviève Caelen-Haumont
Univ. Joseph Fourier, France
TMD.21 Phonetic Characteristics of Double Articulations in Some Mangbutu-Efe Languages 803
Didier Demolin, Bernard Teston
Univ. Libre de Bruxelles, Belgium,
Univ. de Lyon 2, France and
Univ. d'Aix en Provence, France
TMD.22 Intonation Modeling for the Southern Dialects of the Basque Language 807
Inmaculada Hernaez, Inaki Gaminde, Borja Etxebarria, Pilartxo Etxebarria
Univ. of the Basque Country, Spain
SESSION: T3A
Confidence Measures in ASR
Chair: Jose M. Pardo, UPM, Spain
T3A.1 A Low-Cost Phonetic Transcription Method 811
Pablo Fetter, Udo Haiber, Peter Regel-Brietzmann
Daimler Benz AG, Germany
T3A.2 Word and Acoustic Confidence Annotation for Large Vocabulary Speech Recognition 815
Lin Chase
Carnegie Mellon Univ., USA
T3A.3 A Senone Based Confidence Measure for Speech Recognition 819
Zachary Bergen, *Wayne Ward
Berdy Medical Systems, USA
*Carnegie Mellon Univ., USA
T3A.4 OOV Utterance Detection Based on the Recognizer Response Function 823
Erica Bernstein, Ward R. Evans
The MITRE Corporation, USA
T3A.5 Estimating Confidence Using Word Lattices 827
Thomas Kemp, Thomas Schaaf
Univ. of Karlsruhe, Germany
T3A.6 Improved Estimation, Evaluation and Applications of Confidence Measures for Speech Recognition 831
Man-Hung Siu, Herbert Gish, Fred Richardson
BBN Inc, USA
SESSION: T3B
Speaker and Language Identification
Chair: Sadaoki Furui, NTT, Japan
T3B.1 Improved Speaker Verification System With Limited Training Data On Telephone Quality Speech 835
Salleh Hussain, Fergus R. McInnes, Mervyn A. Jack
Univ. of Edinburgh, UK
T3B.2 Verbal Information Verification 839
Qi Li, Biing-Hwang Juang, Qiru Zhou, Chin-Hui Lee
Bell Labs, USA
T3B.3 A Segment-Based Speaker Verification System Using SUMMIT 843
Sridevi V. Sarma, Victor Zue
MIT, USA
T3B.4 Speaker Verification on the World Wide Web 847
Michael Sokolov
Digital Equipment Corp., USA
T3B.5 Text-Prompted Versus Sound-Prompted Passwords in Speaker Verification Systems 851
Johan Lindberg, Hakan Melin
KTH, Sweden
T3B.6 GMM Sample Statistic Log-Likelihoods for Text-Independent Speaker Recognition 855
Michael Schmidt, John Golden, Herbert Gish
BBN Systems and Technologies, USA
SESSION: T3C
Perception of Prosody
Chair: Joseph Olive, Bell Labs, USA
T3C.1 The Influence of Phrase Boundaries on Perceived Prominence in Two-Peak Intonation Contours 859
Toni Rietveld, Carlos Gussenhoven
Univ. of Nijmegen, The Netherlands
T3C.2 Testing the Meaning of Four Dutch Pitch Accent Types 863
Johanneke Caspers
Univ. of Leiden, The Netherlands
T3C.3 A Perceptual Study for Modelling Speaker-Dependent Intonation in TTS and Dialog Systems 867
Joachim J. Mersdorf, Thomas Domhover
Ruhr Univ., Germany
T3C.4 Can we Perceive Attitudes Before the End of Sentences? The Gating Paradigm for Prosodic Contours 871
Veronique Auberge, Tuulikki Grepillat, A. Rilliard
Univ. Stendhal, France
T3C.5 To What Extent is Perceived Focus Determined by F0-Cues? 875
Mattias Heldner, Eva Strangert
Umeå Univ., Sweden
T3C.6 Temporal-Alignment Categories of Accent-Lending Rises and Falls 879
David House, *Dik Hermes, †Frederic Beaugendre
Lund Univ., Sweden
*IPO, The Netherlands
†Lernout & Hauspie Speech Products, The Netherlands
SESSION: T3D
Applications of Speech Technology I
Chair: Klaus Fellbaum, Univ. of Cottbus, Germany
T3D.1 WebGalaxy -- Integrating Spoken Language And Hypertext Navigation 883
Raymond Lau, Giovanni Flammia, Christine Pao, Victor Zue
MIT, USA
T3D.2 Pitch Estimation of Singing for Re-Synthesis and Musical Transcription 887
Michael Carey, Eluned S. Parris, *Graham D. Tattersall
Ensigma Ltd, UK
*Snape Signals Research, UK
T3D.3 Automated Lip Synchronisation for Human-Computer Interaction and Special Effect Animation 891
Christian Martyn Jones, Satnam Singh Dlay
Univ. of Newcastle, UK
T3D.4 Developing Web-Based Speech Applications 895
Charles T. Hemphill, Yeshwant Muthusamy
Texas Instruments, USA
T3D.5 Automatic Post-Synchronization of Speech Utterances 899
Werner Verhelst
Vrije Univ. of Brussel, Belgium
T3D.6 Automatic Generation of Hyperlinks Between Audio and Transcript 903
Jordi Robert-Ribes, *Rami G. Mukhtar
Advanced Computational Systems CRC, Australia
*CSIRO, Australia
SESSION: T4A
Spontaneous Speech Recognition
Chair: Roberto Billi, Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy
T4A.1 Transcription of Broadcast News 907
Jean-Luc Gauvain, Lori Lamel, Gilles Adda, Martine Adda-Decker
LIMSI, France
T4A.2 Can Continuous Speech Recognizers Handle Isolated Speech? 911
Fil Alleva, Xuedong Huang, Mei-Yuh Hwang, Li Jiang
Microsoft Corporation, USA
T4A.3 Toward Automatic Transcription of Japanese Broadcast News 915
Tatsuo Matsuoka, *Yuichi Taguchi, Katsutoshi Ohtsuki, †Sadaoki Furui, *Katsuhiko Shirai
NTT, Japan
*Waseda Univ., Japan
†Tokyo Institute of Technology, Japan
T4A.4 Automatic Detection of Semantic Boundaries 919
Mauro Cettolo, Anna Corazza
Istituto per la Ricerca Scientifica e Tecnologica (IRST), Italy
T4A.5 Connected Digit Recognition in Spontaneous Speech 923
Etienne Bauche, Bojana Gajic, Yasuhiro Minami, Tatsuo Matsuoka, *Sadaoki Furui
NTT, Japan
*Tokyo Institute of Technology, Japan
T4A.6 Advances in Transcription of Broadcast News 927
Francis Kubala, Hubert Jin, *Spyros Matsoukas, Long Nguyen, Richard Schwartz, John Makhoul
BBN Systems and Technologies, USA
*Northeastern Univ., USA
SESSION: T4B
Language Specific Segmental Features
Chair: Gunnar Fant, KTH, Sweden
T4B.1 The Domain of Final Lengthening in Production and Perception in Dutch 931
Tina Cambier-Langeveld, Marina Nespor, *Vincent J. van Heuven
Univ. of Amsterdam/HIL, The Netherlands
*Leiden Univ./HIL, The Netherlands
T4B.2 Voicing Assimilation as a Cue for Cluster Identification 935
Christine Meunier
Univ. de Genève, Switzerland
T4B.3 On the Perceptual Relevance of Degemination in Dutch 939
Saskia M.M. te Riele, Manon Loef, van O. Herwijen
Utrecht Univ., The Netherlands
T4B.4 Does Deletion of French SCHWA Lead to Neutralization of Lexical Distinctions? 943
Cecile Fougeron, *Donca Steriade
UCLA & Paris III, USA & France
*UCLA, USA
T4B.5 An Approach of the Catalan Palatals Discrimination Based on Durational Patterns of Spectral Evolution 947
Marielle Bruyninckx, Bernard Harmegnies
Univ. de Mons-Hainaut, Belgium
T4B.6 Syllable and Segment Duration at Different Speaking Rates in the Slovenian Language 951
Jerneja Gros, Nikola Pavesic, France Mihelic
Univ. of Ljubljana, Slovenia
SESSION: T4C
Speaker Recognition I
Chair: George Doddington, SRI, USA
T4C.1 Hybrid Networks Based on RBFN and GMM for Speaker Recognition 955
Wei-Ying Li, Douglas O'Shaughnessy
Univ. du Quebec, Canada
T4C.2 A Discriminative Training Algorithm for Gaussian Mixture Speaker Models 959
Jialong He, Li Liu, Günther Palm
Univ. of Ulm, Germany
T4C.3 Comparison of Background Normalization Methods for Text-Independent Speaker Verification 963
Douglas A. Reynolds
MIT Lincoln Laboratory, USA
T4C.4 Speaker Verification with Limited Enrollment Data 967
Owen Kimball, Michael Schmidt, Herbert Gish, Jason Waterman
BBN Systems and Technologies, USA
T4C.5 Speaker Verification in the Telephone Network: Research Activities in the CAVE Project 971
Frederic Bimbot, *Hans-Peter Hutter, †Cedric Jaboulet, ‡Johan W. Koolwaaij, ¤Johan Lindberg, Jean Benoit Pierrot
ENST/CNRS, France
*Ubilab-UBS, Switzerland
†IDIAP, Switzerland
‡KUN, The Netherlands
¤KTH, Sweden
T4C.6 Speaker Verification with GSM Coded Telephone Speech 975
Mark Kuitert, Lou Boves
Univ. of Nijmegen, The Netherlands
SESSION: T4D
Speech Synthesis: Linguistic Analysis
Chair: Björn Granström, KTH, Sweden
T4D.1 Parsers, Prominence, and Pauses 979
Nick Campbell, Tony Hebert, Ezra Black
ATR Interpreting Telecommunications Res. Labs., Japan
T4D.2 Automatic Assignment of Part-of-Speech to Out-of-Vocabulary Words for Text-To-Speech Processing 983
Frederic Bechet, Marc El-Beze
Univ. of Avignon, France
T4D.3 Text-to-Prosody Parsing in an Italian Speech Synthesizer. Recent Improvements 987
Barbara Gili Fivela, *Silvia Quazza
Scuola Normale Superiore, Italy
*Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy
T4D.4 Tagging Syllables 991
Brigitte Krenn
Univ. of the Saarland, Germany
T4D.5 Assigning Phrase Breaks from Part-of-Speech Sequences 995
Alan W. Black, Paul Taylor
Univ. of Edinburgh, UK
T4D.6 Prediction of Word Prominence 999
Christina Widera, Thomas Portele, Maria Wolters
Univ. of Bonn, Germany
SESSION: TAA
Speech Analysis & Modelling
Chair: Pierre Badin, ICP, INPG, France
TAA.1 Acoustic and Perceptual Properties of Phonemes in Continuous Speech as a Function of Speaking Rate 1003
Hisao Kuwabara
Tokyo Univ. of Science & Technology, Japan
TAA.2 New Results in Vowel Production: MRI, EPG, and Acoustic Data 1007
Shrikanth Narayanan, *Abeer Alwan, *Yong Song
AT&T Labs-Research, USA
*Univ. of California, USA
TAA.3 The Temporal Properties of Spoken Japanese are Similar to those of English 1011
Takayuki Arai, Steven Greenberg
International Computer Science Institute and
Univ. of California at Berkeley, USA
TAA.4 The Amplitudes of the Peaks in the Spectrum: Data from /a/ Context 1015
Anna Esposito
IIASS, Italy
TAA.5 Acoustical Characteristics of Speech and Voice in Speech Pathology 1019
Natalija Bolfan-Stosic, Mladen Hedjever
Univ. of Zagreb, Croatia
TAA.6 Pronunciation Modeling Applied to Automatic Segmentation of Spontaneous Speech 1023
Andreas Kipp, *Maria-Barbara Wesenick, *Florian Schiel
IPSK Univ. of Munich, Germany
*Ludwig-Maximilians-Univ. Muenchen, Germany
TAA.7 Dynamic and Static Improvements to Lexical Baseforms 1027
Simon Downey, Richard Wiseman
BT Laboratories, UK
TAA.8 Signal Driven Generation of Word Baseforms from Few Examples 1031
Andreas Hauenstein
pc-plus GmbH, Germany
TAA.9 Modeling the Acoustic Differences Between L1 and L2 Speech: The Short Vowels of Afrikaans and South-African English 1035
Elizabeth Botha, *Louis C.W. Pols
Univ. of Pretoria, South Africa
*Univ. of Amsterdam, The Netherlands
TAA.10 Laryngeal Movements and Speech Rate: An X-ray Investigation 1039
Beatrice Vaxelaire, Rudolph Sock
Institut de Phonetique de Strasbourg, France
TAA.11 How Flexible is the Human Voice? A Case Study of Mimicry 1043
Anders Eriksson, Pär Wretling
Umeå Univ., Sweden
TAA.12 The Effect of Low-Pass Filtering on Estimated Voice Source Parameters 1047
Helmer Strik
Univ. of Nijmegen, The Netherlands
TAA.13 Vowel Development of /i/ and /u/ in 15-36 Month Old Children at Risk and Not at Risk to Stutter 1051
Susan M. Fosnot
UCLA, USA
TAA.14 Optopalatograph: Development of a Device for Measuring Tongue Movement in 3D 1055
Alan Wrench, Alan McIntosh, William Hardcastle
Queen Margaret College, UK
TAA.15 Speech Synthesis and Prosody Modification Using Segmentation and Modelling of the Excitation Signal 1059
Juana M. Gutierrez-Arriola, Francisco M. Gimenez de los Galanes, Mohammed H. Savoji, Jose M. Pardo
Universidad Politecnica de Madrid, Spain
TAA.16 How Can the Control of the Vocal Tract Limit the Speaker's Capability to Produce the Ultimate Perceptive Objectives of Speech? 1063
Christophe Savariaux, Louis-Jean Boe, Pascal Perrier
ICP, INPG, France
TAA.17 A Step Toward General Model for Symbolic Description of the Speech Signal 1067
Goran S. Jovanovic
Institute for Applied Mathematics and Electronics, Yugoslavia
TAA.18 Referring in Long Term Speech by Using Orientation Patterns Obtained from Vector Field of Spectrum Pattern 1071
Kiyoshi Furukawa, Masayuki Nakazawa, Takashi Endo, Oka Ryuichi
Real World Computing Partnership, Japan
SESSION: TAB
Robustness in Recognition and Signal Processing II
Chair: Alex Waibel, Carnegie Mellon Univ., USA
TAB.1 Adaptation of Time Differentiated Cepstrum for Noisy Speech Recognition 1075
Tai-Hwei Hwang, *Lee-Min Lee, Hsiao-Chuan Wang
National Tsing-Hua Univ., ROChina
*Mingchi Institute of Technology, ROChina
TAB.2 On The Importance of Various Modulation Frequencies for Speech Recognition 1079
†Noboru Kanedera, *Takayuki Arai, Hynek Hermansky, Misha Pavel
Oregon Graduate Institute of Science and Technology, USA
*International Computer Science Institute, USA
†Ishikawa National College of Technology, Japan
TAB.3 A Robust RNN-Based Pre-Classification for Noisy Mandarin Speech Recognition 1083
Wei-Tyng Hong, Sin-Horng Chen
National Chiao Tung Univ., ROChina
TAB.4 A Parallel Environment Model (PEM) for Speech Recognition and Adaptation 1087
Mazin Rahim
AT&T Labs, USA
TAB.5 Adaptive Model Combination for Robust Speech Recognition in Car Environments 1091
Volker Schless, Fritz Class
Daimler Benz AG, Germany
TAB.6 A Comparative Study of Speech Detection Methods 1095
Stefaan Van Gerven, Fei Xie
KULEUVEN-ESAT, Belgium
TAB.7 Voice Activity Detection Using Source Separation Techniques 1099
Nikos Doukas, Patrick Naylor, Tania Stathaki
Imperial College, UK
TAB.8 Applying Blind Signal Separation to the Recognition of Overlapped Speech 1103
Tomohiko Taniguchi, Shoji Kajita, Kazuya Takeda, Fumitada Itakura
Nagoya Univ., Japan
TAB.9 Multiresolution Channel Normalization for ASR in Reverberant Environments 1107
Carlos Avendano, Sangita Tibrewala, Hynek Hermansky
Oregon Graduate Institute of Science and Technology, USA
TAB.10 A Speech Pre-Processing Technique for End-Point Detection in Highly Non-Stationary Environments 1111
Rafael Martinez, Agustin Alvarez, Vilda Pedro Gomez, Mercedes Perez, Victor Nieto, Victoria Rodellar
Universidad Politecnica de Madrid, Spain
TAB.11 Application of Several Channel and Noise Compensation Techniques for Robust Speaker Recognition 1115
Laura Docio-Fernandez, Carmen Garcia-Mateo
Univ. of Vigo, Spain
TAB.12 Knowing the Wheat from the Weeds in Noisy Speech 1119
Hany Agaiby, Thomas J. Moir
Univ. of Paisley, UK
TAB.13 Model-Based Approach for Robust Speech Recognition in Noisy Environments with Multiple Noise Sources 1123
Do Yeong Kim, *Nam Soo Kim, Chong Kwan Un
KAIST, Korea
*SAIT, Korea
TAB.14 Normalization of Speaker Variability by Spectrum Warping for Robust Speech Recognition 1127
Y.C. Chu, Charlie Jie, Vincent Tung, Ben Lin, Richard Lee
Philips Taiwan, ROChina
TAB.15 LPC Poles Tracker for Music/Speech/Noise Segmentation and Music Cancellation 1131
Stephane H. Maes
IBM, USA
TAB.16 Comparative Evaluations of Several Front-Ends for Robust Speech Recognition 1135
Doh-Suk Kim, Jae-Hoon Jeong, Soo-Young Lee, Rhee M. Kil
Korea Advanced Institute of Science and Technology, Korea
TAB.17 Speaker Normalization Through Formant-Based Warping of the Frequency Scale 1139
Evandro B. Gouvea, Richard M. Stern
Carnegie Mellon Univ., USA
TAB.18 The Use of Cepstral Means in Conversational Speech Recognition 1143
Martin Westphal
Univ. Karlsruhe, Germany
TAB.19 Compensation for Environmental and Speaker Variability by Normalization of Pole Locations 1147
Juan M. Huerta, Richard M. Stern
Carnegie Mellon Univ., USA
TAB.20 Cellular Phone Speech Recognition: Noise Compensation vs. Robust Architectures 1151
Jean-Baptiste Puel, Regine André-Obrecht
IRIT - Université Paul Sabatier, France
TAB.21 Speech Recognition in Noise Using On-Line HMM Adaptation 1155
TungHui Chiang
Advanced Technology Center (ATC), Computer and Communication Labs (CCL), Industrial Technology Research Institute (ITRI), ROChina
SESSION: TAC
Acoustic Modelling II
Chair: Vassilios Digalakis, Technical Univ. of Crete, Greece
TAC.1 Incorporating Linguistic Knowledge and Automatic Baseform Generation in Acoustic Subword Unit Based Speech Recognition 1159
Trym Holter, Torbjørn Svendsen
The Norwegian Univ. of Science and Technology (NTNU), Norway
TAC.2 Modeling and Decoding of Crossword Context Dependent Phones in the Philips Large Vocabulary Continuous Speech Recognition System 1163
Peter Beyerlein, Meinhard Ullrich, Patricia Wilcox
Philips GmbH, Germany
TAC.3 Modelling Inter-Frame Dependence with Preceding and Succeeding Frames 1167
Philip Hanna, Ji Ming, Peter O'Boyle, F. Jack Smith
Queen's Univ. of Belfast, N. Ireland
TAC.4 Continuous Speech Recognition Using Syllables 1171
Rhys James Jones, *Simon Downey, John S. Mason
Univ. of Wales Swansea, UK
*BT Laboratories, UK
TAC.5 A New Approach to Generalized Mixture Tying for Continuous HMM-Based Speech Recognition 1175
Daniel Willett, Gerhard Rigoll
Gerhard-Mercator-Univ. Duisburg, Germany
TAC.6 State Tying for Context Dependent Phoneme Models 1179
Klaus Beulen, Elmar Bransch, Hermann Ney
RWTH Aachen Univ. of Technology, Germany
TAC.7 A Novel Node Splitting Criterion in Decision Tree Construction for Semi-Continuous HMMs 1183
Jacques Duchateau, Kris Demuynck, Dirk Van Compernolle
Katholieke Univ. Leuven-E.S.A.T, Belgium
TAC.8 Creating Unseen Triphones by Phone Concatenation in the Spectral, Cepstral and Formant Domains 1187
Mats Blomberg
KTH, Sweden
TAC.9 Creating Large Subword Units for Speech Recognition 1191
Thilo Pfau, Manfred Beham, *W. Reichl, Günther Ruske
Muenchen Univ. of Technology, Germany
*Bell Laboratories, USA
TAC.10 Segmental Modeling Using a Continuous Mixture of Non-Parametric Models 1195
Jacob Goldberger, David Burshtein, *Horacio Franco
Tel Aviv Univ., Israel
*SRI International, USA
TAC.11 Segmentation and Modeling in Segment-Based Recognition 1199
Jane W. Chang, James R. Glass
MIT, USA
TAC.12 Using Syllables in a Hybrid HMM-ANN Recognition System 1203
Alfred Hauenstein
Siemens AG, Germany
TAC.13 Noise Robust Segment-Based Word Recognition Using Vector Quantisation 1207
Ramalingam Hariharan, Juha Hakkinen, Kari Laurila, Janne Suontausta
Nokia Research Center, Finland
TAC.14 Viterbi Based Splitting of Phoneme HMM's 1211
Luis Javier Rodriguez, *Ines M. Torres
Universidad del Pais Vasco, Spain
*UPV/EHU, Spain
TAC.15 The Demiphone: An Efficient Subword Unit for Continuous Speech Recognition 1215
Jose B. Marino, A. Nogueiras, Antonio Bonafonte
Universitat Politecnica de Catalunya, Spain
TAC.16 Organizing Phone Models Based on Piecewise Linear Segment Lattices of Speech Samples 1219
Hiroaki Kojima, Kazuyo Tanaka
Electrotechnical Lab, Japan
TAC.17 Automatic Architecture Design by Likelihood-Based Context Clustering with Crossvalidation 1223
Ivica Rogina
Univ. Karlsruhe, Germany
TAC.18 Towards Articulatory Speech Recognition: Learning Smooth Maps to Recover Articulator Information 1227
Sam Roweis, *Abeer Alwan
California Institute of Technology, USA
*UCLA, USA
TAC.19 Selection of the Most Effective Set of Subword Units for an HMM-Based Speech Recognition System 1231
Anastasios Tsopanoglou, *Nikos Fakotakis
KNOWLEDGE SA, Greece
*Univ. of Patras, Greece
TAC.20 Multi-Band Continuous Speech Recognition 1235
Christophe Cerisara, Jean-Paul Haton, Jean-Francois Mari, Dominique Fohr
CRIN-CNRS & INRIA Lorraine, France
TAC.21 The Design of Acoustic Parameters for Speaker-Independent Speech Recognition 1239
Nabil N. Bitar, Carol Y. Espy-Wilson
Boston Univ., USA
SESSION: TAD
Speech Coding II
Chair: John Mourjopoulos, Univ. of Patras, Greece
TAD.1 High Quality Split-Band LPC Vocoder and its Fixed Point Real Time Implementation 1243
Stephane Villette, Milos Stefanovic, Ian Atkinson, Ahmet Kondoz
Univ. of Surrey, UK
TAD.2 Missing Packet Recovery Techniques for DM Coded Speech 1247
Wen-Whei Chang, *Hwai-Tsu Chang, *Wan-Yu Meng
National Chiao-Tung Univ., ROChina
*Industrial Technology Research Institute, ROChina
TAD.3 Spectral Sensitivity of LSP Parameters and Their Transformed Coefficients 1251
Vu Hai Le, Laszlo Lois
Technical Univ. of Budapest, Hungary
TAD.4 Reducing the Complexity of the LPC Vector Quantizer Using the K-D Tree Search Algorithm 1255
V. Ramasubramanian, K.K. Paliwal
ATR Interpreting Telecommunications Res. Labs., Japan
TAD.5 LPC Quantization Using Wavelet Based Temporal Decomposition of the LSF 1259
Aweke N. Lemma, *W.Bastiaan Kleijn, Ed F. Deprettere
Delft Univ. of Technology, The Netherlands
*KTH, Sweden
TAD.6 A Novel 1.7/2.4 KB/S DCT Based Prototype Interpolation Speech Coding System 1263
Costas S. Xydeas, Gokhan H. Ilk
Univ. of Manchester, UK
TAD.7 Improved Regular Pulse VSELP Coding of Speech at Low Bit-Rates 1267
Yong-Soo Choi, *Hong-Goo Kang, Sang-Wook Park, †Jae-Ha Yoo, Dae-Hee Youn
Yonsei University, Korea
*AT&T Labs, USA
†LG Electronic Inc., Korea
TAD.8 Joint Estimation of Pitch, Band Magnitudes, and V/UV Decisions for MBE Vocoder 1271
Yong Duk Cho, Hong Kook Kim, Moo Young Kim, Sang Ryong Kim
Samsung Advanced Institute of Technology, South Korea
TAD.9 A New Distance Measure in LPC Coding: Application for Real Time Situations 1275
Balazs Kovesi, Samir Saoudi, Jean Marc Boucher, *Gabor Horvath
ENST-Br, France
*Technological Univ. of Budapest, Hungary
TAD.10 Consideration of Processing Strategies for Very-Low-Rate Compression of Wideband Speech Signals with Known Text Transcription 1279
Peter Vepyek, Alan B. Bradley
RMIT, Australia
TAD.11 Zero-Redundancy Error Protection for CELP Speech Codecs 1283
Norbert Gortz
Univ. of Kiel, Germany
TAD.12 Low Bit Rate Speech Coding Using an Improved HSX Model 1287
Ridha Matmti, Milan Jelinek, Jean-Pierre Adoul
Univ. of Sherbrooke, Canada
TAD.13 Phonetic Vocoding with Speaker Adaptation 1291
Carlos M. Ribeiro, Isabel Trancoso
INESC, Portugal
TAD.14 Quantization of Spectral Sequences Using Variable Length Spectral Segments for Speech Coding at Very Low Bit Rate 1295
Geneviève Baudoin, *Jan Cernocky, †Gerard Chollet
ESIEE, France
*FEIVUT, France
†ENST, France
TAD.15 On Modeling Event Functions in Temporal Decomposition Based Speech Coding 1299
Shahrokh Ghaemmaghami, Mohamed Deriche, Boualem Boashash
Queensland Univ. of Technology, Australia
TAD.16 Phase Quantization by Pitch-Cycle Waveform Coding in Low Bit Rate Sinusoidal Coders 1303
Soledad Torres, *Javier F Casajus-Quiros
Universidad de Valladolid, Spain
*Universidad Politecnica de Madrid, Spain
TAD.17 A Perceptual Study of the Greek Vowel Space Using Synthetic Stimuli 1307
Antonis Botinis, *Marios Fourakis, †John W. Hawks
Athens Univ., Greece
*The Ohio State Univ., USA
†Kent State Univ., USA
TAD.18 Mixed Multi-Band Excitation Coder Using Frequency Domain Mixture Function (FDMF) for a Low-Bit Rate Speech Coding 1311
Woo-Jin Han, Sung-Joo Kim, Yung-Hwan Oh
KAIST, Korea
TAD.19 Robust GSM Speech Decoding Using the Channel Decoder's Soft Output 1315
Tim Fingscheidt, Olaf Scheufen
Aachen Univ. of Technology, Germany
TAD.20 A Low-Bit-Rate Speech Coder Using Adaptive Line Spectral Frequency Prediction 1319
Carl W. Seymour, Tony A. Robinson
Cambridge Univ., UK
SESSION: W1A
Dialogue Systems: Applications
Chair: Norman Fraser, Univ. of Surrey, UK
W1A.1 Experiments in Spoken Queries for Document Retrieval 1323
James Barnett, Steve Anderson, *John Broglio, Mona Singh, †R. Hudson, †S.W. Kuo
Dragon Systems, USA
*Univ. of Massachusetts, USA
†Intermetrics Inc., USA
W1A.2 Towards an Automated Directory Information System 1327
Frank Seide, *Andreas Kellner
Philips Research Laboratories Taipei, Taiwan
*Philips GmbH Aachen, Germany
W1A.3 A Strategy for Mixed-Initiative Dialogue Control 1331
Lars Bo Larsen
Aalborg Univ., Denmark
W1A.4 On the Design of Effective Speech-Based Interfaces for Desktop Applications 1335
Jim Hugunin, Victor Zue
MIT, USA
W1A.5 Dialogue Strategies Guiding Users to their Communicative Goals 1339
Matthias Denecke, Alex Waibel
Carnegie Mellon Univ., USA
W1A.6 A Speech Interface for Forms on WWW 1343
Sunil Issar
Carnegie Mellon Univ., USA
SESSION: W1B
Speech Production Modelling
Chair: Michael D. Riley, AT&T Labs, USA
W1B.1 Voice Conversion by Codebook Mapping of Line Spectral Frequencies and Excitation Spectrum 1347
Levent M Arslan, David Talkin
Entropic Research Laboratory, USA
W1B.2 Optimal State Dependent Spectral Representation for HMM Modeling: A New Theoretical Framework 1351
Chafic Mokbel, *Guillaume Gravier, *Gerard Chollet
France Telecom, France
*ENST, France
W1B.3 Speech Analysis and Systems Using an AM-FM Modulation Model 1355
Alexandros Potamianos, *Petros Maragos
AT&T Labs-Research, USA
*ILSP & Georgia Tech, Greece & USA
W1B.4 Synthesis of Fricative Consonants by Audiovisual-to-Articulatory Inversion 1359
Khaled Mawass, Pierre Badin, Gerard Bailly
ICP, INPG, France
W1B.5 New Transformations of Cepstral Parameters for Automatic Vocal Tract Length Normalization in Speech Recognition 1363
Tom Claes, *Ioannis Dologlou, Louis ten Bosch, Dirk Van Compernolle
Lernout & Hauspie Speech Products, Belgium
*K.U Leuven-E.S.A.T, Belgium
W1B.6 A Multiresolutionally Oriented Approach for Determination of Cepstral Features in Speech Recognition 1367
Simon Dobrisek, France Mihelic, Nikola Pavesic
Univ. of Ljubljana, Slovenia
SESSION: W1C
Speaker Recognition II
Chair: Aaron Rosenberg, AT&T Labs, USA
W1C.1 Speaker Identification with User-Selected Password Phrases 1371
Aaron E. Rosenberg, S. Parthasarathy
AT&T Labs, USA
W1C.2 Speaker Verification Based on Phonetic Decision Making 1375
Jesper Olsen
Aalborg Univ., Denmark
W1C.3 Analysis and Comparison of Score Normalisation Methods for Text-Dependent Speaker Verification 1379
A.M. Ariyaeeinia, P. Sivakumaran
Univ. of Hertfordshire, UK
W1C.4 Automatic Speaker Recognition on a Vocoder Link 1383
Frederic Jauquet, Patrick Verlinde, Claude Vloeberghs
Royal Military Academy, Belgium
W1C.5 Likelihood Ratio Adjustment for the Compensation of Model Mismatch in Speaker Verification 1387
Frederic Bimbot, *Dominique Genoud
ENST/CNRS, France
*IDIAP, Switzerland
W1C.6 A Lognormal Tied Mixture Model of Pitch for Prosody-Based Speaker Recognition 1391
Kemal M. Sonmez, Larry Heck, Mitchel Weintraub, Elizabeth Shriberg
SRI International, USA
SESSION: W1D
Speech Enhancement I
Chair: Hynek Hermansky, Oregon Graduate Inst. of Science and Tech., USA
W1D.1 Residual Noise Suppression Using Psychoacoustic Criteria 1395
Tim Haulick, Klaus Linhard, Peter Schrogmeier
Daimler Benz AG, Germany
W1D.2 Processing Linear Prediction Residual for Speech Enhancement 1399
*B. Yegnanarayana, Carlos Avendano, Hynek Hermansky, *P. Satyanarayana Murthy
Oregon Graduate Institute of Science and Technology, USA
*IIT Madras, India
W1D.3 Combined Acoustic Echo Control and Noise Reduction for Mobile Communications 1403
Stefan Gustafsson, Rainer Martin
Aachen Univ. of Technology, Germany
W1D.4 A Nonstationary Autoregressive HMM and its Application to Speech Enhancement 1407
Ki Yong Lee, Yeol Rheem
Changwon National Univ., Korea
W1D.5 Spectral Subtraction and Mean Normalization in the Context of Weighted Matching Algorithms 1411
Nestor Becerra Yoma, Fergus R. McInnes, Mervyn A. Jack
Univ. of Edinburgh, UK
W1D.6 Improving the Intelligibility of Noisy Speech Using an Audible Noise Suppression Technique 1415
Dionysios Tsoukalas, John Mourjopoulos, George Kokkinakis
Univ. of Patras, Greece
SESSION: W2A
Spoken Language Understanding
Chair: Ioannis Dologlou, ESAT-MI2, KU-Leuven, Belgium
W2A.1 Automatic Acquisition of Salient Grammar Fragments for Call-Type Classification 1419
Jerry H. Wright, Allen L. Gorin, Giuseppe Riccardi
AT&T Labs-Research, USA
W2A.2 Stochastically-Based Natural Language Understanding Across Tasks and Languages 1423
Minker Wolfgang
LIMSI, France
W2A.3 Transducer Composition for Context-Dependent Network Expansion 1427
Michael Riley, Fernando Pereira, Mehryar Mohri
AT&T Labs, USA
W2A.4 Giving Prosody a Meaning 1431
Christian Lieske, *Johan Bos, †Martin Emele, ‡Björn Gamback, *C.J. Rupp
Swiss Federal Institute of Technology Lausanne, Switzerland
*Univ. of Saarland, Germany
†Univ. of Stuttgart, Germany
‡Royal Institute of Technology, Sweden
W2A.5 Feature-Based Language Understanding 1435
Kishore A. Papineni, Salim Roukos, Todd R. Ward
IBM, USA
W2A.6 Speech Translation Based on Automatically Trainable Finite-State Models 1439
Juan Carlos Amengual, *Jose Miguel Benedi, †Klaus Beulen, *Francisco Casacuberta, Asuncion Castano, Antonio Castellanos, *Victor M. Jimenez, *David Llorens, Andres Marzal, †Hermann Ney, Federico Prat, *Enrique Vidal, Juan Miguel Vilar
Universitat Jaume I, Spain
*Universidad Politecnica de Valencia, Spain
†RWTH, Germany
SESSION: W2B
Language Model Adaptation
Chair: Hermann Ney, RWTH, Germany
W2B.1 Document Space Models Using Latent Semantic Analysis 1443
Yoshihiko Gotoh, Steve Renals
Univ. of Sheffield, UK
W2B.2 Adaptive Topic-Dependent Language Modelling Using Word-Based Varigrams 1447
Sven C. Martin, Jörg Liermann, Hermann Ney
RWTH Aachen, Germany
W2B.3 A Latent Semantic Analysis Framework for Large-Span Language Modeling 1451
Jerome R. Bellegarda
Apple Computer Inc, USA
W2B.4 A Maximum Likelihood Model for Topic Classification of Broadcast News 1455
Richard Schwartz, *Toru Imai, Francis Kubala, Long Nguyen, John Makhoul
BBN Systems and Technologies, USA
*NHK, Japan
W2B.5 Language Modelling for Task-Oriented Domains 1459
Cosmin Popovici, *Paolo Baggia
ICI-Instituto de Cercetari in Informatica, Romania
*Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy
W2B.6 Chinese Language Model Adaptation Based on Document Classification and Multiple Domain-Specific Language Models 1463
Sung-Chien Lin, Chi-Lung Tsai, *Lee-Feng Chien, *Ker-Jiann Chen, *Lin-Shan Lee
National Taiwan University, ROChina
*Academia Sinica, ROChina
SESSION: W2C
Prosody and Speech Recognition/Understanding
Chair: Jan van Santen, Bell Labs-Lucent Technologies, USA
W2C.1 Estimating Prosodic Weights in a Syntactic-Rhythmical Prediction System 1467
Philippe Langlais
CERI, France
W2C.2 Syntactic Information Contained in Prosodic Features of Japanese Utterances 1471
Kazuhiko Ozeki, Kazuyuki Kousaka, Yujie Zhang
The Univ. of Electro-Communications, Japan
W2C.3 Hierarchical Duration Modelling for Speech Recognition Using the ANGIE Framework 1475
Grace Chung, Stephanie Seneff
MIT Laboratory for Computer Science, USA
W2C.4 On the Use of Prosody in a Speech-to-Speech Translator 1479
Volker Strom, Anja Elsner, Wolfgang Hess, *Walter Kasper, †Alexandra Klein, *Hans Ulrich Krieger, ‡Jörg Spilker, ‡Hans Weber, ‡Günther Gorz
Univ. of Bonn, Germany
*German Research Center for AI, Germany
†Univ. of Vienna, Austria
‡Univ. of Erlangen-Nürnberg, Germany
W2C.5 Automatic Recognition of Sentence Type from Prosody in Dutch 1483
Vincent J. van Heuven, *Judith Haan, Jos J.A. Pacilly
Leiden University, The Netherlands
*Nijmegen University, The Netherlands
W2C.6 Automatic Word Demarcation Based on Prosody 1487
Paul Munteanu, Bertrand Caillaud, Jean-Francois Serignat, Geneviève Caelen-Haumont
CLIPS/IMAG, France
SESSION: W2D
Wideband Speech Coding
Chair: Jean Pierre Martens, Univ. of Gent, Belgium
W2D.1 A 16-kbit/s Wideband Speech Codec Scalable with G.729 1491
Akitoshi Kataoka, Sachiko Kurihara, Shigeaki Sasaki, Shinji Hayashi
NTT, Japan
W2D.2 Comparison of Auditory Masking Models for Speech Coding 1495
Michelle E. Lynch, Eliathamby Ambikairajah, *Andy Davis
RTC, Ireland
*BT Labs, UK
W2D.3 Wideband Speech Coding Based on the MBE Structure 1499
Anne Amodio, Gang Feng
Univ. Stendhal/INPG, France
W2D.4 Perceptual Filter Comparisons for Wideband and FM Bandwidth Audio Coders 1503
Marcos Perreau-Guimaraes, *Nicolas Moreau, Madeleine Bonnet
Univ. René Descartes - Paris 5, France
*ENST, France
W2D.5 Wideband Coding of Speech Using Neural Network Gain Adaptation 1507
Cheung-Fat Chan, Man-Tak Chu
City Univ. of Hong Kong, Hong Kong
W2D.6 Wideband-Speech APVQ Coding from 16 to 32 kbps 1511
Josep M. Salavedra
Universitat Politècnica de Catalunya, Spain
SESSION: WMA
Speech Recognition in Adverse Environments, CSR and Error Analysis
Chair: Lori Lamel, LIMSI-CNRS, France
WMA.1 A Comparative Analysis of Blind Channel Equalization Methods for Telephone Speech Recognition 1515
Wei-Wen Hung, Hsiao-Chuan Wang
National Tsing Hua Univ., ROChina
WMA.2 HMM Retraining Based on State Duration Alignment for Noisy Speech Recognition 1519
Wei-Wen Hung, Hsiao-Chuan Wang
National Tsing Hua Univ., ROChina
WMA.3 Fast Parallel Model Combination Noise Adaptation Processing 1523
Yasuhiro Komori, Tetsuo Kosaka, Hiroki Yamamoto, Masayuki Yamada
Canon Inc., Japan
WMA.4 Speech Recognition Module for CSCW Using a Microphone Array 1527
Takashi Endo, Shigeki Nagaya, Masayuki Nakazawa, Kiyoshi Furukawa, Ryuuichi Oka
Real World Computing Partnership, Japan
WMA.5 Relative Mel-Frequency Cepstral Coefficients Compensation for Robust Telephone Speech Recognition 1531
*Jiqing Han, Munsung Han, Gyu-Bong Park, Jeongue Park, *Wen Gao
Systems Engineering Research Institute, ETRI, Korea
*Harbin Institute of Technology, ROChina
WMA.6 Robust Speech Detection Method for Speech Recognition System for Telecommunications Networks and Its Field Trial 1535
Seiichi Yamamoto, Masaki Naito, Shingo Kuroiwa
KDD R&D Labs, Japan
WMA.7 The Tuning of Speech Detection in the Context of a Global Evaluation of a Voice Response System 1539
Laurent Mauuary, Lamia Karray
France Telecom, France
WMA.8 New Methods in Continuous Mandarin Speech Recognition 1543
C. Julian Chen, Ramesh A. Gopinath, Michael D. Monkowski, Michael Picheny, Katherine Shen
IBM, USA
WMA.9 Automatic Transcription of General Audio Data: Effect of Environment Segmentation on Phonetic Recognition 1547
Michelle S. Spina, Victor Zue
MIT, USA
WMA.10 Automatic Recognition of Continuous Cantonese Speech with Very Large Vocabulary 1551
Alfred Ying Pang NG, L.W. Chan, P.C. Ching
Chinese Univ. of Hong Kong, Hong Kong
WMA.11 Source Normalization Training for HMM Applied to Noisy Telephone Speech Recognition 1555
Yifan Gong
Texas Instruments, USA
WMA.12 The Development of a Speaker Independent Continuous Speech Recognizer for Portuguese 1559
Joao P. Neto, *Ciro A. Martins, *Luis B. Almeida
IST, Portugal
*INESC, Portugal
WMA.13 Blame Assignment for Errors Made by Large Vocabulary Speech Recognizers 1563
Lin Chase
Carnegie Mellon Univ., USA
WMA.14 Predicting Speech Recognition Performance 1567
Atsushi Nakamura
ATR ITL, Japan
WMA.15 A Voice Activity Detector for the ITU-T 8kbit/s Speech Coding Standard G.729 1571
Scott D. Watson, Barry M.G. Cheetham, *P.A. Barret, *W.T.K. Wong, *A.V. Lewis
The Univ. of Liverpool, UK
*BT Laboratories, UK
WMA.16 Vocabulary-Independent Recognition of American Spanish Phrases and Digit Strings 1575
Yeshwant K. Muthusamy, John J. Godfrey
Texas Instruments, USA
WMA.17 Recognition of Spoken and Spelled Proper Names 1579
Michael Meyer, Hermann Hild
Univ. Karlsruhe, Germany
WMA.18 HMM Compensation for Noisy Speech Recognition Based on Cepstral Parameter Generation 1583
Takao Kobayashi, Takashi Masuko, *Keiichi Tokuda
Tokyo Institute of Technology, Japan
*Nagoya Institute of Technology, Japan
WMA.19 On the Robustness of the Critical-Band Adaptive Filtering Method for Multi-Source Noisy Speech Recognition 1587
George Nokas, Evangelos Dermatas, George Kokkinakis
Univ. of Patras, Greece
WMA.20 A Space Transformation Approach for Robust Speech Recognition in Noisy Environments 1591
Cun-tai Guan, Shu-hung Leung, Wing-hong Lau
City Univ. of Hong Kong, Hong Kong
WMA.21 Robust Isolated Word Recognition Using the WSP-PMC Combination 1595
Tzur Vaich, Arnon Cohen
Ben Gurion Univ., Israel
SESSION: WMB
Multimodal Speech Processing, Emerging Techniques and Applications
Chair: Giorgio Micca, Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy
WMB.1 Fuzzy Logic for Rule-Based Formant Speech Synthesis 1599
Spyros Raptis, George Carayannis
ILSP, Greece
WMB.2 Integrating Acoustic and Labial Information for Speaker Identification and Verification 1603
Pierre Jourlin, *Juergen Luettin, *Dominique Genoud, *Hubert Wassner
LIA/IDIAP, France
*IDIAP, Switzerland
WMB.3 Subword Unit Representations for Spoken Document Retrieval 1607
Kenney Ng, Victor Zue
MIT, USA
WMB.4 Non-Linear Representations, Sensor Reliability Estimation and Context-Dependent Fusion in the Audiovisual Recognition of Speech in Noise 1611
Pascal Teissier, Jean-Luc Schwartz, *Anne Guerin-Dugue
ICP, INPG, France
*Laboratoire de Traitement d'Images et de Reconnaissance des Formes, France
WMB.5 Securized Flexible Vocabulary Voice Messaging System on Unix Workstation with ISDN Connection 1615
Philippe Renevey, Andrzej Drygajlo
Swiss Federal Institute of Technology Lausanne, Switzerland
WMB.6 Automatic Deriving of Multiple Variants of Phonetic Transcriptions from Acoustic Signals 1619
Houda Mokbel, Denis Jouvet
France Telecom, France
WMB.7 Improved Bimodal Speech Recognition Using Tied-Mixture HMMs and 5000 Word Audio-Visual Synchronous Database 1623
Satoshi Nakamura, Ron Nagai, Kiyohiro Shikano
Nara Institute of Science and Technology, Japan
WMB.8 On the Use of Phone Duration and Segmental Processing to Label Speech Signal 1627
Philippe Depambour, Regine André-Obrecht, *Bernard Delyon
IRIT - Equipe IHMPT, France
*IRISA, France
WMB.9 Automatic Detection of Disturbing Robot Voice- and Ping Pong-Effects in GSM Transmitted Speech 1631
Martin Paping, Thomas Fahnle
Ascom Systec AG, Switzerland
WMB.10 Speech Synthesis Using Phase Vocoder Techniques 1635
Joseph Di Martino
Univ. Henri Poincaré Nancy I, France
WMB.11 Integration of Eye Fixation Information with Speech Recognition Systems 1639
Ramesh R. Sarukkai, *Craig Hunter
Univ. of Rochester, USA
*Univ. of Rochester, USA
WMB.12 Generation of Broadband Speech from Narrowband Speech Using Piecewise Linear Mapping 1643
Yoshihisa Nakatoh, M. Tsushima, T. Norimatsu
Matsushita Electric Industrial Co, Ltd, Japan
WMB.13 An Assessment of the Benefits Active Noise Reduction Systems Provide to Speech Intelligibility in Aircraft Noise Environments 1647
Ian E.C. Rogers
Defence Evaluation and Research Agency, UK
WMB.14 OLGA - A Dialogue System with an Animated Talking Agent 1651
Jonas Beskow, Kjell Elenius, *Scott McGlashan
KTH, Sweden
*Swedish Institute for Computer Science, Sweden
WMB.15 Towards Usable Multimodal Command Languages: Definition and Ergonomic Assessment of Constraints on Users' Spontaneous Speech and Gestures 1655
Sandrine Robbe, Noelle Carbonell, *Claude Valot
CRIN, France
*IMASSA-CERMA, France
WMB.16 Exploiting Repair Context in Interactive Error Recovery 1659
Bernhard Suhm, Alex Waibel
Carnegie Mellon Univ., USA
WMB.17 An Hybrid Image Processing Approach to LipTracking Independent of Head Orientation 1663
Lionel Reveret, *Frederique Garcia, †Christian Benoit, *Eric Vatikiotis-Bateson
INPG/ENSERG, France
*ATR, Japan
†INPG, France
WMB.18 Automatic Modeling of Coarticulation in Text-to-Visual Speech Synthesis 1667
Bertrand Le Goff
Univ. of Stendhal, France
WMB.19 A Multimedia Platform for Audio-Visual Speech Processing 1671
Ali Adjoudani, Thierry Guiard-Marigny, Bertrand Le Goff, Lionel Reveret, Christian Benoit
Univ. of Stendhal, France
WMB.20 An Intelligent System for Information Retrieval Over the Internet Through Spoken Dialogue 1675
Hiroya Fujisaki, *Hiroyuki Kameda, Sumio Ohno, Takuya Ito, Ken Tajima, Kenji Abe
Science Univ. of Tokyo, Japan
*Tokyo Engineering Univ., Japan
WMB.21 Data Hiding in Speech Using Phase Coding 1679
Yasemin Yardimci, *Enis A Cetin, *Rashid Ansari
Bilkent University, Turkey
*Univ. of Illinois, USA
WMB.22 CAVE: An On-Line Procedure for Creating and Running Auditory-Visual Speech Perception Experiments-Hardware, Software, and Advantages 1683
Denis Burnham, John Fowler, Michelle Nicol
Univ. of NSW, Australia
SESSION: WMC
Databases, Tools and Evaluations
Chair: Khalid Choukri, ELRA
WMC.1 The Bavarian Archive for Speech Signals: Resources for the Speech Community 1687
Florian Schiel, Christoph Draxler, Hans G. Tillmann
Univ. of Muenchen, Germany
WMC.2 WWWTranscribe - A Modular Transcription System Based on the World Wide Web 1691
Christoph Draxler
Univ. of Munich, Germany
WMC.3 Design, Recording and Verification of a Danish Emotional Speech Database 1695
Inger S. Engberg, Anya Varnich Hansen, Ove Andersen, Paul Dalsgaard
Aalborg Univ., Denmark
WMC.4 Issues in Database Creation: Recording New Populations, Faster and Better Labeling 1699
Maxine Eskenazi, C. Hogan, J. Allen, R. Frederking
Carnegie Mellon Univ., USA
WMC.5 Design and Analysis of a German Telephone Speech Database for Phoneme Based Training 1703
Stefan Feldes, Bernhard Kaspar, *Denis Jouvet
Deutsche Telekom Berkom, Germany
*France Telecom CNET, France
WMC.6 The Design of a Large Vocabulary Speech Corpus for the Portuguese 1707
Joao P. Neto, Ciro A. Martins, Hugo Meinedo, Luis B. Almeida
INESC, Portugal
WMC.7 Continued Investigations of Laryngectomee Speech in Noise - Measurements and Intelligibility Tests 1711
Lennart Nord, Britta Hammarberg, Elisabet Lunström
KTH, Sweden
WMC.8 An Appreciation Study of an ASR Inquiry System 1715
L.J.M. Rothkrantz, W.A.Th. Manintveld, M.M.M. Rats, R.J. van Vark, J.P.M. de Vreught, H. Koppelaar
Delft Univ. of Technology, The Netherlands
WMC.9 Object-Oriented Modeling of Articulatory Data for Speech Research Information Systems 1719
Kamel Bensaber, Paul Munteanu, Jean-Francois Serignat, Pascal Perrier
Institut de la Communication Parlée (ICP), France
WMC.10 A Korean Speech Corpus for Train Ticket Reservation Aid System Based on Speech Recognition 1723
Kim Woosung, Koo Myoung-Wan
Korea Telecom, Korea
WMC.11 Recall Memory for Earcons 1727
Dawn Dutton, *Candace Kamm, Susan Boyce
AT&T, USA
*AT&T Labs-Research, USA
WMC.12 Semi-Automatic Phonetic Labelling of Large Corpora 1731
Odile Mella, Dominique Fohr
CRIN-CNRS & INRIA Lorraine, France
WMC.13 Corpora-Speech Database for Polish Diphones 1735
Stefan Grocholewski
Poznan Univ. of Technology, Poland
WMC.14 Multilingual Speech Interfaces (MSI) and Dialogue Design Environments for Computer Telephony Services 1739
Christel Muller, Thomas Ziem
Deutsche Telekom Berkom GmbH, Germany
WMC.15 Getting Started with SUSAS: A Speech Under Simulated and Actual Stress Database 1743
John H.L Hansen, Sahar E. Bou-Ghazale
Duke Univ., USA
WMC.16 A Markup Language for Text-To-Speech Synthesis 1747
Richard Sproat, *Paul Taylor, Michael Tanenblatt, *Amy Isard
Bell Labs, USA
*Univ. of Edinburgh, UK
WMC.17 Several Measures for Selecting Suitable Speech Corpora 1751
Shuichi Itahashi, Naoko Ueda, Mikio Yamamoto
Univ. of Tsukuba, Japan
WMC.18 Greek Speech Database for Creation of Voice Driven Teleservices 1755
Irene Chatzi, *Nikos Fakotakis, *George Kokkinakis
KNOWLEDGE SA, Greece
*Univ. of Patras, Greece
SESSION: WMD
Applications of Speech Technology
Chair: Richard Winski, Vocalis, UK
WMD.1 Analysis of Infant Cries for the Early Detection of Hearing Impairment 1759
Sebastian Möller, *Rainer Schönweiler
Ruhr-Universität Bochum, Germany
*Hannover Med. School, Germany
WMD.2 Optical Logo-Therapy (OLT): A Computer-Based Real Time Visual Feedback Application for Speech Training 1763
A. Hatzis, Phil Green, S.J. Howard
Univ. of Sheffield, UK
WMD.3 Intelligent Retrieval of Very Large Chinese Dictionaries with Speech Queries 1767
Sung-Chien Lin, *Lee-Feng Chien, *Ming-Chiuan Chen, *Lin-Shan Lee, *Ker-Jiann Chen
National Taiwan University, ROChina
*Academia Sinica, ROChina
WMD.4 Preliminary Results of a Multilingual Interactive Voice Activated Telephone Service for People-on-the-Move 1771
Fulvio Leonardi, Giorgio Micca, Sheyla Militello, Mario Nigra
Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy
WMD.5 Assessment of an Operational Dialogue System Used by a Blind Telephone Switchboard Operator 1775
Jean-Christophe Dubois, *Yolande Anglade, Dominique Fohr
CRIN-CNRS, France
*IRISA-LLI, France
WMD.6 STACC: An Automatic Service for Information Access Using Continuous Speech Recognition Through Telephone Line 1779
Antonio J. Rubio, Pedro Garcia, Angel de la Torre, Jose C. Segura, Jesus Diaz-Verdejo, Maria C. Benitez, Victoria Sanchez, Antonio M. Peinado, Juan M. Lopez-Soler, Jose L. Perez-Cordoba
Universidad de Granada, Spain
WMD.7 A Voice Activated Dialog System for Fast-Food Restaurant Applications 1783
Ramon Lopez-Cozar, Pedro Garcia, J. Diaz, Antonio J. Rubio
Universidad de Granada, Spain
WMD.8 Multi-Microphone Sub-band Adaptive Signal Processing for Improvement of Hearing Aid Performance 1787
Paul W. Shields, Douglas R. Campbell
Univ. of Paisley, Scotland
WMD.9 Tactile Transmission of Intonation and Stress 1791
Hans Georg Piroth, Thomas Arnhold
Univ. of Munich, Germany
WMD.10 Hearing Impairment Simulation: An Interactive Multimedia Programme on the Internet for Students of Speech Therapy 1795
Kerttu Huttunen, Pentti Korkko, Martti Sorri
Univ. of Oulu, Finland
WMD.11 Analysis of Dysarthric Speech by Means of Formant-to-Area Mapping 1799
Sorin Ciocea, Jean Schoentgen, *Lisa Crevier-Buchman
Univ. Libre de Bruxelles, Belgium
*Laennec Hospital, France
WMD.12 An Intelligent Telephone Answering System Using Speech Recognition 1803
Boris M. Lobanov, Simon V. Brickle, Andrey V. Kubashin, Tatiana V. Levkovskaja
Academy of Sciences of Belarus, ROBelarus
WMD.13 Speedata: A Prototype for Multilingual Spoken Data-Entry 1807
Ulla Ackermann, *Bianca Angelini, *Fabio Brugnara, *Marcello Federico, *Diego Giuliani, *Roberto Gretter, Heinrich Niemann
FORWISS, Germany
*Istituto per la Ricerca Scientifica e Tecnologica (IRST), Italy
WMD.14 Applications for the Hearing-Impaired: Evaluation of Finnish Phoneme Recognition Methods 1811
Matti Karjalainen, *Peter Boda, Panu Somervuo, Toomas Altosaar
Helsinki Univ. of Technology, Finland
*Nokia Research Centre, Finland
WMD.15 Applications for the Hearing-Impaired: Comprehension of Finnish Text with Phoneme Errors 1815
Nina Alarotu, Mietta Lennes, *Toomas Altosaar, †Anja Malm, *Matti Karjalainen
Univ. of Helsinki, Finland
*Helsinki Univ. of Technology, Finland
†Finnish Association of the Deaf, Finland
WMD.16 ACCeSS - Automated Call Center Through Speech Understanding System 1819
Ute Ehrlich, *Gerhard Hanrieder, †Ludwig Hitzenberger, Paul Heisterkamp, Klaus Mecklenburg, Peter Regel-Brietzmann
Daimler Benz AG, Germany
*Daimler Benz Aerospace, Germany
†Univ. of Regensburg, Germany
WMD.17 Integrating a Radio Model with a Spoken Language Interface for Military Simulations 1823
E. Richard Anthony, Charles Bowen, Margot T. Peet, Susan Tammaro
The MITRE Corporation, USA
WMD.18 On Field Experiments of Continuous Digit Recognition Over the Telephone Network 1827
Daniele Falavigna, Roberto Gretter
Istituto per la Ricerca Scientifica e Tecnologica (IRST), Italy
WMD.19 An HMM-Based Phoneme Recognizer Applied to Assessment of Dysarthric Speech 1831
Xavier Menendez-Pidal, *James B. Polikoff, *H.Timothy Bunnell
SONY Electronics Inc., USA
*Applied Science & Engineering Laboratories, USA
WMD.20 Multiapplication Platform Based on Technology for Mobile Telephone Network Services 1835
Celinda de la Torre, Gonzalo Alonso
Telefonica I+D, Spain
WMD.21 Field Test of a Calling Card Service Based on Speaker Verification and Automatic Speech Recognition 1839
Els den Os, *Lou Boves, †David James, ‡Richard Winski, ¤Kurt Fridh
KPN Research, The Netherlands
*KUN, The Netherlands
†Ubilab, Switzerland
‡Vocalis, England
¤Telia, Sweden
WMD.22 Speech: A Privileged Modality 1843
Luc E. Julia, Adam J Cheyer
SRI International, USA
SESSION: Th1A
Speaker Adaptation I
Chair: Harald Hoege, Siemens AG, Germany
Th1A.1 Combined On-line Model Adaptation and Bayesian Predictive Classification for Robust Speech Recognition 1847
Qiang Huo, *Chin-Hui Lee
ATR Interpreting Telecommunications Res. Labs., Japan
*Multimedia Communications Research Lab., USA
Th1A.2 Speaker Adaptive Training Applied to Continuous Mixture Density Modeling 1851
Xavier Aubert, Eric Thelen
Philips GmbH, Germany
Th1A.3 Speaker Normalization Training for Mixture Stochastic Trajectory Model 1855
Irina Illina, *Yifan Gong
CRIN-CNRS, France
*Texas Instruments, USA
Th1A.4 On-line Adaptation of Hidden Markov Models Using Incremental Estimation Algorithms 1859
Vassilios Digalakis
Technical Univ. of Crete, Greece
Th1A.5 Modeling Dependency in Adaptation of Acoustic Models Using Multiscale Tree Processes 1863
Ashvin Kannan, Mari Ostendorf
Boston Univ., USA
Th1A.6 Acoustic Clustering and Adaptation for Robust Speech Recognition 1867
Larry Heck, Ananth Sankar
SRI International, USA
SESSION: Th1B
Dialogue Systems: Design and Applications
Chair: Lou Boves, Univ. of Nijmegen, The Netherlands
Th1B.1 Learning The Structure of Mixed Initiative Dialogues Using A Corpus of Annotated Conversations 1871
Giovanni Flammia, Victor Zue
MIT, USA
Th1B.2 AMICA: The AT&T Mixed Initiative Conversational Architecture 1875
Roberto Pieraccini, Esther Levin, Wieland Eckert
AT&T, USA
Th1B.3 Generating Semantically Consistent Inputs to a Dialog Manager 1879
Alicia Abella, Allen L. Gorin
AT&T Research Labs., USA
Th1B.4 A Stochastic Model of Computer-Human Interaction for Learning Dialogue Strategies 1883
Esther Levin, Roberto Pieraccini
AT&T Labs-Research, USA
Th1B.5 Semantic Processing of Out-of-Vocabulary Words in a Spoken Dialogue System 1887
Manuela Boros, *Maria Aretoulaki, *Florian Gallwitz, *Elmar Noeth, *Heinrich Niemann
FORWISS, Germany
*Univ. of Erlangen-Nürnberg, Germany
Th1B.6 Clarification Dialogues in VERBMOBIL 1891
Elisabeth Maier
DFKI GmbH, Germany
SESSION: Th1C
Assessment Methods
Chair: John Makhoul, BBN Systems and Techs, USA
Th1C.1 The DET Curve in Assessment of Detection Task Performance 1895
Alvin Martin, George Doddington, Terri Kamm, Mark Ordowski, Mark Przybocki
National Institute of Standards and Technology, SRI, Dept. of Defense, USA
Th1C.2 Speech Quality Evaluation of Hands-Free Terminals 1899
Harald Klaus, Ekkehard Diedrich, Astrid Dehnel, Jens Berger
Deutsche Telekom Berkom GmbH, Germany
Th1C.3 Use of Broadcast News Materials for Speech Recognition Benchmark Tests 1903
David S. Pallett, Jonathan G. Fiscus, William Fisher, John S. Garofolo,
National Institute of Standards and Technology (NIST), USA
Th1C.4 Spoken Dialogue Evaluation: A First Framework for Reporting Results 1907
Norman Fraser
Univ. of Surrey, UK
Th1C.5 Generality and Transferability. Two Issues in Putting a Dialogue Evaluation Tool into Practical Use 1911
Niels Ole Bernsen, Hans Dybkjær, Laila Dybkjær, Vytautas Zinkevicius
Odense Univ., Denmark
Prolog Development Center A/S, Denmark
Th1C.6 Within-Speaker Variability of the Word Error Rate for a Continuous Speech Recognition System 1915
David A. van Leeuwen, Herman J. M. Steeneken
TNO, The Netherlands
SESSION: Th1D (SPECIAL SESSION)
Education for Language and Speech Communication I
Chair: Gerrit Bloothooft, Utrecht Univ., The Netherlands
Th1D.1 Opportunities for Computer-Aided Instruction in Phonetics and Speech Communication Provided by the Internet 1919
Mark Huckvale, *C. Benoit, †C. Bowerman, ‡Anders Eriksson, ¤M. Rosner, **M. Tatham, ††Briony Williams
University College London, UK
*ICP, France
†Univ. of Sunderland, UK
‡Univ. of Umeå, Sweden
¤Univ. of Malta, Malta
**Univ. of Essex, UK
††Univ. of Edinburgh, UK
Th1D.2 The Landscape of Future Education in Speech Communication Sciences 1923
Gerrit Bloothooft
Utrecht Univ., The Netherlands
Th1D.3 An Integrated System for Teaching Spoken Dialogue Systems Technology 1927
Kare Sjolander, Joakim Gustafson
KTH, Sweden
Th1D.4 Communication Science within Education for Logopedics/Speech and Language Therapy in Europe: The State of the Art 1931
Janet Beck, *Bernard Camilleri, †Hilde Chantrain, ‡Anu Klippi, ¤Marianne Leterme, **Matti Lehtihalmes, ††Peter Schneider, ‡‡Wilhelm Vieregge, ¤¤Eva Wigforss
Queen Margaret College, UK
*Hogeschool Antwerpen,
†Univ. of Malta, Malta
‡Univ. of Helsinki, Finland
¤CPLOL,
**Univ. of Oulu, Finland
††RWTH Aachen,
‡‡Univ. of Nijmegen, The Netherlands
¤¤Lund University, Sweden
Th1D.5 Education in Spoken Language Engineering in Europe 1935
Phil Green, *Carlos Espain
Univ. of Sheffield, UK
*Univ. of Porto, Portugal
Th1D.6 A Survey of Phonetics Education in Europe 1939
Valerie Hazan, *Wim van Dommelen
Univ. College London, UK
*Norwegian Univ. of Science and Technology, Norway
SESSION: Th2A
Hybrid Systems for ASR
Chair: Shigeki Sagayama, NTT Human Interface Labs, Japan
Th2A.1 Matching Training and Testing Criteria in Hybrid Speech Recognition Systems 1943
Xin Tu, Yonghong Yan, Ron Cole
Oregon Graduate Institute of Science and Technology, USA
Th2A.2 Context Independent and Context Dependent Hybrid HMM/ANN Systems for Vocabulary Independent Tasks 1947
Stephane Dupont, Christophe Ris, Olivier Deroo, Vincent Fontaine, Jean-Marc Boite, L. Zanoni
FPMs-TCTS, Belgium
Th2A.3 Estimation of Global Posteriors and Forward-Backward Training of Hybrid HMM/ANN Systems 1951
J. Hennebert, *Christophe Ris, †Hervé Bourlard, ‡Steve Renals, Nelson Morgan
ICSI, USA
*TCTS, Belgium
†IDIAP, Switzerland
‡Univ. of Sheffield, UK
Th2A.4 Confidence Measures for Hybrid HMM/ANN Speech Recognition 1955
Gethin Williams, Steve Renals
Univ. of Sheffield, UK
Th2A.5 Ensemble Methods for Connectionist Acoustic Modelling 1959
Gary D. Cook, Steve R. Waterhouse, Tony A. Robinson
Cambridge Univ., UK
Th2A.6 Improving Performance on Switchboard by Combining Hybrid HME/HMM and Mixture of Gaussians Acoustic Models 1963
Jurgen Fritsch, *Michael Finke
Univ. of Karlsruhe, Germany
*Carnegie Mellon Univ., USA
SESSION: Th2B
Topic and Dialogue Dependent Language Modelling
Chair: Frederic Jelinek, Johns Hopkins Univ. Baltimore, MD, USA
Th2B.1 Experiments in Adaptation of Language Models for Commercial Applications 1967
Petra Witschel, Harald Höge
Siemens AG, Germany
Th2B.2 Language Model Adaptation Using Dynamic Marginals 1971
Reinhard Kneser, Jochen Peters, Dietrich Klakow
Philips GmbH, Germany
Th2B.3 Transforming Out-of-Domain Estimates to Improve In-Domain Language Models 1975
Rukmini Iyer, Mari Ostendorf
Boston Univ., USA
Th2B.4 MDI Adaptation of Language Models Across Corpora 1979
P. Srinivasa Rao, Satya Dharanipragada, Salim Roukos
IBM, USA
Th2B.5 A Class Based Approach to Domain Adaptation and Constraint Integration for Empirical M-Gram Models 1983
Klaus Ries
Carnegie Mellon Univ., USA
Th2B.6 Using Story Topics for Language Model Adaptation 1987
Kristie Seymore, Ronald Rosenfeld
Carnegie Mellon Univ., USA
SESSION: Th2C
Lipreading
Chair: Christian Benoit, ICP, Univ. Stendhal, France
Th2C.1 Towards Speaker Independent Continuous Speechreading 1991
Juergen Luettin
IDIAP, Switzerland
Th2C.2 Driving Synthetic Mouth Gestures: Phonetic Recognition for FaceMe! 1995
William Goldenthal, Keith Waters, Jean-Manuel Van Thong, Oren Glickman
Digital Equipment Corp., USA
Th2C.3 Continuous Visual Speech Recognition Using Geometric Lip-Shape Models and Neural Networks 1999
Alexandrina Rogozan, Paul Deleglise
Laboratoire d'Informatique de l'Univ. du Maine (LIUM), France
Th2C.4 The Teleface Project Multi-Modal Speech-Communication for the Hearing Impaired 2003
Jonas Beskow, Martin Dahlquist, Björn Granström, Magnus Lundeberg, Karl-Erik Spens, Tobias Ohman
KTH, Sweden
Th2C.5 Real-Time Lip-Tracking for Lipreading 2007
Rainer Stiefelhagen, Uwe Meier, *Jie Yang
Univ. Karlsruhe, Germany
*Carnegie Mellon Univ., USA
Th2C.6 From Raw Image of the Lips to Articulatory Parameters: A Viseme-Based Prediction 2011
Lionel Reveret
Univ. of Stendhal, France
SESSION: Th2D
Articulatory Modelling
Chair: Louis Pols, Univ. of Amsterdam, The Netherlands
Th2D.1 Adaptation of Maeda's Model for Acoustic to Articulatory Inversion 2015
Bruno Mathieu, Yves Laprie
CRIN-CNRS & INRIA, France
Th2D.2 Why Should Speech Control Studies Based on Kinematics be Considered with Caution? Insights from a 2D Biomechanical Model of the Tongue 2019
Yohan Payan, Pascal Perrier
Institut de la Communication Parlée, France
Th2D.3 An Integrated Model of the Biomechanics and Neural Control of the Tongue, Jaw, Hyoid and Larynx System 2023
Vittorio Sanguineti, *Rafael Laboissiere, †David J. Ostry
Universita' di Genova, Italy
*Institut de la Communication Parlée, France
†McGill Univ., Canada
Th2D.4 Using MRI to Image the Moving Vocal Tract During Speech 2027
M. Mohammad, *E. Moore, J.N. Carter, C.H. Shadle, S.J Gunn
Univ. of Southampton, UK
*Southampton General Hospital, UK
Th2D.5 Unified Physiological Model of Audible-Visible Speech Production 2031
Eric Vatikiotis-Bateson, Hani Yehia
ATR, Japan
Th2D.6 Motor Control Information Recovering from the Dynamics with the EP Hypothesis 2035
Helene Loevenbruck, Pascal Perrier
ICP, INPG, France
SESSION: ThMA
Front-Ends and Adaptation to Acoustics, Speaker Adaptation
Chair: Hervé Bourlard, IDIAP, Switzerland
ThMA.1 Speaker Adaptation for Context-Dependent HMM Using Spatial Relation of Both Phoneme Context Hierarchy and Speakers 2039
Yasuhiro Komori, Tetsuo Kosaka, Masayuki Yamada, Hiroki Yamamoto
Canon Inc., Japan
ThMA.2 Fast Algorithm for Speech Recognition Using Speaker Cluster HMM 2043
Masayuki Yamada, Yasuhiro Komori, Tetsuo Kosaka, Hiroki Yamamoto
Canon Inc., Japan
ThMA.3 A Comparison of Novel Techniques for Instantaneous Speaker Adaptation 2047
Timothy J. Hazen, James R. Glass
MIT, USA
ThMA.4 Fast Adaptation of Acoustic Models to Environmental Noise Using Jacobian Adaptation Algorithm 2051
Yoshikazu Yamaguchi, Satoshi Takahashi, Shigeki Sagayama
NTT, Japan
ThMA.5 Unsupervised HMM Adaptation Based on Speech-Silence Discrimination 2055
Ilija Zeljkovic, Shrikanth Narayanan, Alexandros Potamianos
AT&T Labs-Research, USA
ThMA.6 Correlation Based Predictive Adaptation of Hidden Markov Models 2059
Mohamed Afify, Yifan Gong, Jean-Paul Haton
CRIN-CNRS, France
ThMA.7 Adaptation of Hidden Markov Models Using Multiple Stochastic Transformations 2063
Vassilios Diakoloukas, Vassilios Digalakis
Technical Univ. of Crete, Greece
ThMA.8 Transformation Smoothing for Speaker and Environmental Adaptation 2067
M.J.F. Gales
Cambridge Univ., UK
ThMA.9 Nonlinear Discriminant Analysis for Improved Speech Recognition 2071
Vincent Fontaine, Christophe Ris, Jean-Marc Boite
FPMs-TCTS, Belgium
ThMA.10 On the Interplay Between Auditory-Based Features and Locally Recurrent Neural Networks for Robust Speech Recognition In Noise 2075
Jurgen Tchorz, *Klaus Kasper, *Herbert Reininger, Birger Kollmeier
Carl von Ossietzky-Universität, Germany
*Johann Wolfgang Goethe-Universität, Germany
ThMA.11 Speech Recognition Using On-Line Estimation of Speaking Rate 2079
Nelson Morgan, Eric Fosler, Nikki Mirghafori
International Computer Science Institute, USA
ThMA.12 Using Formant Frequencies in Speech Recognition 2083
John N. Holmes, *Wendy J. Holmes, *Philip N. Garner
Speech Technology Consultant, UK
*SRU/DRA, UK
ThMA.13 Speaker Normalization and Speaker Adaptation -- A Combination for Conversational Speech Recognition 2087
Puming Zhan, *Martin Westphal, *Michael Finke, Alex Waibel
Carnegie Mellon Univ., USA
*Univ. of Karlsruhe, Germany
ThMA.14 Speaker Adaptation Based on Pre-Clustering Training Speakers 2091
Yuqing Gao, Mukund Padmanabhan, Michael Picheny
IBM T.J, USA
ThMA.15 A Fast Method of Speaker Normalisation Using Formant Estimation 2095
Mike Lincoln, Stephen Cox, *Simon Ringland
Univ. of East Anglia, UK
*British Telecom, UK
ThMA.16 Acoustic Front-End Optimization for Large Vocabulary Speech Recognition 2099
Lutz Welling, N. Haberland, Hermann Ney
RWTH Aachen-Univ. of Technology, Germany
ThMA.17 Improving Autoregressive Hidden Markov Model Recognition Accuracy Using a Non-Linear Frequency Scale with Application to Speech Enhancement 2103
B.T. Logan, A.J. Robinson
Cambridge Univ., UK
ThMA.18 Designing a Reduced Feature-Vector Set for Speech Recognition by Using KL/GPD Competitive Training 2107
Tsuneo Nitta, Akinori Kawamura
Toshiba Multimedia Eng. Lab., Japan
ThMA.19 Speaker Adaptation by Correlation (ABC) 2111
Scott Shaobing Chen, Peter DeSouza
IBM, USA
SESSION: ThMB
Speech Perception
Chair: Paul Taylor, Univ. of Edinburgh, UK
ThMB.1 Preliminary Experiments on the Perception of Double Semivowels 2115
William A. Ainsworth, Georg F. Meyer
Keele Univ., UK
ThMB.2 Does Syllable Frequency Affect Production Time in a Delayed Naming Task? 2119
Niels Olaf Schiller
Max Planck Institute, The Netherlands
ThMB.3 Human and Machine Identification of Consonantal Place of Articulation from Vocalic Transition Segments 2123
Andrew C. Morris, *Gerrit Bloothooft, †William J. Barry, †Bistra Andreeva, †Jacques Koreman
Univ. of Sheffield, UK
*Utrecht Institute of Linguistics, The Netherlands
†Univ. of Saarbrucken, Germany
ThMB.4 Modelling the Recognition of Spectrally Reduced Speech 2127
Jon Barker, Martin Cooke
Univ. of Sheffield, UK
ThMB.5 Prosodic Structure and Phonetic Processing: A Cross-Linguistic Study 2131
*Christophe Pallier, Anne Cutler, *Nuria Sebastian-Galles
Max Planck Institute for Psycholinguistics, The Netherlands
*Univ. of Barcelona, Spain
ThMB.6 The Correlation Between Consonant Identification and The Amount of Acoustic Consonant Reduction 2135
R. J. J. H. van Son, Louis C.W. Pols
Univ. of Amsterdam, The Netherlands
ThMB.7 Relevant Spectral Information for the Identification of Vowel Features from Bursts 2139
Anne Bonneau
CRIN-CNRS, France
ThMB.8 Perceptual Study of Intersyllabic Formant Transitions in Synthesized V1-V2 in Standard Chinese 2143
Aijun Li
Chinese Academy of Social Sciences, China
ThMB.9 Role of Perception of Rhythmically Organized Speech in Consolidation Process of Long-Term Memory Traces (LTM-Traces) and in Speech Production Controlling 2147
Oleg P. Skljarov
Research Institute of Otolaryngology and Speech Pathology, Russia
ThMB.10 Sequential Probabilities as a Cue for Segmentation 2151
Arie H. van der Lugt
Max Planck Institute for Psycholinguistics, The Netherlands
ThMB.11 Perception and Acoustics of Emotions in Singing 2155
Susan Jansens, Gerrit Bloothooft, Guus de Krom
Utrecht Univ., The Netherlands
ThMB.12 Phonemes and Syllables in Speech Perception: Size of Attentional Focus in French 2159
Christophe Pallier
Max Planck Institute for Psycholinguistics, The Netherlands
ThMB.13 Quality of a Vowel with Formant Undershoot: a Preliminary Perceptual Study 2163
Shinichi Tokuma
Sophia University, Japan
ThMB.14 Segmental and Suprasegmental Contributions to Spoken-Word Recognition in Dutch 2167
Mariette Koster, Anne Cutler
Max Planck Institute for Psycholinguistics, The Netherlands
ThMB.15 Perception of Vowel Duration and Spectral Characteristics in Swedish 2171
Dawn M. Behne, *Peter E. Czigler, *Kirk P.H. Sullivan
Norwegian Univ. of Science and Technology, Norway
*Umeå Univ., Sweden
ThMB.16 Relative Contributions of Noise Burst and Vocalic Transitions to the Perceptual Identification of Stop Consonants 2175
Adrien Neagu, Gerard Bailly
Institut de la Communication Parlée, France
ThMB.17 Effect of Speaker Familiarity and Background Noise on Acoustic Features Used in Speaker Identification 2179
Satoshi Kitagawa, Makoto Hashimoto, Norio Higuchi
ATR ITL, Japan
ThMB.18 Dynamic Versus Static Specification for the Perceptual Identity of a Coarticulated Vowel 2183
Michel Piterman
Univ. de Provence, France
ThMB.19 Asymmetries in Consonant Confusion 2187
Madelaine C. Plauche, *Cristina Delogu, John J. Ohala
Univ. of California at Berkeley, USA
*Fondazione Ugo Bordoni, Italy
ThMB.20 Rime and Syllabic Effects in Phonological Priming Between French Spoken Words 2191
Nicolas Dumay, *Monique Radeau
Univ. de Genève, Switzerland
*Univ. Libre de Bruxelles, Belgium
ThMB.21 Roles of Static and Dynamic Features of Formant Trajectories in the Perception of Talk Individuality 2195
Weizhong Zhu, Hideki Kasuya
Utsunomiya Univ., Japan
SESSION: ThMC
Dialogue Systems: Linguistic Structures, Modelling and Evaluation
Chair: Niels Ole Bernsen, Roskilde Univ., Denmark
ThMC.1 Database Management and Analysis for Spoken Dialog Systems: Methodology and Tools 2199
Chih-mei Lin, Shrikanth Narayanan, Russell Ritenour
AT&T Labs-Research, USA
ThMC.2 Evaluating Spoken Dialog Systems for Telecommunication Services 2203
Candace Kamm, Shrikanth Narayanan, Dawn Dutton, Russell Ritenour
AT&T Labs-Research, USA
ThMC.3 Robust Spoken Dialogue Management for Driver Information Systems 2207
Xavier Pouteau, Emiel Krahmer, Jan Landsbergen
IPO/TUE, The Netherlands
ThMC.4 Using Acoustic and Prosodic Cues to Correct Chinese Speech Repairs 2211
Yue-Shi Lee, Hsin-Hsi Chen
National Taiwan University, ROChina
ThMC.5 Integrating Domain Specific Focusing In Dialogue Models 2215
Arne Jonsson, Nils Dahlback
Linkoping Univ., Sweden
ThMC.6 Evaluating Competing Agent Strategies for a Voice Email Agent 2219
Marilyn Walker, Donald Hindle, Jeanne Fromer, Giuseppe Di Fabbrizio, Craig Mestel
AT&T Labs-Research, USA
ThMC.7 Discourse Marker Use in Task-Oriented Spoken Dialog 2223
Donna K. Byron, *Peter A. Heeman
Univ. of Rochester, USA
*France Telecom, France
ThMC.8 From Interface to Content: Translingual Access and Delivery of On-line Information 2227
Victor Zue, Stephanie Seneff, James R. Glass, Lee Hetherington, Edward Hurley, Helen Meng, Christine Pao, Joseph Polifroni, Rafael Schloming, Philipp Schmid
MIT, USA
ThMC.9 Learning Dialogue Structures From a Corpus 2231
Jan Alexandersson, Norbert Reithinger
DFKI GmbH, Germany
ThMC.10 Dialogue Act Classification Using Language Models 2235
Norbert Reithinger, Martin Klesen
DFKI GmbH, Germany
ThMC.11 User's Multiple Goals in Spoken Dialogue 2239
Didier Pernel
Thomson-CSF / LCR, France
ThMC.12 Chatting with Interactive Agent 2243
Noriko Suzuki, Seiji Inokuchi, K. Ishii, Michio Okada
ATR Media Integration & Communications Research Laboratories, Japan
ThMC.13 Generic Template for the Evaluation of Dialogue Management Systems 2247
Gavin E. Churcher, Eric S. Atwell, Clive Souter
Centre for Computer Analysis of Language and Speech, UK
ThMC.14 Analysis of Interactive Strategy to Recover from Misrecognition of Utterances Including Multiple Information Items 2251
Yasuhisa Niimi, Takuya Nishimoto, Yutaka Kobayashi
Kyoto Institute of Technology, Japan
ThMC.15 A Referential Approach to Reduce Perplexity in the Vocal Command System Comppa 2255
Francois-Arnould Mathieu, Bertrand Gaiffe, Jean-Marie Pierrel
CRIN-CNRS&INRIA-Lorraine, France
ThMC.16 Linguistic Processor for a Spoken Dialogue System Based on Island Parsing Techniques 2259
Aristomenis Thanopoulos, Nikos Fakotakis, George Kokkinakis
Univ. of Patras, Greece
ThMC.17 Modelling of Speech-Based User Interfaces 2263
Brian Mellor, *Chris Baber
Speech Research Unit, UK
*Univ. of Birmingham, UK
ThMC.18 Can You Predict Responses to Yes/No Questions? Yes, No, and Stuff 2267
Beth Ann Hockey, *Deborah Rossen-Knill, Beverly Spejewski, Matthew Stone, †Stephen Isard
Univ. of Pennsylvania, USA
*Philadelphia College of Textiles and Science, USA
†Univ. of Edinburgh, UK
ThMC.19 DIA-MOLE: An Unsupervised Learning Approach to Adaptive Dialogue Models for Spoken Dialogue Systems 2271
Jens-Uwe Moeller
Univ. of Hamburg, Germany
ThMC.20 How Do System Questions Influence Lexical Choices in User Answers? 2275
Joakim Gustafson, Anette Larsson, Rolf Carlson, *K. Hellman
KTH, Sweden
*Stockholm Univ., Sweden
SESSION: ThMD
Speaker Recognition and Language Identification
Chair: Douglas Reynolds, MIT, USA
ThMD.1 Gaussian Mixture Models with Common Principal Axes and Their Application in Text-Independent Speaker Identification 2279
Kuo-Hwei Yuo, Hsiao-Chuan Wang
National Tsing Hua Univ., ROChina
ThMD.2 Speaker Models Designed from Complete Data Sets: A New Approach to Text-Independent Speaker Verification 2283
Dominik R. Dersch, *Robin W. King
Univ. of Sydney, Australia
*Univ. of South Australia, Australia
ThMD.3 A Double Gaussian Mixture Modeling Approach To Speaker Recognition 2287
Vergin Rivarol, Douglas O'Shaughnessy
INRS Telecommunications, Canada
ThMD.4 An Acoustic Subword Unit Approach to Non-Linguistic Speech Feature Identification 2291
Mohamed Afify, Yifan Gong, Jean-Paul Haton
CRIN-CNRS, France
ThMD.5 N-Best GMM's for Speaker Identification 2295
Chakib Tadj, *Pierre Dumouchel, †Yu Fang
Ecole de Technologie Superieure, Canada
*Centre de Recherche Informatique, Canada
†Institut Universitaire de Technologie, Canada
ThMD.6 Model Dependent Spectral Representations for Speaker Recognition 2299
Guillaume Gravier, *Chafic Mokbel, Gerard Chollet
ENST/SIG, France
*CNET-DIH/RCP, France
ThMD.7 Equalizing Sub-Band Error Rates in Speaker Recognition 2303
Roland Auckenthaler, *John S. Mason
Technical Univ. Graz, Austria
*Univ. of Wales Swansea, UK
ThMD.8 Automatic Gender Identification Under Adverse Conditions 2307
Stefan Slomka, Sridha Sridharan
Queensland Univ. of Technology, Australia
ThMD.9 Acoustic Features and Perceptive Processes in the Identification of Familiar Voices 2311
Yizhar Lavner, Isak Gath, Judith Rosenhouse
Israel Institute of Technology, Israel
ThMD.10 On the Use of Acoustic Segmentation in Speaker Identification 2315
Leandro Rodriguez-Linares, Carmen Garcia-Mateo
Univ. of Vigo, Spain
ThMD.11 Speaker Recognition by Humans and Machines 2319
Herman J.M. Steeneken, David A. Van Leeuwen
TNO-HFRI, The Netherlands
ThMD.12 Foreign Speaker Accent Classification Using Phoneme-Dependent Accent Discrimination Models and Comparisons with Human Perception Benchmarks 2323
Karsten Kumpf, *Robin W. King
Univ. of Sydney, Australia
*Univ. of South Australia, Australia
ThMD.13 A Comparison of Human and Machine In Speaker Recognition 2327
Li Liu, Jialong He, Günther Palm
Univ. of Ulm, Germany
ThMD.14 Evaluation of Second Language Learners' Pronunciation Using Hidden Markov Models 2331
Simo M.A. Goddijn, *Guus de Krom
Forensic Science Laboratory, Rijswijk, The Netherlands
*Univ. of Utrecht, The Netherlands
ThMD.15 Delta Vector Taylor Series Environment Compensation for Speaker Recognition 2335
Brian Eberman, Pedro J. Moreno
Digital Equipment Corp., USA
ThMD.16 Wavelet-Like Regression Features in the Cepstral Domain for Speaker Recognition 2339
Jonathan Hume
Univ. of Wales Swansea, UK
ThMD.17 Minimum Classification Error Linear Regression (MCELR) for Speaker Adaptation Using HMM with Trend Functions 2343
Rathinavelu Chengalvarayan
Bell Labs-Lucent Technologies, USA
ThMD.18 A Continuous HMM Text Independent Speaker Recognition System Based on Vowel Spotting 2347
Nikos Fakotakis, *Anastasios Tsopanoglou, Kallirroi Georgila
Univ. of Patras, Greece
*KNOWLEDGE SA, Greece
ThMD.19 On the Independence of Digits in Connected Digit Strings 2351
Johan W. Koolwaaij, Lou Boves
Nijmegen University, The Netherlands
ThMD.20 A New Procedure for Classifying Speakers in Speaker Verification Systems 2355
Johan W. Koolwaaij, Lou Boves
Nijmegen University, The Netherlands
ThMD.21 Sound Channel Video Indexing 2359
Claude Montacie, Marie-Jose Caraty
Univ. Pierre et Marie Curie - CNRS, France
ThMD.22 CDHMM Speaker Recognition By Means of Frequency Filtering of Filter-Bank Energies 2363
Javier Hernando, Climent Nadeu
Universitat Politecnica de Catalunya, Spain
SESSION: Th3A
Style and Accent Recognition
Chair: Gerard Chollet, ENST/SIG, France
Th3A.1 Using Accent-Specific Pronunciation Modelling for Improved Large Vocabulary Continuous Speech Recognition 2367
J. J. Humphries, P. C. Woodland
Cambridge Univ., UK
Th3A.2 Automatic Speech Recognition for Children 2371
Alexandros Potamianos, Shrikanth Narayanan, Sungbok Lee
AT&T Labs-Research, USA
Th3A.3 Recognition of Non-Native Accents 2375
Carlos Teixeira, Isabel Trancoso, Antonio Serralheiro
INESC, Portugal
Th3A.4 Speaking Mode Dependent Pronunciation Modeling in Large Vocabulary Conversational Speech Recognition 2379
Michael Finke, Alex Waibel
Carnegie Mellon Univ., USA
Th3A.5 A Prosody-Only Decision-Tree Model for Disfluency Detection 2383
Elizabeth Shriberg, *Rebecca Bates, Andreas Stolcke
SRI International, USA
*Boston Univ., USA
Th3A.6 A Novel Training Approach for Improving Speech Recognition Under Adverse Stressful Conditions 2387
Sahar E. Bou-Ghazale, John H.L. Hansen
Duke Univ., USA
SESSION: Th3B
Phonetics
Chair: Joaquim Llisterri, Univ. of Barcelona, Spain
Th3B.1 From Phone Identification to Phone Clustering Using Mutual Information 2391
Peter O'Boyle, Ji Ming, Marie Owens, F. Jack Smith
Queen's Univ. of Belfast, N. Ireland
Th3B.2 Phonetic Code Emergence in a Society of Speech Robots: Explaining Vowel Systems and the MUAF Principle 2395
Ahmed-Reda Berrah, Rafael Laboissiere
Institut de la Communication Parlée, France
Th3B.3 Effects of Voicing on /t,d/ Tongue/Palate Contact in English and Norwegian 2399
Inger Moen, Hanne Gram Simonsen
Univ. of Oslo, Norway
Th3B.4 Fieldwork Techniques for Relating Formant Frequency, Amplitude and Bandwidth 2403
Peter Ladefoged, *Gunnar Fant
UCLA, USA
*KTH, Sweden
Th3B.5 Word Juncture Modelling Based on the TIMIT Database 2407
Xue Wang, Louis C.W. Pols
Univ. of Amsterdam, The Netherlands
Th3B.6 The Phonology and Phonetics of Second Language Intonation: The Case of "Japanese English" 2411
Motoko Ueyama
UCLA, USA
SESSION: Th3C (SPECIAL SESSION)
Towards Robust ASR for Car and Telephone Applications
Chair: Jean-Claude Junqua, Panasonic Technologies Inc., California, USA
Th3C.1 Methods for Microphone Equalization in Speech Recognition 2415
L. Fissore, Giorgio Micca, C. Vair
Centro Studi e Laboratori Telecomunicazioni (CSELT), Italy
Th3C.2 Room Acoustics and Reverberation: Impact on Hands-Free Recognition 2419
Satoshi Nakamura, Kiyohiro Shikano
Nara Institute of Science and Technology, Japan
Th3C.3 Echo and Noise Reduction for Hands-Free Terminals -State of the Art- 2423
Gerard Faucon, Regine Le Bouquin-Jeannes
Univ. de Rennes I, France
Th3C.4 Robust Speech Recognition for Wireless Networks and Mobile Telephony 2427
Reinhold Haeb-Umbach
Philips GmbH, Germany
Th3C.5 Robust ASR for the Cellular Environment -
Jay Naik
Nynex, USA
(Not arrived in time to be included in the Proceedings)
Th3C.6 Speech Recognition in the Car From Phone Dialing to Car Navigation 2431
Dirk Van Compernolle
Lernout & Hauspie Speech Products NV, Belgium
SESSION: Th3D
Language Specific Systems
Chair: Christel Sorin, CNET, Lannion, France
Th3D.1 A Keyvowel Approach to the Synthesis of Regional Accents of English 2435
Briony Williams, Stephen Isard
Univ. of Edinburgh, UK
Th3D.2 Experimental Implementation of Pitch-Synchronous Synthesis Methods for the ROMVOX Text-to-Speech System 2439
Attila Ferencz, Radu Arsinte, *Istvan Nagy, Teodora Ratiu, Maria Ferencz, †Gavril Toderean, †Diana Zaiu, Tunde-Csilla Kovacs, Lajos Simon
Software ITC SA, Romania
*Music Academy Gh.Dima, Romania
†Technical Univ. of Cluj-Napoca, Romania
Th3D.3 The Bell Labs German Text-to-Speech System: An Overview 2443
Bernd Mobius, Richard Sproat, Jan P.H. van Santen, Joseph P. Olive
Bell Labs-Lucent Technologies, USA
Th3D.4 The Generation of Regional Pronunciations of English for Speech Synthesis 2447
Susan Fitt
Univ. of Edinburgh, UK
Th3D.5 Bell Laboratories Russian Text-To-Speech System 2451
Elena Pavlova, Yuri Pavlov, Richard Sproat, Chilin Shih, Jan P.H. van Santen
Bell Labs-Lucent Technologies, USA
Th3D.6 A Bilingual Text-To-Speech System in Spanish and Catalan 2455
Antonio Bonafonte, Ignasi Esquerra, Albert Febrer, Francesc Vallverdu
Universitat Politecnica de Catalunya, Spain
SESSION: Th4A
Pronunciation Models
Chair: Jean-Paul Haton, CRIN/CNRS-INRIA, France
Th4A.1 Automatic Rule-based Generation of Word Pronunciation Networks 2459
Nick Cremelie, Jean-Pierre Martens
Univ. of Gent, Belgium
Th4A.2 Creating User Defined New Vocabularies for Voice Dialing 2463
Jose Maria Elvira, Juan Carlos Torrecilla, Javier Caminero
Telefonica I+D, Spain
Th4A.3 Automatic Generation of Context-Dependent Pronunciations 2467
Mosur Ravishankar, Maxine Eskenazi
Carnegie Mellon Univ., USA
Th4A.4 Automatic Generation of a Pronunciation Dictionary Based on a Pronunciation Network 2471
Toshiaki Fukada, Yoshinori Sagisaka
ATR ITL, Japan
Th4A.5 What is Wrong with the Lexicon-An Attempt to Model Pronunciations Probabilistically 2475
Uwe Jost, Henrik Heine, Gunnar Evermann
Hamburg Univ., Germany
Th4A.6 Lexical Tuning Based on Triphone Confidence Estimation 2479
Kevin L. Markey, *Wayne Ward
Berdy Medical Systems, USA
*Carnegie Mellon Univ., USA
SESSION: Th4B
Auditory Modelling and Psychoacoustics
Chair: William Ainsworth, Keele Univ., UK
Th4B.1 Improving of Amplitude Modulation Maps for F0-Dependent Segregation of Harmonic Sounds 2483
Frederic Berthommier, *Georg Meyer
ICP, INPG, France
*Univ. of Keele, UK
Th4B.2 Psychophysical Evaluation of PSOLA: Natural Versus Synthetic Speech 2487
Reinier Kortekaas, Armin Kohlrausch
IPO, The Netherlands
Th4B.3 Perception of Noised Words by Normal Children and Children with Speech and Language Impairments 2491
Valentina V. Lublinskaja, *Inna V. Koroleva, A.N. Kornev, Elena V. Iagounova
Pavlov Institute of Physiology, Russia
*Institute of Ear, Throat, Nose and Speech Pathology, Russia
Th4B.4 Modeling the Perception of Simultaneous Semi-Vowels 2495
Georg F. Meyer, William A. Ainsworth
Keele Univ., UK
Th4B.5 Properties of Auditory Model Representations 2499
Fernando Santos Perdigao, Luis V. Sá
Universidade de Coimbra, Portugal
Th4B.6 Impact of "Ascending Sequence" AI (Auditory Primary Cortex) Cells on Stop Consonant Perception 2503
Marta Eduardo Sa, de Sa Luis Vieira
Universidade de Coimbra, Portugal
SESSION: Th4C
Voice Conversion and Data Driven F0-Models
Chair: Yoshinori Sagisaka, ATR Interpret. Telecom. Res. Labs., Japan
Th4C.1 Application-Dependent Prosodic Models for Text-To-Speech Synthesis and Automatic Design of Learning Database Corpus Using Genetic Algorithm 2507
Olivier Boeffard, F. Emerard
France Telecom-CNET, France
Th4C.2 Combinatorial Issues in Text-To-Speech Synthesis 2511
Jan P.H. van Santen
Bell Labs-Lucent Technologies, USA
Th4C.3 Automatic Corpus-Based Training of Rules for Prosodic Generation in Text-To-Speech 2515
Eduardo Lopez-Gonzalo, Jose M. Rodriguez-Garcia, Luis Hernandez-Gomez, Juan M. Villar
ETSIT-UPM, Spain
Th4C.4 Hidden Markov Model Based Voice Conversion Using Dynamic Characteristics of Speaker 2519
Eun-Kyoung Kim, Sangho Lee, Yung-Hwan Oh
KAIST, Korea
Th4C.5 Speaker Interpolation in HMM-Based Speech Synthesis System 2523
Takayoshi Yoshimura, *Takashi Masuko, Keiichi Tokuda, *Takao Kobayashi, Tadashi Kitamura
Nagoya Institute of Technology, Japan
*Tokyo Institute of Technology, Japan
Th4C.6 Designing a Speaker Adaptable Formant-Based Text-To-Speech System 2527
Vassilios Darsinos, Dimitrios Galanis, George Kokkinakis
Univ. of Patras, Greece
SESSION: Th4D
Vocal Tract Analysis
Chair: Andrea Paoloni, Fondazione Ugo Bordoni, Italy
Th4D.1 On Using Fractal Features of Speech Sounds in Automatic Speech Recognition 2531
Petros Maragos, *Alexandros Potamianos
ILSP & Georgia Tech, Greece & USA
*AT&T Labs-Research, USA
Th4D.2 Dynamic Constraint Weighting in the Context of Articulatory Parameter Estimation 2535
Hywel B. Richards, John S. Bridle, Melvyn J. Hunt, *John S. Mason
Dragon Systems, UK
*Univ. of Wales Swansea, UK
Th4D.3 Estimation of Vocal Tract Front Cavity Resonance in Unvoiced Fricative Speech 2539
Minkyu Lee, Donald G. Childers
Univ. of Florida, USA
Th4D.4 A Software Tool to Study Portuguese Vowels 2543
Antonio Teixeira, Francisco Vaz, *Jose Carlos Principe
INESC, Portugal
*Univ. of Florida, USA
Th4D.5 Post-Synchronization Via Formant-to-Area Mapping of Asynchronously Recorded Speech Signals and Area Functions 2547
Jean Schoentgen, Sorin Ciocea
Univ. Libre de Bruxelles, Belgium
Th4D.6 Geometrically and Acoustically Optimized Codebook for Unique Mapping from Formants to Vocal-Tract Shape 2551
Zhenli L. Yu, P.C. Ching
The Chinese Univ. of Hong Kong, Hong Kong
SESSION: ThAA
Noise Mitigation, Speech Enhancement II
Chair: Bayya Yegnanarayana, IIT MADRAS, India
ThAA.1 Noisy Speech Enhancement by Fusion of Auditory and Visual Information: A Study of Vowel Transitions 2555
Laurent Girin, Gang Feng, Jean-Luc Schwartz
Univ. of Stendhal, France
ThAA.2 Spectral Subtraction Using a Non-Critically Decimated Discrete Wavelet Transform 2559
Andreas Engelberg, Thomas Gulzow
Univ. of Kiel, Germany
ThAA.3 Bayesian Affine Transformation of HMM Parameters for Instantaneous and Supervised Adaptation in Telephone Speech Recognition 2563
Jen-Tzung Chien, Hsiao-Chuan Wang, *Chin-Hui Lee
National Tsing Hua Univ., ROChina
*Bell Labs, USA
ThAA.4 Integrated Bias Removal Techniques for Robust Speech Recognition 2567
Craig Lawrence, Mazin Rahim
Univ. of Maryland, USA
ThAA.5 Acoustic Front Ends for Speaker-Independent Digit Recognition in Car Environments 2571
Detlev Langmann, Alexander Fischer, Friedhelm Wuppermann, Reinhold Haeb-Umbach, Thomas Eisele
Philips GmbH, Germany
ThAA.6 Signal Bias Removal Using the Multi-Path Stochastic Equalization Technique 2575
Lionel Delphin-Poulat, Chafic Mokbel
France Telecom, France
ThAA.7 Subband Echo Cancellation in Automatic Speech Dialog Systems 2579
Andrej Miksic, Bogomir Horvat
Univ. of Maribor, Slovenia
ThAA.8 Speech Enhancement Via Energy Separation 2583
Hesham Tolba, Douglas O'Shaughnessy
Univ. du Quebec, Canada
ThAA.9 A Method of Signal Extraction from Noisy Signal 2587
Masashi Unoki, Masato Akagi
Japan Advanced Institute of Science and Technology, Japan
ThAA.10 Multi-Channel Noise Reduction Using Wavelet Filter Bank 2591
Jiri Sika, Vratislav Davidek
Czech Technical Univ., Czech Republic
ThAA.11 Speech Signal Detection in Noisy Environment Using a Local Entropic Criterion 2595
Imad Abdallah, Silvio Montresor, Marc Baudry
Laboratoire d'Informatique de l'Univ. du Maine, France
ThAA.12 A New Algorithm for Robust Speech Recognition: The Delta Vector Taylor Series Approach 2599
Pedro J. Moreno, Brian Eberman
Digital Equipment Corp., USA
ThAA.13 Robust Enhancement of Reverberant Speech Using Iterative Noise Removal 2603
David Cole, Miles Moody, Sridha Sridharan
Queensland Univ. of Technology, Australia
ThAA.14 A Network Speech Echo Canceller with Comfort Noise 2607
David J. Jones, Scott D. Watson, Kenneth G. Evans, Barry M.G. Cheetham, *Robert A. Reeves
Univ. of Liverpool, UK
*BT Laboratories, UK
ThAA.15 A New Metric for Selecting Sub-Band Processing in Adaptive Speech Enhancement Systems 2611
Amir Hussain, Douglas R. Campbell, Thomas J. Moir
Univ. of Paisley, UK
ThAA.16 Estimation of LPC Cepstrum Vector of Speech Contaminated by Additive Noise and its Application to Speech Enhancement 2615
Hidefumi Kobatake, Hideta Suzuki
Tokyo Univ. of Agriculture & Technology, Japan
ThAA.17 Multi-Band and Adaptation Approaches to Robust Speech Recognition 2619
Sangita Tibrewala, Hynek Hermansky
Oregon Graduate Institute of Science and Technology, USA
ThAA.18 Non-Quadratic Criterion Algorithms for Speech Enhancement 2623
Enrique Masgrau, Eduardo Lleida, Luis Vicente
Universidad de Zaragoza, Spain
SESSION: ThAB
F0 and Duration Modelling, Spoken Language Processing
Chair: Richard Schwartz, BBN Systems and Techs, USA
ThAB.1 Modeling Segmental Duration with Multivariate Adaptive Regression Splines 2627
Marcel Riedi
ETH Zentrum TIK, Switzerland
ThAB.2 High Quality Speech Synthesis for Phonetic Speech Segmentation 2631
Fabrice Malfrere, Thierry Dutoit
Circuits Theory and Signal Processing Lab, Belgium
ThAB.3 Factors Affecting Perceived Quality and Intelligibility in the CHATR Concatenative Speech Synthesiser 2635
Nick Campbell, Itoh Yoshiharu, Wen Ding, Norio Higuchi
ATR Interpreting Telecommunications Res. Labs., Japan
ThAB.4 Reduced Lexicon Trees for Decoding in a MMI-Connectionist/HMM Speech Recognition System 2639
Christoph Neukirchen, Daniel Willett, Gerhard Rigoll
Gerhard-Mercator-Univ. Duisburg, Germany
ThAB.5 A Stochastic Model of Intonation for French Text-to-Speech Synthesis 2643
Jean Veronis, Philippe Di Cristo, Fabienne Courtois, Benoit Lagrue
Univ. de Provence & CNRS, France
ThAB.6 Phonetic Rules for a Phonetic-to-Speech System 2647
Angelien A. Sanderman, *René Collier
KPN Research, The Netherlands
*Institute for Perception Research, The Netherlands
ThAB.7 Multi-Lingual Duration Modeling 2651
Jan P.H. van Santen, Chilin Shih, Bernd Mobius, Evelyne Tzoukermann, Michael Tanenblatt
Bell Labs-Lucent Technologies, USA
ThAB.8 A Model of Segment (and Pause) Duration Generation for Brazilian Portuguese Text-to-Speech Synthesis 2655
Plinio A. Barbosa
State Univ. of Campinas, Brazil
ThAB.9 Parsing Strategy for Spoken Language Interfaces with a Lexicalized Tree Grammar 2659
Ariane Halber, David Roussel
Thomson-CSF, France
ThAB.10 What's in a Word Graph -- Evaluation and Enhancement of Word Lattices 2663
Jan W. Amtrup, Henrik Heine, Uwe Jost
Hamburg Univ., Germany
ThAB.11 Accelerated DP Based Search for Statistical Translation 2667
Christoph Tillmann, Stefan Vogel, Hermann Ney, A. Zubiaga, H. Sawaf
RWTH Aachen, Germany
ThAB.12 Use of Pitch Pattern Improvement in the CHATR Speech Synthesis System 2671
Ken Fujisawa, Toshio Hirai, Norio Higuchi
ATR ITL, Japan
ThAB.13 Generating Segment Durations in a Text-to-Speech System: A Hybrid Rule-Based/Neural Network Approach 2675
Gerald Corrigan, Noel Massey, Orhan Karaali
Motorola, USA
ThAB.14 On the Global F0 Shape Model Using a Transition Network for Japanese Text-To-Speech Systems 2679
Yasushi Ishikawa, Takashi Ebihara
Mitsubishi Electric Corporation, Japan
ThAB.15 An Alternative and Flexible Approach in Robust Information Retrieval Systems 2683
Jose Colas, Juan M. Montero, Javier Ferreiros, Jose M. Pardo
Universidad Politecnica de Madrid, Spain
ThAB.16 A Probabilistic Approach to Analogical Speech Translation 2687
Keiko Horiguchi, Alexander Franz
Sony, Japan
ThAB.17 Dynamic Lexicon for a Very Large Vocabulary Vocal Dictation 2691
Marie-Jose Caraty, Claude Montacie, Fabrice Lefèvre
Univ. Pierre et Marie Curie - CNRS, France
SESSION: ThAC
Language Modelling
Chair: Ronald Rosenfeld, Carnegie Mellon Univ., USA
ThAC.1 Construction of Language Models Using the Morphic Generator Grammatical Inference (MGGI) Methodology 2695
Encarna Segarra, Luis Hurtado
Universidad Politecnica de Valencia, Spain
ThAC.2 An Integrated Language Modeling with N-Gram Model and WA Model for Speech Recognition 2699
Shuwu Zhang, Taiyi Huang
Chinese Academy of Sciences, China
ThAC.3 Statistical Analysis of Dialogue Structure 2703
Ye-Yi Wang, Alex Waibel
Carnegie Mellon Univ., USA
ThAC.4 Statistical Language Modeling Using the CMU-Cambridge Toolkit 2707
Philip Clarkson, *Ronald Rosenfeld
Cambridge Univ., UK
*Carnegie Mellon Univ., USA
ThAC.5 Text Normalization and Speech Recognition in French 2711
Gilles Adda, Martine Adda-Decker, Jean-Luc Gauvain, Lori Lamel
LIMSI, France
ThAC.6 A Novel Tree-Based Clustering Algorithm for Statistical Language Modeling 2715
Geraldine Damnati, Jacques Simonin
France Telecom, France
ThAC.7 Variable-Length Language Modeling Integrating Global Constraints 2719
Shoichi Matsunaga, Shigeki Sagayama
NTT, Japan
ThAC.8 An Hybrid Language Model for Continuous Dictation Prototype 2723
Kamel Smaili, Imed Zitouni, Francois Charpillet, Jean-Paul Haton
CRIN-CNRS & INRIA, Lorraine, France
ThAC.9 Dealing with Pronunciation Variants at the Language Model Level for the Continuous Automatic Speech Recognition of French 2727
Laure Pousse, Guy Perennou
IRIT - Equipe IHMPT, France
ThAC.10 Rational Interpolation of Maximum Likelihood Predictors in Stochastic Language Modeling 2731
Ernst Günter Schukat-Talamazzini, *Florian Gallwitz, *Stefan Harbeck, *Volker Warnke
Univ. of Jena, Germany
*Univ. of Erlangen, Germany
ThAC.11 N-Gram Language Model Adaptation Using Small Corpus for Spoken Dialog Recognition 2735
Akinori Ito, Hideyuki Saitoh, Masaharu Katoh, Masaki Kohda
Yamagata Univ., Japan
ThAC.12 Variable N-Gram Language Modeling and Extensions for Conversational Speech 2739
Man-Hung Siu, *Mari Ostendorf
BBN Inc, USA
*Boston Univ., USA
ThAC.13 Fuzzy Class Rescoring: A Part-of-Speech Language Model 2743
Petra Geutner
Univ. of Karlsruhe, Germany
ThAC.14 Speech Understanding Based on Integrating Concepts By Conceptual Dependency 2747
Akito Nagai, Yasushi Ishikawa
Mitsubishi Electric Corporation, Japan
ThAC.15 Dynamic Language Models for Interactive Speech Applications 2751
Fabio Brugnara, Marcello Federico
Istituto per la Ricerca Scientifica e Tecnologica (IRST), Italy
ThAC.16 Large-Scale Lexical Semantics for Speech Recognition Support 2755
George Demetriou, Eric Atwell, Clive Souter
Univ. of Leeds, UK
ThAC.17 Integration of Grammar and Statistical Language Constraints for Partial Word-Sequence Recognition 2759
Hajime Tsukada, Hirofumi Yamamoto, Yoshinori Sagisaka
ATR Interpreting Telecommunications Res. Labs., Japan
ThAC.18 Using Intonation to Constrain Language Models in Speech Recognition 2763
Paul Taylor, Simon King, Stephen Isard, Helen Wright, Jaqueline Kowtko
Univ. of Edinburgh, UK
ThAC.19 Incorporating POS Tagging into Language Modeling 2767
Peter A. Heeman, *James F. Allen
France Telecom, France
*Univ. of Rochester, USA
ThAC.20 Confidence Metrics Based on N-Gram Language Model Backoff Behaviors 2771
Carl Uhrik, *Wayne Ward
Berdy Medical Systems, USA
*Carnegie Mellon Univ., USA
ThAC.21 Structure and Performance of a Dependency Language Model 2775
Ciprian Chelba, *David Engle, Frederick Jelinek, †Victor M. Jimenez, Sanjeev Khudanpur, Lidia Mangu, ¤Harry Printz, **Eric Ristad, ††Ronald Rosenfeld, ‡‡Andreas Stolcke, ¤¤Dekai Wu
Johns Hopkins Univ., USA
*Dept. of Defense, Fort Meade, MD, USA
†Universidad Politecnica de Valencia, Spain
¤IBM, USA
**Princeton Univ., USA
††Carnegie Mellon Univ., Pittsburgh, PA, USA
‡‡SRI International, USA
¤¤Hong Kong Tech University, Hong Kong
ThAC.22 Modeling Linguistic Segment and Turn Boundaries for N-Best Rescoring of Spontaneous Speech 2779
Andreas Stolcke
SRI International, USA
ThAC.23 Hybrid Language Models: Is Simpler Better? 2783
Peter E. Kenne, Mary O'Kane
Univ. of Adelaide, Australia
ThAC.24 Internal and External Tagsets in Part-of-Speech Tagging 2787
Thorsten Brants
Univ. of the Saarland, Germany
SESSION: ThAD
Auditory Modelling and Psychoacoustics, Neural Networks for Speech Processing and Recognition
Chair: Phil D. Green, Univ. of Sheffield, UK
ThAD.1 A Probabilistic Model of Double-Vowel Segregation 2791
Laurent Varin, Frederic Berthommier
ICP, INPG, France
ThAD.2 Stimulus Signal Estimation From Auditory-Neural Transduction Inverse Processing 2795
Houshang Habibzadeh Vaneghi, Shigeyoshi Kitazawa
Shizuoka Univ., Japan
ThAD.3 FDVQ Based Keyword Spotter Which Incorporates A Semi-Supervised Learning for Primary Processing 2799
Chakib Tadj, Pierre Dumouchel, *Franck Poirier
Ecole de Technologie Superieure, Canada
*Institut Universitaire Professionnalise, France
ThAD.4 The Initial Time Span of Auditory Processing Used for Speaker Attribution of the Speech Signal 2803
Valentina V. Lublinskaja, *Christian Sappok
Pavlov Institute of Physiology, Russia
*Ruhr Universität, Germany
ThAD.5 Sparse Connection and Pruning in Large Dynamic Artificial Neural Networks 2807
Nikko Strom
KTH, Sweden
ThAD.6 A Modular Initialization Scheme for Better Speech Recognition Performance Using Hybrid Systems of MLPs/HMMs 2811
Roxana Teodorescu, Dirk Van Compernolle, Ioannis Dologlou
K.U. Leuven-ESAT, Belgium
ThAD.7 Lateralization for Auditory Perception of Foreign Words 2815
Tatiana Chernigovskaya
Russian Academy of Sciences, Russia
ThAD.8 The Structural Weighted Sets Method for Continuous Speech and Text Recognition 2819
Yuri Kosarev, Pavel Jarov, Alexander Osipov
Russian Academy of Sciences, Russia
ThAD.9 Lateral Inhibitory Networks for Auditory Processing 2823
Christian J. Sumner, Duncan F. Gillies
Imperial College, UK
ThAD.10 Missing Fundamentals: A Problem of Auditory or Mental Processing? 2827
Henning Reetz
Univ. of Konstanz, Germany
ThAD.11 Predictive Neural Networks Applied to Phoneme Recognition 2831
Felix Freitag, Enric Monte, Josep M. Salavedra
Polytechnic University of Catalunya, Spain
ThAD.12 Empirical Comparison of Two Multilayer Perceptron-Based Keyword Speech Recognition Algorithms 2835
Suhardi, *Klaus Fellbaum
Technical Univ. of Berlin, Germany
*Brandenburg Technical Univ. of Cottbus, Germany
ThAD.13 Segment Boundary Estimation Using Recurrent Neural Networks 2839
Toshiaki Fukada, *Sophie Aveline, Mike Schuster, Yoshinori Sagisaka
ATR Interpreting Telecommunications Res. Labs., Japan
*ENST, France
ThAD.14 Incorporation of HMM Output Constraints in Hybrid NN/HMM Systems During Training 2843
Mike Schuster
ATR ITL, Japan
ThAD.15 Principles of the Hearing Periphery Functioning in New Methods of Pitch Detection and Speech Enhancement 2847
Ludmila Babkina, *Sergey Koval, Alexander Molchanov
Research Institute of Ear, Nose, Throat and Speech Disorders, Russia
*Speech Technology Centre, Russia
ThAD.16 The Locus of the Syllable Effect: Prelexical or Lexical? 2851
Christine Meunier, *Alain Content, Uli H. Frauenfelder, †Ruth Kearns
Univ. of Geneva, Switzerland
*Univ. Libre de Bruxelles, Belgium
†Medical Research Council, UK
ThAD.17 On Not Remembering Disfluencies 2855
Ellen Gurman Bard, Robin J. Lickley
Univ. of Edinburgh, UK
ThAD.18 Using an Auditory Model and Leaky Autocorrelators to Tune In to Speech 2859
Tjeerd Andringa
Univ. of Groningen, The Netherlands