Spoken Language Generation and Translation 2

ICSLP'98 Proceedings

A Generic Algorithm for Generating Spoken Monologues

Authors:

Esther Klabbers, IPO, Center for Research on User-System Interaction (The Netherlands)
Emiel Krahmer, IPO, Center for Research on User-System Interaction (The Netherlands)
Mariët Theune, IPO, Center for Research on User-System Interaction (The Netherlands)

Page (NA) Paper number 278

Abstract:

The defining property of a Concept-to-Speech system is that it combines language and speech generation. Language generation converts the input concepts into natural language, which speech generation subsequently transforms into speech. Potentially, this leads to more natural-sounding output than can be achieved in a plain Text-to-Speech system, since the correct placement of pitch accents and intonational boundaries (an important factor contributing to the naturalness of the generated speech) is co-determined by syntactic and discourse information, which is typically available in the language generation module. In this paper, a generic algorithm for the generation of coherent spoken monologues, called D2S, is discussed. Language generation is done by a module called LGM, which is based on TAG-like syntactic structures with open slots, combined with conditions that determine when a syntactic structure can be used properly. A speech generation module converts the output of the LGM into speech using either phrase concatenation or diphone synthesis.
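The idea of syntactic templates with open slots guarded by applicability conditions can be illustrated with a minimal sketch. This is not the authors' D2S/LGM code; the template patterns, slot names, and the "mentioned before" rule are invented for illustration:

```python
# Sketch of template-based generation: each template pairs a surface
# pattern containing named slots with a condition that inspects the
# discourse context to decide whether the template may be used.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Template:
    pattern: str                       # surface form with named slots
    condition: Callable[[dict], bool]  # when may this template be used?

def generate(templates, concepts, context):
    """Pick the first applicable template and fill its slots."""
    for t in templates:
        if t.condition(context):
            return t.pattern.format(**concepts)
    raise ValueError("no applicable template")

templates = [
    # Hypothetical rule: use a pronoun only if the entity was mentioned before.
    Template("It was composed in {year}.",
             lambda ctx: ctx.get("mentioned", False)),
    Template("The {piece} was composed in {year}.",
             lambda ctx: True),
]

print(generate(templates, {"piece": "sonata", "year": "1820"},
               {"mentioned": False}))
# -> "The sonata was composed in 1820."
```

Ordering the templates from most to least specific condition gives a simple preference mechanism, one plausible reading of "conditions which determine when the syntactic structure can be used properly".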

SL980278.PDF (From Author) SL980278.PDF (Rasterized)

On the Use of Automatically Generated Discourse-Level Information in a Concept-to-Speech Synthesis System

Authors:

Janet Hitzeman, Centre for Speech Technology Research, University of Edinburgh (U.K.)
Alan W. Black, Centre for Speech Technology Research, University of Edinburgh (U.K.)
Paul Taylor, Centre for Speech Technology Research, University of Edinburgh (U.K.)
Chris Mellish, Department of Artificial Intelligence, University of Edinburgh (U.K.)
Jon Oberlander, Human Communication Research Centre, University of Edinburgh (U.K.)

Page (NA) Paper number 591

Abstract:

This paper describes the latest version of the SOLE concept-to-speech system, which uses linguistic information provided by a natural language generation system to improve the prosody of synthetic speech. We discuss the types of linguistic information that prove most useful and the implications for text-to-speech systems.

SL980591.PDF (From Author) SL980591.PDF (Rasterized)

Learning Phrase-Based Head Transduction Models for Translation of Spoken Utterances

Authors:

Hiyan Alshawi, AT&T Labs (USA)
Srinivas Bangalore, AT&T Labs (USA)
Shona Douglas, AT&T Labs (USA)

Page (NA) Paper number 293

Abstract:

We describe a method for learning head-transducer models of translation automatically from examples consisting of transcribed spoken utterances and reference translations of the utterances. The method proceeds by first searching for a hierarchical alignment (specifically a synchronized dependency tree) of each training example. The alignments produced are optimal with respect to a cost function that takes into account co-occurrence statistics and the recursive decomposition of the example into aligned substrings. A probabilistic head-transducer model is then constructed from the alignments. We report results of applying the method to English-to-Spanish translation in the domain of air travel information and English-to-Japanese translation in the domain of telephone operator assistance. We also report on a variation on this model-construction method in which multi-word pairings are used in the computation of the hierarchical alignments and head transducer models.
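The co-occurrence statistics that feed the alignment cost can be sketched in a toy form. This is a drastic simplification of the paper's synchronized-dependency-tree alignment (no hierarchy, no recursive decomposition); the parallel data and the greedy pairing step are invented for illustration:

```python
# Count how often each source/target word pair co-occurs in parallel
# utterances, then greedily pair each source word with the target word
# it co-occurs with most often.
from collections import Counter
from itertools import product

def cooccurrence_counts(parallel):
    counts = Counter()
    for src, tgt in parallel:
        for s, t in product(src.split(), tgt.split()):
            counts[(s, t)] += 1
    return counts

parallel = [
    ("show flights", "muestre vuelos"),
    ("show fares", "muestre tarifas"),
    ("list flights", "liste vuelos"),
]
counts = cooccurrence_counts(parallel)

def best_pairing(src_sentence, tgt_sentence, counts):
    pairs = {}
    for s in src_sentence.split():
        t = max(tgt_sentence.split(), key=lambda t: counts[(s, t)])
        pairs[s] = t
    return pairs

print(best_pairing("show flights", "muestre vuelos", counts))
# -> {'show': 'muestre', 'flights': 'vuelos'}
```

In the paper the analogous statistics enter a cost function over hierarchical alignments, from which the head-transducer parameters are then estimated.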

SL980293.PDF (From Author) SL980293.PDF (Rasterized)

Probabilistic Dialogue Act Extraction for Concept Based Multilingual Translation Systems

Authors:

Toshiaki Fukada, ATR-ITL (Japan)
Detlef Koll, CMU-ISL (USA)
Alex Waibel, CMU-ISL (USA)
Kouichi Tanigaki, ATR-ITL (Japan)

Page (NA) Paper number 657

Abstract:

This paper describes a probabilistic method for dialogue act (DA) extraction for concept-based multilingual translation systems. A DA is a unit of a semantic interlingua consisting of speaker information, speech act, concept and argument. Probabilistic models for the extraction of speech acts or concepts are trained as speech-act- or concept-dependent word n-gram models. The proposed method is evaluated on DA-annotated English and Japanese databases. The experimental results show that the proposed method outperforms the conventional grammar-based approach. In addition, the proposed method is much more robust to erroneous inputs obtained as speech recognition outputs.
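Act-dependent n-gram classification can be sketched as follows. This is a hedged simplification of the paper's method (unigrams instead of n-grams, invented training data, add-one smoothing with an assumed vocabulary size): train one word model per speech act, then label an utterance with the act whose model assigns it the highest probability.

```python
# Train one unigram model per speech act; classify an utterance by the
# act whose smoothed model gives it the highest log probability.
from collections import Counter
import math

def train(corpus):
    """corpus: list of (speech_act, utterance) pairs -> per-act word counts."""
    models = {}
    for act, utt in corpus:
        models.setdefault(act, Counter()).update(utt.split())
    return models

def score(models, act, utt, vocab_size=1000):
    """Add-one-smoothed log probability of utt under the act's model."""
    counts = models[act]
    total = sum(counts.values())
    return sum(math.log((counts[w] + 1) / (total + vocab_size))
               for w in utt.split())

def classify(models, utt):
    return max(models, key=lambda act: score(models, act, utt))

corpus = [
    ("request", "please book a room"),
    ("request", "please reserve a table"),
    ("greeting", "hello good morning"),
    ("greeting", "good afternoon hello"),
]
models = train(corpus)
print(classify(models, "please book a table"))  # -> "request"
```

Because the decision rests on word statistics rather than a full parse, a few misrecognized words merely lower a score instead of breaking the analysis, which is one way to read the robustness claim.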

SL980657.PDF (From Author) SL980657.PDF (Rasterized)

Fast Decoding For Statistical Machine Translation

Authors:

Ye-Yi Wang, Carnegie Mellon University (USA)
Alex Waibel, Carnegie Mellon University (USA)

Page (NA) Paper number 826

Abstract:

We investigated an efficient decoding algorithm for statistical machine translation. Compared to existing algorithms, the new algorithm is applicable to different translation models and is much faster. Experiments showed that it achieved overall performance comparable to state-of-the-art decoding algorithms.

SL980826.PDF (From Author) SL980826.PDF (Rasterized)

A Japanese-to-English Speech Translation System: ATR-MATRIX

Authors:

Toshiyuki Takezawa, ATR Interpreting Telecommunications Research Laboratories (Japan)
Tsuyoshi Morimoto, Fukuoka University (Japan)
Yoshinori Sagisaka, ATR Interpreting Telecommunications Research Laboratories (Japan)
Nick Campbell, ATR Interpreting Telecommunications Research Laboratories (Japan)
Hitoshi Iida, ATR Interpreting Telecommunications Research Laboratories (Japan)
Fumiaki Sugaya, ATR Interpreting Telecommunications Research Laboratories (Japan)
Akio Yokoo, ATR Interpreting Telecommunications Research Laboratories (Japan)
Seiichi Yamamoto, ATR Interpreting Telecommunications Research Laboratories (Japan)

Page (NA) Paper number 957

Abstract:

We have built a new speech translation system called ATR-MATRIX (ATR's Multilingual Automatic Translation System for Information Exchange). The system can recognize natural Japanese utterances such as those used in daily life, translate them into English and output synthesized speech. It runs on a workstation or a high-end PC and achieves near-real-time processing. The current implementation deals with a hotel room reservation task/domain. We plan to develop a bidirectional speech translation system, i.e., Japanese-to-English and English-to-Japanese. We also plan to develop multi-language output functions for ATR-MATRIX (Japanese to English, German and Korean) for the international joint experiment of C-STAR II (Consortium for Speech Translation Advanced Research).

SL980957.PDF (From Author) SL980957.PDF (Rasterized)
