Prosody and Emotion 6

1: ICSLP'98 Proceedings

A Contrastive Study of Lexical Stress Placement in Singapore English and British English

Authors:

Ee Ling Low, Nanyang Technological University (Singapore)
Esther Grabe, University of Cambridge (U.K.)

Page (NA) Paper number 98

Abstract:

Singapore English and British English have been claimed to differ in lexical stress placement. Examples cited in the literature involve polysyllabic words such as 'hopelessly' and compounds such as 'blackboard'. Such words are stressed word-initially in BE, but are said to be stressed word-finally in SE. Two observations lead us to explore the claim that SE and BE differ in lexical stress placement. Firstly, observations about stress differences between SE and BE are based solely on auditory impressions by British English listeners. Acoustic evidence is not available. Secondly, the auditory evidence comes from realisations of test words in citation form, i.e. in nuclear, phrase-final position. If Singapore English has more phrase-final lengthening than British English, then this may account for the suggested differences in lexical stress placement. In the present paper, we investigate the acoustic evidence for the suggested cross-varietal difference.

SL980098.PDF (From Author) SL980098.PDF (Rasterized)



Integrated Recognition of Words and Phrase Boundaries

Authors:

Florian Gallwitz, University of Erlangen-Nuremberg (Germany)
Anton Batliner, University of Erlangen-Nuremberg (Germany)
Jan Buckow, University of Erlangen-Nuremberg (Germany)
Richard Huber, University of Erlangen-Nuremberg (Germany)
Heinrich Niemann, University of Erlangen-Nuremberg (Germany)
Elmar Nöth, University of Erlangen-Nuremberg (Germany)

Page (NA) Paper number 328

Abstract:

In this paper we present an integrated approach for recognizing both the word sequence and the syntactic-prosodic structure of a spontaneous utterance. We take into account the fact that a spontaneous utterance is not merely an unstructured sequence of words by incorporating phrase boundary information into the language model and by providing HMMs to model boundaries. This allows for a distinction between word transitions across phrase boundaries and transitions within a phrase. During recognition, the syntactic-prosodic structure of the utterance is determined implicitly. Without any increase in computational effort, this leads to a 4% reduction in word error rate, and, at the same time, syntactic-prosodic boundary labels are provided for subsequent processing. The boundaries are recognized with a precision and a recall of about 75% each. They can be used to drastically reduce the computational effort for parsing spontaneous utterances. We also present a system architecture to incorporate additional prosodic information.
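As an illustration only, the idea of treating phrase boundaries as items in the word sequence, and of scoring recognized boundaries by precision and recall, can be sketched as follows. This is not the authors' system: the token name `<B>`, the function names, and the toy sentences are assumptions made for the example, and the scoring assumes that the reference and the hypothesis contain the same words.

```python
from collections import Counter

BOUNDARY = "<B>"  # hypothetical boundary token, not taken from the paper


def bigram_counts(sentences):
    """Count adjacent token pairs; boundary tokens are counted like words."""
    counts = Counter()
    for tokens in sentences:
        counts.update(zip(tokens, tokens[1:]))
    return counts


def boundary_positions(tokens):
    """Return the indices of the words that are directly followed by a boundary."""
    positions = set()
    word_index = -1
    for token in tokens:
        if token == BOUNDARY:
            positions.add(word_index)
        else:
            word_index += 1
    return positions


def precision_recall(reference, hypothesis):
    """Precision and recall of hypothesized boundaries (same words assumed)."""
    ref = boundary_positions(reference)
    hyp = boundary_positions(hypothesis)
    hits = len(ref & hyp)
    precision = hits / len(hyp) if hyp else 0.0
    recall = hits / len(ref) if ref else 0.0
    return precision, recall


reference = ["yes", BOUNDARY, "on", "monday", BOUNDARY, "at", "ten"]
hypothesis = ["yes", BOUNDARY, "on", BOUNDARY, "monday", "at", "ten"]

counts = bigram_counts([reference])
precision, recall = precision_recall(reference, hypothesis)
```

With boundaries in the training sequence, the pair ("monday", "at") is never counted directly in the reference; the transition is split into ("monday", "<B>") and ("<B>", "at"), which is one simple way to keep transitions across a boundary apart from transitions within a phrase.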

SL980328.PDF (From Author) SL980328.PDF (Rasterized)



Phrase Accents Revisited: Comparative Evidence From Standard and Cypriot Greek

Authors:

Amalia Arvaniti, University of Cyprus (Cyprus)

Page (NA) Paper number 550

Abstract:

Phrase accents, one of the three tonal categories assumed by much recent research on intonation, are expected to associate with a prosodic boundary (e.g. the end of the utterance) but not to phonetically align with a specific tone-bearing unit (TBU), such as a stressed syllable. This paper presents experimental evidence on the intonation of Cypriot Greek polar questions suggesting that phrase accents prefer to associate with specific TBUs. Concretely, it is shown that in Cypriot Greek polar question intonation, autosegmentally described as L* H L%, the H phrase accent does not align with the final stressed vowel as in Standard Greek, but instead it aligns approximately 30 ms from the onset of either the penultimate or the final vowel of the utterance. The data provide evidence that phrase accents, like other tonal categories, exhibit stable phonetic alignment and support Ladd, Arvaniti and Mennen's (1997) typology of stress-seeking and non-stress-seeking phrase accents.

SL980550.PDF (From Author) SL980550.PDF (Rasterized)



Phonetic Invariance and Phonological Stability: Lithuanian Pitch Accents

Authors:

Grzegorz Dogil, University of Stuttgart (Germany)
Gregor Möhler, University of Stuttgart (Germany)

Page (NA) Paper number 206

Abstract:

We argue that phonetically invariant realizations of phonological categories imply the synchronic and diachronic imperviousness of such categories to phonological rules and sound laws. We claim that phonetic invariance is the foundation of phonological stability. The category we discuss in this contribution is the pitch-accent. We provide a parametric phonetic description of this phonological category. By means of a parametrization technique we apply this description to the contrastive pitch-accents of Lithuanian. The statistical differences between acute and circumflex pitch-accents derived by the parametrization provide a basis for the discussion of the synchronic and diachronic behavior of the phonetically nonbalanced phonological contrasts.

SL980206.PDF (From Author) SL980206.PDF (Rasterized)



A HMM-Based Recognition System for Perceptive Relevant Pitch Movements of Spontaneous German Speech

Authors:

Christel Brindöpke, University Bielefeld (Germany)
Gernot A. Fink, University Bielefeld (Germany)
Franz Kummert, University Bielefeld (Germany)
Gerhard Sagerer, University Bielefeld (Germany)

Page (NA) Paper number 503

Abstract:

This paper presents an HMM-based recognition system for perceptually relevant pitch movements in spontaneous German speech. The pitch movements are defined according to the perceptually and phonetically motivated IPO approach to intonation. For recognition we use a hybrid approach combining polynomial classification with Hidden Markov Modelling. The recognition is based only on the speech signal, its fundamental frequency and eleven derived features. We evaluate the system on a speaker-independent recognition task.

SL980503.PDF (From Author) SL980503.PDF (Rasterized)



Towards a Reversible Symbolic Coding of Intonation

Authors:

Jean Véronis, Université de Provence (France)
Estelle Campione, Université de Provence (France)

Page (NA) Paper number 846

Abstract:

This paper presents a two-step model for the symbolic coding and generation of intonation. First, the F0 curve is reduced to a series of pitch target points that capture the macroprosodic information of the utterance. Target points are then converted into a sequence of labels. Generation is achieved through the reverse steps. The model is language independent and requires no prior training on the data. We discuss the influence of the number of categories on the precision of fit, and show, by an evaluation on a large multilingual corpus (4 hours 20 minutes of speech, 50 speakers, 5 languages), that a model composed of three ascending and three descending categories, plus a category for small or null movements, enables a regeneration of ca. 99% of points at less than 2 semitones (ST) from the original. Given that the model can still be improved in various ways, it seems a good candidate for practical applications.
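The coding and regeneration steps can be illustrated with a minimal sketch. It is not the authors' model: the category labels, the interval thresholds, the representative interval chosen for each category, and the function names are all assumptions made for the example. Only the overall shape (intervals between successive target points mapped to three ascending, three descending and one small-movement category, then reversed) follows the abstract.

```python
import bisect
import math

# Hypothetical category table (assumed values, not from the paper).
EDGES_ST = [-6.0, -3.0, -1.0, 1.0, 3.0, 6.0]           # category borders in semitones
LABELS = ["D3", "D2", "D1", "S", "U1", "U2", "U3"]     # 3 down, small/null, 3 up
REPRESENTATIVE_ST = [-8.0, -4.5, -2.0, 0.0, 2.0, 4.5, 8.0]


def semitones(f_from, f_to):
    """Interval from f_from to f_to in semitones (both in Hz)."""
    return 12.0 * math.log2(f_to / f_from)


def encode(targets_hz):
    """Map each interval between successive target points to a label."""
    labels = []
    for prev, cur in zip(targets_hz, targets_hz[1:]):
        index = bisect.bisect_right(EDGES_ST, semitones(prev, cur))
        labels.append(LABELS[index])
    return labels


def decode(start_hz, labels):
    """Regenerate target points by applying each label's representative interval."""
    points = [start_hz]
    for label in labels:
        step = REPRESENTATIVE_ST[LABELS.index(label)]
        points.append(points[-1] * 2.0 ** (step / 12.0))
    return points


def fraction_within(original_hz, regenerated_hz, tolerance_st=2.0):
    """Share of regenerated points closer than tolerance_st to the original."""
    close = sum(
        1 for o, r in zip(original_hz, regenerated_hz)
        if abs(semitones(o, r)) < tolerance_st
    )
    return close / len(original_hz)


up_two = 200.0 * 2.0 ** (2.0 / 12.0)        # 2 ST above 200 Hz
targets = [200.0, up_two, 200.0, 100.0]     # toy target points in Hz
labels = encode(targets)
regenerated = decode(targets[0], labels)
fit = fraction_within(targets, regenerated)
```

In this toy case the last movement is a full octave down, which the assumed "D3" category regenerates as only 8 ST, so that point misses the 2 ST tolerance. A finer or differently placed set of categories would trade label inventory size against precision of fit, which is the trade-off the abstract discusses.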

SL980846.PDF (From Author) SL980846.PDF (Rasterized)
