Authors:
Ee Ling Low, Nanyang Technological University (Singapore)
Esther Grabe, University of Cambridge (U.K.)
Page (NA) Paper number 98
Abstract:
Singapore English and British English have been claimed to differ in
lexical stress placement. Examples cited in the literature involve
polysyllabic words such as 'hopelessly' and compounds such as 'blackboard'.
Such words are stressed word-initially in BE, but are said to be stressed
word-finally in SE. Two observations lead us to explore the claim that
SE and BE differ in lexical stress placement. Firstly, observations
about stress differences between SE and BE are based solely on auditory
impressions by British English listeners. Acoustic evidence is not
available. Secondly, the auditory evidence comes from realisations
of test words in citation form, i.e. in nuclear, phrase-final position.
If Singapore English has more phrase-final lengthening than British
English, then this may account for the suggested differences in lexical
stress placement. In the present paper, we investigate the acoustic
evidence for the suggested cross-varietal difference.
Authors:
Florian Gallwitz, University of Erlangen-Nuremberg (Germany)
Anton Batliner, University of Erlangen-Nuremberg (Germany)
Jan Buckow, University of Erlangen-Nuremberg (Germany)
Richard Huber, University of Erlangen-Nuremberg (Germany)
Heinrich Niemann, University of Erlangen-Nuremberg (Germany)
Elmar Nöth, University of Erlangen-Nuremberg (Germany)
Page (NA) Paper number 328
Abstract:
In this paper we present an integrated approach for recognizing both
the word sequence and the syntactic-prosodic structure of a spontaneous
utterance. We take into account the fact that a spontaneous utterance
is not merely an unstructured sequence of words by incorporating phrase
boundary information into the language model and by providing HMMs
to model boundaries. This allows for a distinction between word transitions
across phrase boundaries and transitions within a phrase. During recognition,
the syntactic-prosodic structure of the utterance is determined implicitly.
Without any increase in computational effort, this leads to a 4% reduction
in word error rate and, at the same time, syntactic-prosodic boundary
labels are provided for subsequent processing. The boundaries are
recognized with a precision and recall rate of about 75% each. They
can be used to drastically reduce the computational effort for parsing
spontaneous utterances. We also present a system architecture to incorporate
additional prosodic information.
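As a rough illustration of how phrase boundary information can enter a language model, the sketch below treats a boundary as a pseudo-word inside a plain bigram model, so that word transitions across a boundary are counted separately from transitions within a phrase. The token name `<B>`, the toy word sequence, and the unsmoothed maximum-likelihood estimate are assumptions made for this example; the abstract does not specify how the authors' language model is built.

```python
from collections import Counter

BOUNDARY = "<B>"  # hypothetical pseudo-word marking a syntactic-prosodic boundary


def train_bigrams(tokens):
    """Count bigrams and their left contexts in a token sequence."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    contexts = Counter(tokens[:-1])
    return bigrams, contexts


def bigram_prob(bigrams, contexts, prev, word):
    """Maximum-likelihood estimate of P(word | prev), without smoothing."""
    if contexts[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / contexts[prev]


# Toy training sequence (invented for illustration only).
tokens = ["okay", BOUNDARY, "on", "monday", BOUNDARY, "okay", "on", "friday"]
bigrams, contexts = train_bigrams(tokens)

# "okay" followed by "on" within a phrase versus across a boundary.
p_within = bigram_prob(bigrams, contexts, "okay", "on")
p_to_boundary = bigram_prob(bigrams, contexts, "okay", BOUNDARY)
p_after_boundary = bigram_prob(bigrams, contexts, BOUNDARY, "on")
```

Because the boundary is an ordinary token in this sketch, a decoder that hypothesises it emits boundary labels as a by-product of word recognition, which is the general idea of determining the structure implicitly.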
Authors:
Amalia Arvaniti, University of Cyprus (Cyprus)
Page (NA) Paper number 550
Abstract:
Phrase accents, one of the three tonal categories assumed by much recent
research on intonation, are expected to associate with a prosodic boundary
(e.g. the end of the utterance) but not to phonetically align with
a specific tone-bearing unit (TBU), such as a stressed syllable. This
paper presents experimental evidence on the intonation of Cypriot Greek
polar questions suggesting that phrase accents prefer to associate
with specific TBUs. Concretely, it is shown that in Cypriot Greek polar
question intonation, autosegmentally described as L* H L%, the H phrase
accent does not align with the final stressed vowel as in Standard
Greek, but instead it aligns approximately 30ms from the onset of either
the penultimate or the final vowel of the utterance. The data provide
evidence that phrase accents, like other tonal categories, exhibit
stable phonetic alignment and support Ladd, Arvaniti and Mennen's (1997)
typology of stress-seeking and non-stress-seeking phrase accents.
Authors:
Grzegorz Dogil, University of Stuttgart (Germany)
Gregor Möhler, University of Stuttgart (Germany)
Page (NA) Paper number 206
Abstract:
We argue that phonetically invariant realizations of phonological categories
imply the synchronic and diachronic imperviousness of such categories
to phonological rules and sound laws. We claim that phonetic invariance
is the foundation of phonological stability. The category we discuss
in this contribution is the pitch-accent. We provide a parametric phonetic
description of this phonological category. By means of a parametrization
technique we apply this description to the contrastive pitch-accents
of Lithuanian. The statistical differences between acute and circumflex
pitch-accents derived by the parametrization provide a basis for the
discussion of synchronic and diachronic behavior of the phonetically
nonbalanced phonological contrasts.
Authors:
Christel Brindöpke, University Bielefeld (Germany)
Gernot A. Fink, University Bielefeld (Germany)
Franz Kummert, University Bielefeld (Germany)
Gerhard Sagerer, University Bielefeld (Germany)
Page (NA) Paper number 503
Abstract:
This paper presents an HMM-based recognition system for perceptively
relevant pitch movements of spontaneous German speech. The pitch movements
are defined according to the perceptively and phonetically motivated
IPO-approach to intonation. For recognition we use a hybrid approach
combining polynomial classification with Hidden Markov Modelling. The
recognition is based only on the speech signal, its fundamental frequency
and eleven derived features. We evaluate the system on a speaker-independent
recognition task.
Authors:
Jean Véronis, Université de Provence (France)
Estelle Campione, Université de Provence (France)
Page (NA) Paper number 846
Abstract:
This paper presents a two-step model for the symbolic coding and generation
of intonation. First, the F0 curve is reduced to a series of pitch
target points that capture the macroprosodic information of the utterance.
Target points are then converted into a sequence of labels. Generation
is achieved through the reverse steps. The model is language-independent
and requires no prior training on the data. We discuss the influence
of the number of categories on the precision of fit, and show, by an
evaluation on a large multilingual corpus (4 hours 20 minutes of speech,
50 speakers, 5 languages), that a model composed of three ascending
and three descending categories, plus a category for small or null
movements, regenerates ca. 99% of points to within 2 semitones (ST)
of the original. Given that the model is capable of various
improvements, it seems a good candidate for practical applications.
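The coding and regeneration steps can be sketched roughly as follows, assuming that each label encodes the interval in semitones between successive target points. The thresholds, the prototype interval sizes, and the integer labels from -3 to 3 are hypothetical values chosen for this example; the abstract does not state the actual category definitions.

```python
import math

# Hypothetical category limits in semitones (not taken from the abstract):
# below 1 ST counts as a small or null movement, then three size classes.
THRESHOLDS = [1.0, 4.0, 8.0]
# Hypothetical representative interval (in ST) used when regenerating each class.
PROTOTYPES = [0.0, 2.5, 6.0, 10.0]


def hz_to_st(f_hz, ref_hz):
    """Interval from ref_hz to f_hz in semitones."""
    return 12.0 * math.log2(f_hz / ref_hz)


def encode(targets_hz):
    """Label each movement between successive targets with an integer in -3..3."""
    labels = []
    for prev, cur in zip(targets_hz, targets_hz[1:]):
        interval = hz_to_st(cur, prev)
        size = sum(abs(interval) >= t for t in THRESHOLDS)
        labels.append(size if interval >= 0 else -size)
    return labels


def decode(start_hz, labels):
    """Regenerate target points from a starting pitch and a label sequence."""
    points = [start_hz]
    for label in labels:
        step = math.copysign(PROTOTYPES[abs(label)], label)
        points.append(points[-1] * 2.0 ** (step / 12.0))
    return points


# Toy target points: a rise of 5 ST followed by a level stretch.
targets = [200.0, 200.0 * 2.0 ** (5.0 / 12.0), 200.0 * 2.0 ** (5.0 / 12.0)]
labels = encode(targets)
regenerated = decode(targets[0], labels)
errors_st = [abs(hz_to_st(r, t)) for r, t in zip(regenerated, targets)]
```

Comparing `errors_st` against a 2 ST tolerance mirrors the kind of precision-of-fit evaluation the abstract describes, although this sketch accumulates error along the sequence and makes no claim to reproduce the reported figures.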