Authors:
Frantz Clermont, University of Tsukuba (Japan)
Parham Mokhtari, LORIA-Campus Scientifique (France)
Page (NA) Paper number 87
Abstract:
We present some evidence indicating that phonetic distinctiveness and
speaker individuality are indeed manifested in vowels' vocal-tract
(VT) shapes estimated from the lower and upper formants, respectively.
The methodology developed to demonstrate this dichotomy draws on
Schroeder's (1967) acoustic-articulatory model, which can be coerced
to yield area-function approximations to VT-shapes of differing formant
components. Using ten steady-state vowels recorded in /hVd/-context,
five times at random, by four adult-male speakers of Australian English,
VT-shape variability was then measured on an intra- and an inter-speaker
basis. Gross shapes estimated from the lower formants caused the
largest spread amongst the vowels of individual speakers. By contrast,
more detailed shapes estimated from certain higher formants of front
and back vowels caused the largest spread amongst speakers. These
results contribute a quasi-articulatory substantiation of a long-standing
view on the speaker-specific potency of the upper vowel-formant region,
together with some useful implications for speech and speaker recognition.
Authors:
Philip Hoole, Institut fuer Phonetik, Munich University (Germany)
Christian Kroos, Institut fuer Phonetik, Munich University (Germany)
Page (NA) Paper number 1097
Abstract:
Digital video filming of the thyroid prominence was used to measure
larynx height in German vowels, with focus on contrasts involving front
unrounded, front rounded and back rounded vowels. The study aimed to
provide a foundation for interpreting the acoustic consequences of
articulatory maneuvers not only at the larynx but also elsewhere in
the vocal tract. Results showed the expected pattern of lower larynx
position for the rounded vowels. However, no clear preference emerged
for the same, more, or less larynx lowering on front rounded versus
back rounded vowels. Coarticulatory effects of the flanking consonants
were weak. The most striking result was that the magnitude of the differences
between vowels varied substantially over speakers. This reinforces
the contention that interpretation of vertical laryngeal gestures must
be embedded in speaker-specific analysis of downstream articulatory
maneuvers. Work in this direction is currently in progress.
Authors:
Paavo Alku, University of Turku (Finland)
Juha Vintturi, Helsinki Univ. Central Hospital (Finland)
Erkki Vilkman, University of Oulu (Finland)
Page (NA) Paper number 67
Abstract:
For voiced speech the main excitation of the vocal tract occurs at
the end of the glottal closing phase when the rate of change of the
flow reaches its absolute maximum. This study presents a straightforward
method that yields a numerical value to characterize the effect of
the main excitation on vocal intensity. The method, Energy Ratio by
Modified Excitation (ERME), takes advantage of the glottal flow and
the model of the vocal tract transfer function given by inverse filtering,
and it synthesizes two signals based on the source-filter theory. The
first synthesized sound is produced using the glottal flow waveform
given by inverse filtering per se. The second signal is synthesized
by removing the main excitation from the differentiated glottal flow.
ERME is defined as the ratio between the energy of the first synthesized
signal and the energy of the second one. It is shown that when the
loudness of speech increases, the value of ERME first rises but in
the case of loud voices it starts to decrease. This behavior of ERME
shows that effects of secondary excitations of the vocal tract that
occur during glottal opening become important in the production of
loud voices.
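The energy ratio defined above can be sketched as follows. This is only a
minimal illustration, assuming the two synthesized signals are already
available as sample arrays; the function names signal_energy and erme are
hypothetical, the toy arrays stand in for real synthesized speech, and the
inverse filtering and synthesis steps are not shown.

```python
import numpy as np

def signal_energy(signal):
    """Sum of squared samples (hypothetical helper, not from the paper)."""
    samples = np.asarray(signal, dtype=float)
    return float(np.sum(samples ** 2))

def erme(with_main_excitation, without_main_excitation):
    """Energy of the first synthesized signal divided by that of the second."""
    return signal_energy(with_main_excitation) / signal_energy(without_main_excitation)

# Toy stand-ins for the two synthesized signals (assumed values).
first_signal = np.array([0.0, 2.0, -2.0, 1.0])    # energy 9
second_signal = np.array([0.0, 1.0, -1.0, 1.0])   # energy 3
ratio = erme(first_signal, second_signal)
print(ratio)
```

A ratio well above 1 would indicate that the main excitation contributes most
of the energy; the abstract does not state whether the value is reported as a
plain ratio or in decibels, so the plain ratio is used here.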
Authors:
Gordon Ramsay, ICP-INPG (France)
Page (NA) Paper number 670
Abstract:
Speech is typically modelled using time-domain or frequency-domain
simulations of the acoustic field in the vocal tract. Using a biorthogonal
modal decomposition, it is shown that time-domain finite-difference
simulations can be transformed algebraically into equivalent formant
synthesizers, the parameters of which vary in time and are calculated
directly from the laws of physics. Examining the structure of the equivalent
formant synthesizer, it is observed that formant excitation is largely
due to internal modal coupling effects, induced by rapid perturbation
of the acoustic eigenmodes caused by vibration of the glottis, and
does not rely precisely on external sources provided by boundary conditions.
This leads to a novel interpretation and justification of traditional
models of the glottal source.
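As a generic illustration of how a linear time-stepping scheme relates to
formant parameters, the sketch below reads resonance frequencies and
bandwidths from the eigenvalues of a fixed update matrix. It is not the
biorthogonal decomposition of the paper: the matrix here is time-invariant
and built by hand, and the names resonator_block and modes_to_formants, the
sampling rate, and the resonance values are all assumptions.

```python
import numpy as np

def resonator_block(freq_hz, bandwidth_hz, fs):
    """2x2 update matrix of one damped resonance (assumed toy model)."""
    radius = np.exp(-np.pi * bandwidth_hz / fs)
    theta = 2.0 * np.pi * freq_hz / fs
    return radius * np.array([[np.cos(theta), -np.sin(theta)],
                              [np.sin(theta),  np.cos(theta)]])

def modes_to_formants(update_matrix, fs):
    """Map eigenvalues of x[n+1] = A x[n] to frequencies and bandwidths in Hz."""
    eigenvalues = np.linalg.eigvals(update_matrix)
    eigenvalues = eigenvalues[np.imag(eigenvalues) > 0]  # one per conjugate pair
    freqs = np.angle(eigenvalues) * fs / (2.0 * np.pi)
    bandwidths = -np.log(np.abs(eigenvalues)) * fs / np.pi
    order = np.argsort(freqs)
    return freqs[order], bandwidths[order]

fs = 10000.0  # assumed sampling rate in Hz
A = np.zeros((4, 4))
A[:2, :2] = resonator_block(500.0, 50.0, fs)
A[2:, 2:] = resonator_block(1500.0, 80.0, fs)

freqs, bandwidths = modes_to_formants(A, fs)
print(freqs, bandwidths)
```

In the setting described by the abstract, the update matrix would change from
step to step as the glottis vibrates, so the eigenmodes themselves would vary
in time rather than stay fixed as they do here.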
Authors:
John H. Esling, University of Victoria (Canada)
Page (NA) Paper number 617
Abstract:
Using fibreoptic laryngoscopy to observe pharyngeal articulations,
the aryepiglottic sphincter mechanism is shown to be responsible for
the production of speech sounds in the phonetic category "pharyngeal."
Major differences in auditory/acoustic quality are also produced when
the larynx as a whole is raised or lowered during the production of
pharyngeals. The voiceless pharyngeal fricative and voiced pharyngeal
approximant are the result of increased sphincteric constriction of
the laryngeal "tube" in a continuum that begins with normal glottal
stop and ventricular fold closure. A pharyngeal stop is produced when
the aryepiglottic sphincter mechanism achieves complete closure, and
trilling accompanying friction is evident at the pharyngeal place of
articulation in both voiceless and voiced modes. It is suggested that
all five sounds share a common, pharyngeal place of articulation, but
differ in manner of articulation. Raised larynx is the default setting
for these articulations, but they may be produced with lowered larynx.
Authors:
Hiroki Matsuzaki, Hokkai-Gakuen University (Japan)
Kunitoshi Motoki, Hokkai-Gakuen University (Japan)
Nobuhiro Miki, Hokkaido University (Japan)
Page (NA) Paper number 656
Abstract:
The acoustic characteristics of acoustic tubes with protrusions at
the radiation end are computed by FEM simulation. In the first experiment,
two different shapes of the protrusions, a symmetrical and an asymmetrical
shape with respect to the vertical, are investigated. Frequency characteristics
of the radiation impedance are computed from simulation results. The
FEM simulation results are in good agreement with our measurement
results. The proposed 3-D radiation
model is useful for analysis of the acoustic characteristics of human
speech. In the second experiment, the protrusion is attached to our 3-D
vocal tract model. The vocal tract shape corresponds to the Japanese
vowel /a/. The cross sections of the tubes are elliptic in shape.
The simulation results show that the vocal tract transfer function
obtained by FEM differs from our previous FEM results and from the 1-D
analytical solution.
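For orientation on what a 1-D analytical solution provides in the simplest
case, the sketch below computes the quarter-wavelength resonances of an ideal
uniform tube, closed at one end and open at the other. This is a textbook
special case and not the /a/-shaped tract with elliptic cross sections studied
in the paper; the tube length, the sound speed, and the function name
uniform_tube_resonances are assumptions.

```python
def uniform_tube_resonances(length_m, n_modes=3, sound_speed=350.0):
    """Resonances F_n = (2n - 1) * c / (4 * L) of an ideal closed-open tube."""
    return [(2 * n - 1) * sound_speed / (4.0 * length_m)
            for n in range(1, n_modes + 1)]

# Assumed values: a 17.5 cm tube and c = 350 m/s.
resonances = uniform_tube_resonances(0.175)
print(resonances)
```

A 3-D model with a protrusion at the radiating end changes the effective open
boundary, which is one reason its transfer function can depart from such 1-D
predictions.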