Articulatory Modelling 2

1: ICSLP'98 Proceedings

Acoustic-Articulatory Evaluation of the Upper Vowel-Formant Region and its Presumed Speaker-Specific Potency

Authors:

Frantz Clermont, University of Tsukuba (Japan)
Parham Mokhtari, LORIA-Campus Scientifique (France)

Page (NA) Paper number 87

Abstract:

We present some evidence indicating that phonetic distinctiveness and speaker individuality are indeed manifested in vowels' vocal-tract (VT) shapes estimated from the lower and upper formants, respectively. The methodology developed to demonstrate this dichotomy draws on Schroeder's (1967) acoustic-articulatory model, which can be coerced to yield area-function approximations to VT shapes of differing formant components. Using ten steady-state vowels recorded in /hVd/ context, five times at random, by four adult male speakers of Australian English, VT-shape variability was measured on an intra- and an inter-speaker basis. Gross shapes estimated from the lower formants caused the largest spread amongst the vowels of individual speakers. By contrast, more detailed shapes estimated from certain higher formants of front and back vowels caused the largest spread amongst speakers. These results contribute a quasi-articulatory substantiation of a long-standing view on the speaker-specific potency of the upper vowel-formant region, together with some useful implications for speech and speaker recognition.

SL980087.PDF (From Author) SL980087.PDF (Rasterized)



Control of Larynx Height in Vowel Production

Authors:

Philip Hoole, Institut fuer Phonetik, Munich University (Germany)
Christian Kroos, Institut fuer Phonetik, Munich University (Germany)

Page (NA) Paper number 1097

Abstract:

Digital video filming of the thyroid prominence was used to measure larynx height in German vowels, with a focus on contrasts involving front unrounded, front rounded and back rounded vowels. The study aimed to provide a foundation for interpreting the acoustic consequences of articulatory manoeuvres not only at the larynx but also elsewhere in the vocal tract. Results showed the expected pattern of lower larynx position for the rounded vowels. However, no clear preference emerged for the same, more, or less larynx lowering on front rounded versus back rounded vowels. Coarticulatory effects of the flanking consonants were weak. The most striking result was that the magnitude of the differences between vowels varied substantially across speakers. This reinforces the contention that interpretation of vertical laryngeal gestures must be embedded in speaker-specific analysis of downstream articulatory manoeuvres. Work in this direction is currently in progress.

SL981097.PDF (From Author) SL981097.PDF (Rasterized)



Analyzing the Effect of Secondary Excitations of the Vocal Tract on Vocal Intensity in Different Loudness Conditions

Authors:

Paavo Alku, University of Turku (Finland)
Juha Vintturi, Helsinki Univ. Central Hospital (Finland)
Erkki Vilkman, University of Oulu (Finland)

Page (NA) Paper number 67

Abstract:

For voiced speech, the main excitation of the vocal tract occurs at the end of the glottal closing phase, when the rate of change of the flow reaches its absolute maximum. This study presents a straightforward method that yields a numerical value to characterize the effect of the main excitation on vocal intensity. The method, Energy Ratio by Modified Excitation (ERME), takes advantage of the glottal flow and the model of the vocal tract transfer function given by inverse filtering, and it synthesizes two signals based on the source-filter theory. The first signal is synthesized using the glottal flow waveform given by inverse filtering per se. The second signal is synthesized after removing the main excitation from the differentiated glottal flow. ERME is defined as the ratio between the energy of the first synthesized signal and the energy of the second one. It is shown that when the loudness of speech increases, the value of ERME first rises, but in the case of loud voices it starts to decrease. This behavior of ERME shows that the effects of secondary excitations of the vocal tract that occur during glottal opening become important in the production of loud voices.
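The energy ratio described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the main excitation is removed, so the sketch assumes it is removed by zeroing a few samples around the largest-magnitude sample of the differentiated glottal flow. The function names (`all_pole_filter`, `energy`, `erme`), the `half_width` parameter, and the all-pole form of the vocal tract model are likewise assumptions made for illustration.

```python
def all_pole_filter(excitation, a):
    """Filter `excitation` through 1 / (1 + a[0] z^-1 + a[1] z^-2 + ...).

    The all-pole form stands in for the vocal tract model obtained by
    inverse filtering (an assumption of this sketch).
    """
    out = []
    for n, x in enumerate(excitation):
        y = x
        for k, coeff in enumerate(a, start=1):
            if n - k >= 0:
                y -= coeff * out[n - k]
        out.append(y)
    return out


def energy(signal):
    """Sum of squared samples."""
    return sum(s * s for s in signal)


def erme(flow_derivative, a, half_width=2):
    """Energy ratio between the full and the modified synthesis.

    `flow_derivative` is one cycle of the differentiated glottal flow.
    The main excitation is taken to be the sample of largest magnitude;
    it is removed by zeroing `half_width` samples on either side of it
    (a hypothetical removal rule, not taken from the paper).
    """
    peak = max(range(len(flow_derivative)),
               key=lambda i: abs(flow_derivative[i]))
    modified = list(flow_derivative)
    start = max(0, peak - half_width)
    stop = min(len(modified), peak + half_width + 1)
    for i in range(start, stop):
        modified[i] = 0.0

    full = all_pole_filter(flow_derivative, a)   # first synthesized signal
    reduced = all_pole_filter(modified, a)       # main excitation removed
    reduced_energy = energy(reduced)
    if reduced_energy == 0.0:
        raise ValueError("no energy left after removing the main excitation")
    return energy(full) / reduced_energy


# Toy cycle: weak positive slope during opening, sharp negative spike at closure.
toy_cycle = [0.1] * 10 + [-1.0] + [0.0] * 20
toy_ratio = erme(toy_cycle, [-0.5], half_width=0)
```

Under these assumptions, a ratio well above 1 means the main excitation dominates the output energy, while a ratio approaching 1 means the remaining (secondary) excitations account for most of it.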

SL980067.PDF (From Author) SL980067.PDF (Rasterized)



An Analysis of Modal Coupling Effects During the Glottal Cycle: Formant Synthesizers from Time-Domain Finite-Difference Simulations

Authors:

Gordon Ramsay, ICP-INPG (France)

Page (NA) Paper number 670

Abstract:

Speech is typically modelled using time-domain or frequency-domain simulations of the acoustic field in the vocal tract. Using a biorthogonal modal decomposition, it is shown that time-domain finite-difference simulations can be transformed algebraically into equivalent formant synthesizers, the parameters of which vary in time and are calculated directly from the laws of physics. Examining the structure of the equivalent formant synthesizer, it is observed that formant excitation is largely due to internal modal coupling effects, induced by rapid perturbation of the acoustic eigenmodes caused by vibration of the glottis, and does not rely precisely on external sources provided by boundary conditions. This leads to a novel interpretation and justification of traditional models of the glottal source.

SL980670.PDF (From Author) SL980670.PDF (Rasterized)



Laryngoscopic Analysis of Pharyngeal Articulations and Larynx-Height Voice Quality Settings

Authors:

John H. Esling, University of Victoria (Canada)

Page (NA) Paper number 617

Abstract:

Using fibreoptic laryngoscopy to observe pharyngeal articulations, the aryepiglottic sphincter mechanism is shown to be responsible for the production of speech sounds in the phonetic category "pharyngeal." Major differences in auditory/acoustic quality are also produced when the larynx as a whole is raised or lowered during the production of pharyngeals. The voiceless pharyngeal fricative and voiced pharyngeal approximant are the result of increased sphincteric constriction of the laryngeal "tube" in a continuum that begins with normal glottal stop and ventricular fold closure. A pharyngeal stop is produced when the aryepiglottic sphincter mechanism achieves complete closure, and trilling accompanying friction is evident at the pharyngeal place of articulation in both voiceless and voiced modes. It is suggested that all five sounds share a common, pharyngeal place of articulation, but differ in manner of articulation. Raised larynx is the default setting for these articulations, but they may be produced with lowered larynx.

SL980617.PDF (From Author) SL980617.PDF (Rasterized)



Effects of Shapes of Radiational Aperture on Radiation Characteristics

Authors:

Hiroki Matsuzaki, Hokkai-Gakuen University (Japan)
Kunitoshi Motoki, Hokkai-Gakuen University (Japan)
Nobuhiro Miki, Hokkaido University (Japan)

Page (NA) Paper number 656

Abstract:

The acoustic characteristics of acoustic tubes with protrusions at the radiation end are computed by FEM simulation. In the first experiment, two protrusion shapes, one symmetrical and one asymmetrical with respect to the vertical, are investigated. Frequency characteristics of the radiation impedance are computed from the simulation results. The FEM simulation results are in good agreement with our measurement results. The proposed 3-D radiational model is useful for analysis of the acoustic characteristics of human speech. In the second experiment, the protrusion is attached to our 3-D vocal tract model. The vocal tract shape corresponds to the Japanese vowel /a/. The cross sections of the tubes are elliptic in shape. The simulation results show that the vocal tract transfer function obtained by FEM differs from both our previous FEM results and the 1-D analytical solution.

SL980656.PDF (From Author) SL980656.PDF (Rasterized)
