ABSTRACT
A number of experiments have shown the importance of speech production models for automatic speech recognition ([1], [4], [6]). Such models are of particular interest because they offer a concise representation of coarticulation phenomena in continuous speech. Maeda's statistical model [5] was chosen for our experiments. The first part of the paper focuses on optimizing the model configurations that characterize the French vocalic sounds, so as to minimize the acoustic distance to the phonemes produced by a speaker. The second part provides a control strategy for the command parameters of Maeda's model.
ABSTRACT
This paper will present three-dimensional tongue "volumes," reconstructed from three sagittal slices (left, mid, right) made using tagged cine MRI. The volumes will be animated to show CV movement from the consonants /k/ and /s/ to the vowels /i/, /a/, and /u/.
ABSTRACT
Many researchers regard articulation as an intermediate level of representation. In gestural phonetic theory, the units are articulatory gestures. To assess this theory against observed parameters, we have defined a robust labelling system (AMULET) for the multi-sensor ACCOR speech database. The main articulatory gestures sought are Voice Onset and Voice Termination, on both the acoustic and the laryngographic signals. We present here two efficient Voiced/Unvoiced/Silence detectors for the acoustic signals and a third one for the laryngographic signal.
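As an illustration of what a Voiced/Unvoiced/Silence decision on an acoustic signal can look like, here is a minimal sketch based on short-time energy and zero-crossing rate. It is a generic textbook heuristic, not the AMULET detectors; the function name `classify_frames`, the frame length, and both thresholds are assumptions chosen for the example.

```python
import math


def classify_frames(signal, frame_len=256, energy_floor=1e-4, zcr_threshold=0.25):
    """Label each non-overlapping frame 'S' (silence), 'U' (unvoiced) or 'V' (voiced).

    The thresholds are illustrative assumptions, not values from the paper.
    """
    labels = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        # Mean-square energy of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Fraction of adjacent sample pairs whose signs differ.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        zcr = crossings / (frame_len - 1)
        if energy < energy_floor:
            labels.append("S")   # too quiet to be speech
        elif zcr > zcr_threshold:
            labels.append("U")   # noisy, many sign changes
        else:
            labels.append("V")   # energetic and quasi-periodic
    return labels


# Synthetic check: silence, a 100 Hz tone sampled at 8 kHz, then a
# rapidly alternating signal standing in for frication noise.
silence = [0.0] * 256
voiced = [0.5 * math.sin(2 * math.pi * 100 * n / 8000) for n in range(256)]
unvoiced = [0.1 if n % 2 == 0 else -0.1 for n in range(256)]
labels = classify_frames(silence + voiced + unvoiced)
```

A detector for the laryngographic signal would need different features, since that signal is near zero whenever the vocal folds are not vibrating.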
ABSTRACT
The physiological background of sentence declination (the drop in fundamental frequency, F0) was studied using ultrasound (US) examination of the cricothyroid (CT) space. The US probe was placed anteriorly in the region of the middle cricothyroid ligament. The echoes caused by the antero-inferior edge of the thyroid cartilage and the antero-superior edge of the cricoid cartilage were used as points of measurement. The test utterances consisted of three- and five-word sentences. F0, sound level and CT space were measured from recordings. F0 declination and CT space widening showed a phase relationship. For example, in a long sentence in which F0 declined from 194 Hz to 85 Hz, the CT space widened by about 4 mm (from 0.83 cm to 1.25 cm). The correlation between F0 declination and CT space was r = -0.85. The main pitch-regulating system, connected with CT joint movements, seems to contribute to the production of sentence declination. These biomechanical events can be monitored using the US method.
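The figures quoted above can be checked with a few lines of arithmetic. The sketch below converts the reported CT space change to millimetres, expresses the F0 drop in semitones (a conversion added here for illustration, not one made in the abstract), and shows how a correlation coefficient is computed, assuming the reported r is a Pearson coefficient. The helper `pearson_r` and the toy data are hypothetical; the study's raw measurements are not reproduced.

```python
import math

# Values reported in the abstract.
f0_start_hz, f0_end_hz = 194.0, 85.0
ct_start_cm, ct_end_cm = 0.83, 1.25

# CT space widening in millimetres (about 4.2 mm, i.e. roughly 4 mm).
ct_change_mm = (ct_end_cm - ct_start_cm) * 10.0

# F0 drop on a logarithmic (semitone) scale.
f0_drop_semitones = 12.0 * math.log2(f0_start_hz / f0_end_hz)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy data only: F0 falls exactly linearly as the CT space widens,
# so the coefficient is -1 (the study reports a weaker r = -0.85).
toy_ct_cm = [0.80, 0.90, 1.00, 1.10, 1.20]
toy_f0_hz = [200.0, 170.0, 140.0, 110.0, 80.0]
toy_r = pearson_r(toy_ct_cm, toy_f0_hz)
```

A negative coefficient is what the abstract describes: F0 falls as the CT space widens.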
ABSTRACT
This paper presents an ultra-fast implementation of Turbo Spin Echo (TSE) that achieves continuous monitoring of the vocal tract with an actual time resolution of 4 images per second. We present preliminary results of two experiments involving coarticulation and articulatory compensations. articulations involved in speech production i.e. lips, tongue, larynx, lower jaw and velum.
ABSTRACT
The present study demonstrates that complete midsagittal tongue shapes can be reconstructed from the coordinates of three fleshpoints on the tongue and from the position of the jaw. The method is based on the inversion of an articulatory model built for the subject from cineradiographic images, and leads to an average reconstruction error of 1.26 mm.
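The idea of inverting a model from a few measured points can be sketched as a least-squares problem. The example below assumes a linear articulatory model (contour = mean + basis × parameters), which the abstract does not state; the function names, the toy basis, and all numbers are hypothetical and serve only to show the mechanics.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def invert_model(mean, basis, observed_idx, observed_vals):
    """Least-squares estimate of the model parameters from a subset of coordinates."""
    k = len(basis[0])
    rows = [basis[i] for i in observed_idx]
    resid = [v - mean[i] for i, v in zip(observed_idx, observed_vals)]
    # Normal equations: (A^T A) p = A^T r, with A the observed rows of the basis.
    ata = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    atr = [sum(row[i] * r for row, r in zip(rows, resid)) for i in range(k)]
    return solve_linear(ata, atr)


def reconstruct(mean, basis, params):
    """Rebuild every contour coordinate from the estimated parameters."""
    return [m + sum(b * p for b, p in zip(row, params)) for m, row in zip(mean, basis)]


# Toy model: 5 contour coordinates driven by 2 parameters.
mean = [1.0, 2.0, 3.0, 4.0, 5.0]
basis = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0], [2.0, 1.0]]
true_shape = reconstruct(mean, basis, [0.5, -1.0])

# Only coordinates 0, 2 and 4 are "measured" (standing in for fleshpoints).
observed_idx = [0, 2, 4]
params = invert_model(mean, basis, observed_idx, [true_shape[i] for i in observed_idx])
estimated_shape = reconstruct(mean, basis, params)
```

In this noise-free toy case the full shape is recovered exactly; with real measurements the fit is approximate, which is where a reconstruction error such as the one reported would come from.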