ICSLP'98 Proceedings
A Three-Dimensional Linear Articulatory Model Based on MRI Data
Authors:
Pierre Badin, Institut de la Communication Parlée, UPRESA CNRS 5009, INPG - Univ. Stendhal, Grenoble (France)
Page (NA) Paper number 14
Abstract:
Based on a set of 3D vocal tract images obtained by MRI, a 3D linear articulatory model has been built using guided Principal Component Analysis. It constitutes an extension to the lateral dimension of the mid-sagittal model previously developed from a radiofilm recorded on the same subject. The parameters of the 2D model have been found to be good predictors of the 3D shapes, for most configurations. A first evaluation of the model in terms of area functions and formants is presented.
0532_01.PDF (was: 0532_1.GIF) | MRI image | File type: Image | File format: GIF | Tech. description: 514x510 pixels, 150 dpi, 113k | Creating application: Photoshop | Creating OS: Mac OS 8.1
0532_02.PDF (was: 0532_2.GIF) | MRI image | File type: Image | File format: GIF | Tech. description: 277x228 pixels, 72 dpi, 27k | Creating application: Photoshop | Creating OS: Mac OS 8.1
0532_03.PDF (was: 0532_3.GIF) | 3-D reconstruction | File type: Image | File format: GIF | Tech. description: 627x385 pixels, 72 dpi, 8 bits/pixel | Creating application: Photoshop | Creating OS: Mac OS 8.1
Masafumi Matsumura, Osaka Electro-Communication University (Japan)
Takuya Niikawa, Osaka Electro-Communication University (Japan)
Takao Tanabe, Osaka Electro-Communication University (Japan)
Takashi Tachimura, Osaka University (Japan)
Takeshi Wada, Osaka University (Japan)
A 15-cantilever-type force-sensor unit is presented for measuring palatolingual contact stress and contact pattern during the phonation of palatal consonants. The force sensor unit is composed of a strain gauge and a cantilever, and is embedded in a thin palatal plate attached to the human hard palate. The unit is 3 mm wide, 5 mm long, and 1.3 mm thick. In the low stress range of 0-64 kPa (0-5 gw), the output of the force sensor unit is proportional to the applied stress, with almost no hysteresis. The measurement error of the force sensor is less than 1.7%, and the error caused by mechanical interference among the cantilever-type force sensors is less than 0.2%. The presented palatal plate, with its 15 cantilever-type force sensors, allows ready observation of the dynamics of palatolingual contact stress and contact patterns during the phonation of consonants.
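Because the abstract reports an output proportional to the applied stress over 0-64 kPa, a reading can in principle be converted back to stress with a single gain constant. The following is a minimal sketch of such a calibration, not the authors' procedure: the function names `fit_gain` and `output_to_stress`, the least-squares fit through the origin, and the range check are all assumptions made for illustration.

```python
def fit_gain(stresses_kpa, outputs):
    """Least-squares slope of a line through the origin: output = gain * stress.

    Hypothetical helper; the abstract only states that the output is
    proportional to the applied stress in the 0-64 kPa range.
    """
    sxx = sum(s * s for s in stresses_kpa)
    if sxx == 0:
        raise ValueError("need at least one non-zero calibration stress")
    return sum(s * o for s, o in zip(stresses_kpa, outputs)) / sxx


def output_to_stress(output, gain, max_kpa=64.0):
    """Convert a sensor reading to stress (kPa) using the fitted gain."""
    stress = output / gain
    # Proportionality is only reported for 0-64 kPa, so reject anything else.
    if not 0.0 <= stress <= max_kpa:
        raise ValueError("reading falls outside the calibrated 0-64 kPa range")
    return stress
```

Fitting through the origin matches the stated proportionality; a sensor with a measurable offset would need an intercept term as well.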
Tokihiko Kaburagi, NTT Basic Research Laboratories (Japan)
Masaaki Honda, NTT Basic Research Laboratories (Japan)
This paper presents a method for determining the vocal-tract spectrum from the positions of fixed points on the articulatory organs. The method searches a database of paired articulatory and acoustic data that represents the direct relationship between articulator position and vocal-tract spectrum. To compile the database, an electromagnetic articulograph (EMA) system is used to measure the movements of the jaw, lips, tongue, velum, and larynx simultaneously with the speech waveform. The spectrum is estimated by selecting the database samples that neighbor the input articulator position and interpolating between them. In addition, the input position is assigned a phoneme category so that the search can be restricted to database samples of the same category. Experiments show that the mean estimation error is 2.24 dB and that phoneme categorization improves the quality of speech synthesized from the estimated spectrum.
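The search-and-interpolate idea described in this abstract can be sketched as a nearest-neighbor lookup. The abstract does not state the distance measure, the number of neighbors, or the interpolation scheme, so the Euclidean distance, the parameter `k`, and the inverse-distance weighting below are assumptions, as are the function name `estimate_spectrum` and the tuple layout of the database.

```python
import math


def estimate_spectrum(query, database, phoneme=None, k=3):
    """Estimate a spectrum for an articulator position (illustrative sketch).

    database: list of (phoneme_label, articulator_position, spectrum) tuples.
    phoneme:  if given, only samples with this label are searched.
    """
    # Restrict the search area to samples of the same phoneme category.
    candidates = [s for s in database if phoneme is None or s[0] == phoneme]
    if not candidates:
        raise ValueError("no database samples for this phoneme category")

    # Select the k samples closest to the input position (Euclidean distance).
    ranked = sorted(candidates, key=lambda s: math.dist(query, s[1]))[:k]

    # Interpolate the selected spectra with inverse-distance weights.
    weights = []
    for _, position, spectrum in ranked:
        d = math.dist(query, position)
        if d == 0.0:
            return list(spectrum)  # exact match: no interpolation needed
        weights.append(1.0 / d)
    total = sum(weights)
    n_bins = len(ranked[0][2])
    return [
        sum(w * s[2][i] for w, s in zip(weights, ranked)) / total
        for i in range(n_bins)
    ]
```

Passing `phoneme` narrows the candidate set before the distance ranking, which mirrors the restriction of the search area described above; leaving it as `None` searches the whole database.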
0425_01.WAV (was: 0425_01.WAV) | Speech samples synthesized with phoneme categorization are included in the CD-ROM [SOUND 0425_01.WAV] [SOUND 0425_02.WAV] [SOUND 0425_03.WAV]. | File type: Sound | File format: WAV | Tech. description: None | Creating application: Unknown | Creating OS: Unknown
0425_02.WAV (was: 0425_02.WAV) | Speech samples synthesized with phoneme categorization are included in the CD-ROM [SOUND 0425_01.WAV] [SOUND 0425_02.WAV] [SOUND 0425_03.WAV]. | File type: Sound | File format: WAV | Tech. description: None | Creating application: Unknown | Creating OS: Unknown
0425_03.WAV (was: 0425_03.WAV) | Speech samples synthesized with phoneme categorization are included in the CD-ROM [SOUND 0425_01.WAV] [SOUND 0425_02.WAV] [SOUND 0425_03.WAV]. | File type: Sound | File format: WAV | Tech. description: None | Creating application: Unknown | Creating OS: Unknown
Kiyoshi Honda, ATR Human Information Processing Labs. (Japan)
Mark K. Tiede, ATR Human Information Processing Labs. (Japan)
Individual variation in larynx position reflects human morphological differences and thus contributes to the biological information carried by speech sounds. This study examines the factors of orofacial morphology that co-vary with larynx position, based on MRI data collected from 12 Japanese and 12 English speakers. The materials are midsagittal craniofacial images, and the method is based on the measurement of angles and indices. Among all the measures examined, the aspect ratio of the oral cavity in the lateral view showed the highest correlation (r=0.87) with the larynx height index (the ratio of the arytenoid to palatal plane distance and the anterior nasal spine to nasopharyngeal wall distance), and a facial angle (the angle formed by the maxillary incisor, nasion, and nasopharyngeal wall) showed the second highest correlation (r=0.66) with the larynx height index. The results indicate that larynx position co-varies with oral cavity shape, being higher when the oral cavity is flatter and more prognathic.
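The two computations named in this abstract, a ratio-based index and a correlation coefficient, can be sketched as follows. This is an illustration rather than the authors' analysis: the function names are hypothetical, the index is assumed to be the first distance divided by the second, and the correlation is assumed to be the ordinary Pearson coefficient.

```python
import math


def larynx_height_index(arytenoid_to_palatal_plane, ans_to_nasopharyngeal_wall):
    """Ratio of the two midsagittal distances (same units for both).

    Assumes the index is the first distance divided by the second;
    the abstract does not spell out the order of the ratio.
    """
    return arytenoid_to_palatal_plane / ans_to_nasopharyngeal_wall


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

With one index value and one oral-cavity aspect ratio per speaker, `pearson_r` applied to the two lists would yield a coefficient of the kind reported above.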