ABSTRACT
In this paper we show how a confusion matrix derived from phone identification experiments can be used to automatically generate phone clusters. These clusters can be applied when constructing triphone models to overcome the sparse data problem. Two techniques are presented: first a hierarchical clustering technique, then an open clustering technique. Both use mutual information, calculated on a probability distribution derived from the confusion matrix, as a measure of phone similarity. Sample results from each technique are presented.
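As an illustration of the idea, the following is a minimal sketch of bottom-up clustering driven by mutual information on a confusion matrix. It is not the paper's algorithm: the merge criterion (join the two classes whose merger loses the least mutual information), the function names, and the toy four-phone matrix are all assumptions made for this example.

```python
import math

def mutual_information(counts):
    """Mutual information (bits) between row and column labels of a count matrix."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c > 0:
                mi += (c / total) * math.log2(c * total / (row_sums[i] * col_sums[j]))
    return mi

def merge(counts, a, b):
    """Return a new matrix with class b folded into class a (requires a < b)."""
    rows = [list(r) for r in counts]
    rows[a] = [x + y for x, y in zip(rows[a], rows[b])]  # sum the two rows
    for r in rows:
        r[a] += r[b]                                     # sum the two columns
    keep = [k for k in range(len(counts)) if k != b]
    return [[rows[i][j] for j in keep] for i in keep]

def cluster(counts, labels):
    """Greedy agglomerative clustering; returns (group_a, group_b, mi_loss) per merge."""
    counts = [list(r) for r in counts]
    groups = [(label,) for label in labels]
    merges = []
    while len(groups) > 1:
        base = mutual_information(counts)
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                loss = base - mutual_information(merge(counts, a, b))
                if best is None or loss < best[0]:
                    best = (loss, a, b)
        loss, a, b = best
        merges.append((groups[a], groups[b], loss))
        groups[a] = groups[a] + groups[b]
        del groups[b]
        counts = merge(counts, a, b)
    return merges

# Hypothetical confusion counts (rows: spoken phone, columns: identified phone).
labels = ["p", "b", "s", "z"]
confusions = [
    [40, 10, 0, 0],
    [10, 40, 0, 0],
    [0, 0, 45, 5],
    [0, 0, 5, 45],
]
merges = cluster(confusions, labels)
for left, right, loss in merges:
    print(left, "+", right, "loss = %.3f bits" % loss)
```

On this toy matrix the most frequently confused pair, p and b, is merged first, then s and z, and finally the two resulting clusters; the recorded merge order defines the cluster hierarchy.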
ABSTRACT
The purpose of this paper is to explain how the emergence of a common phonetic code in a society of speech robots can be simulated using evolutionary techniques. Simulations of the prediction of vowel systems and the explanation of the Maximum Use of Available distinctive Features (MUAF) principle are discussed. These experimental results show how simple local rules of interaction between robots may explain some of the universal characteristics of the phonological structure of the world's languages. Ongoing work aiming to answer more complex questions, such as the use of supplementary features in large vowel systems, is presented.
ABSTRACT
Our paper addresses the question of cross-linguistic similarities and differences in the articulatory patterns of plosives. An EPG investigation of the English and Norwegian plosives /t/ and /d/ shows a larger contact area between tongue and palate for /t/ than for /d/ in both languages. The investigation also shows a more laminal articulation, with larger contact areas, for both plosives in Norwegian than in English. We suggest that the same general phonetic-physiological factors may explain the larger contact areas for /t/ than for /d/ in both languages. The oral air pressure is higher during the articulation of /t/ than of /d/. In order to prevent air from escaping between the tongue and the palate, a firmer contact is needed for voiceless than for voiced plosives. The larger contact areas for the Norwegian plosives compared to the English ones are interpreted as the result of different phonological patterns in the two languages.
ABSTRACT
An analysis-by-synthesis method for finding formant bandwidths from vowel spectra has been implemented on a solar-powered computer used in fieldwork, thus enabling linguists to test hypotheses about differences between sets of vowels while working with speakers of the language. The procedure has been tested on the two sets of vowels that occur in Degema, a language spoken in Nigeria.
ABSTRACT
In this study, we develop data-based word juncture models, which account for the pronunciation variations at word boundaries, as an optional form of phonological rules. We used the American English TIMIT database. Issues in generating the models and using them in a continuous recognition task are discussed. The coverage of the pronunciation variations by the models is compared with that of a set of phonological rules. The models and the rules agree fairly well in predicting the pronunciation variations, although the models cover a larger set of variation phenomena. Furthermore, use of the models improved recognition performance.
ABSTRACT
A production experiment was conducted in order to examine the acquisition of English intonation by native speakers of Japanese, and the results were analyzed within the framework developed by Pierrehumbert [3] and her colleagues. The results suggest that second language intonation is acquired on two different levels: learners first acquire the categorical patterns of the foreign intonation, and only later learn to produce native-like continuous intonational streams. This supports models in which the speech cognitive system is split into two sub-modules: a phonological component (characterized by categorical units) and a phonetic component (implementing the phonological units as a continuous articulatory/acoustic stream).