Neural Networks, Fuzzy and Evolutionary Methods 3

1: ICSLP'98 Proceedings

Fuzzy-Integration Based Normalization for Speaker Verification

Authors:

Tuan Pham, Faculty of Information Sciences & Engineering, University of Canberra (Australia)
Michael Wagner, Faculty of Information Sciences & Engineering, University of Canberra (Australia)

Page (NA) Paper number 953

Abstract:

Similarity or likelihood normalization techniques are important for speaker verification systems because they help to alleviate the variations in speech signals. In conventional normalization, the a priori probabilities of the cohort speakers are assumed to be equal. We instead apply the theory of fuzzy measures and fuzzy integrals to combine the likelihood values of the cohort speakers, relaxing the assumption of equal a priori probabilities. This approach replaces the conventional normalization term with a fuzzy integral, which acts as a non-linear fusion of the similarity measures of an utterance assigned to the cohort speakers. We illustrate the performance of the proposed approach by testing a speaker verification system with both the conventional and the fuzzy algorithms on the commercial speech corpus TI46. The results, in terms of equal error rates, show that the speaker verification system using the fuzzy integral is more flexible and more favorable than the conventional normalization method.
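
As a rough illustration of the idea, the sketch below replaces the equal-weight cohort average with a discrete Choquet integral. The abstract does not say which fuzzy integral or fuzzy measure the authors use, so the Choquet form, the power-law measure, and the names `choquet_integral`, `normalized_score`, `densities`, and `power` are all assumptions made for illustration.

```python
import math


def choquet_integral(values, densities, power=1.0):
    """Discrete Choquet integral of `values` (assumed form, not the paper's).

    The fuzzy measure of a subset A is assumed to be
    mu(A) = (sum of densities in A) ** power, with densities summing to 1.
    power == 1 gives an additive measure, i.e. a plain weighted average.
    """
    # Visit the cohort speakers from highest to lowest likelihood.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    total, mass, prev_mu = 0.0, 0.0, 0.0
    for i in order:
        mass += densities[i]
        mu = min(mass, 1.0) ** power      # measure of the top-ranked subset so far
        total += values[i] * (mu - prev_mu)
        prev_mu = mu
    return total


def normalized_score(claimed_loglik, cohort_logliks, densities, power=1.0):
    """Log-likelihood of the claimed speaker minus the log of the fused cohort term."""
    cohort_liks = [math.exp(ll) for ll in cohort_logliks]
    return claimed_loglik - math.log(choquet_integral(cohort_liks, densities, power))


# Equal densities with power=1 reduce to the conventional equal-prior average.
equal = [1 / 3, 1 / 3, 1 / 3]
conventional = choquet_integral([0.2, 0.4, 0.6], equal, power=1.0)
# A non-additive measure (power=2) fuses the same likelihoods non-linearly.
fuzzy = choquet_integral([0.2, 0.4, 0.6], equal, power=2.0)
```

With `power=1.0` and equal densities the fused term is the arithmetic mean of the cohort likelihoods; unequal densities or `power != 1.0` drop the equal-prior assumption.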

SL980953.PDF (From Author) SL980953.PDF (Rasterized)


Improving The Generalization Performance Of The MCE/GPD Learning

Authors:

Hiroshi Shimodaira, Japan Advanced Institute of Science and Technology (Japan)
Jun Rokui, Japan Advanced Institute of Science and Technology (Japan)
Mitsuru Nakai, Japan Advanced Institute of Science and Technology (Japan)

Page (NA) Paper number 795

Abstract:

A novel method is proposed to prevent over-fitting and improve the generalization performance of Minimum Classification Error (MCE) / Generalized Probabilistic Descent (GPD) learning. The MCE/GPD method, one of the newest discriminative-learning approaches, proposed by Katagiri and Juang in 1992, gives better recognition performance in various areas of pattern recognition than the maximum-likelihood (ML) based approach, where a posteriori probabilities are estimated. Despite its superior recognition performance, it still suffers from over-fitting to the training samples, as other learning algorithms do. In the present study, a regularization technique is applied to the MCE method to overcome this problem. Feed-forward neural networks are employed as a recognition platform to evaluate the recognition performance of the proposed method. Recognition experiments are conducted on several kinds of datasets. The proposed method shows better generalization performance than the original one.
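
The general shape of a regularized MCE objective can be sketched as follows. The abstract does not state which regularizer is used, so the L2 weight penalty here is an assumption, as are the function names and the parameters `gamma`, `eta`, and `lam`. The misclassification measure and sigmoid loss follow the commonly cited MCE formulation.

```python
import math


def misclassification_measure(scores, label, eta=1.0):
    """d = -g_label + (1/eta) * log(mean over rivals of exp(eta * g_j))."""
    rivals = [s for j, s in enumerate(scores) if j != label]
    m = max(rivals)  # subtract the maximum for numerical stability
    mean_exp = sum(math.exp(eta * (s - m)) for s in rivals) / len(rivals)
    return -scores[label] + m + math.log(mean_exp) / eta


def mce_loss(scores, label, gamma=1.0, eta=1.0):
    """Smooth 0-1 loss: a sigmoid of the misclassification measure."""
    d = misclassification_measure(scores, label, eta)
    return 1.0 / (1.0 + math.exp(-gamma * d))


def regularized_mce_objective(batch, weights, lam=0.01, gamma=1.0, eta=1.0):
    """Mean MCE loss plus an assumed L2 penalty lam * ||w||^2."""
    data_term = sum(mce_loss(s, y, gamma, eta) for s, y in batch) / len(batch)
    penalty = lam * sum(w * w for w in weights)
    return data_term + penalty


# Hypothetical discriminant scores for a three-class problem.
correct = mce_loss([3.0, 0.0, 0.0], label=0)   # true class scores highest
wrong = mce_loss([3.0, 0.0, 0.0], label=1)     # true class is outscored
```

A correctly classified sample gives a loss below 0.5 and a misclassified one a loss above 0.5. The penalty term discourages large weights, which is one common way to limit over-fitting.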

SL980795.PDF (From Author) SL980795.PDF (Rasterized)


Acoustic Speech Recognition Model by Neural Net Equation with Competition and Cooperation

Authors:

Tetsuro Kitazoe, Miyazaki University (Japan)
Tomoyuki Ichiki, Miyazaki University (Japan)
Sung-Ill Kim, Miyazaki University (Japan)

Page (NA) Paper number 965

Abstract:

The neural net equation for stereo vision is applied to speech recognition. We use the Coupled Pattern Recognition (CPR) equation, which has been shown to organize depth perception very well through competition and cooperation. We construct a Gaussian probability density function (pdf) for each phoneme from a number of training data. The input data to be recognized are compared with the pdfs, and a similarity measure is obtained for each phoneme. The CPR equation develops neuron activities by receiving the similarity measures as input. Recognition is achieved when the activities arrive at a stable state. The average recognition rate for 25 Japanese phonemes is 74.75%, compared with 71.53% for a Hidden Markov Model. A technical improvement is then applied to our neuron model by dividing the data of a phoneme into two parts, one for the former frames and the other for the latter frames. A remarkable improvement is obtained, with an average recognition rate of 79.79%.

SL980965.PDF (Scanned)


Improved Surname Pronunciations Using Decision Trees

Authors:

Julie Ngan, Institute for Signal and Information Processing, Mississippi State University (USA)
Aravind Ganapathiraju, Institute for Signal and Information Processing, Mississippi State University (USA)
Joseph Picone, Institute for Signal and Information Processing, Mississippi State University (USA)

Page (NA) Paper number 384

Abstract:

Proper noun pronunciation generation is a particularly challenging problem in speech recognition, since a large percentage of proper nouns defy typical letter-to-sound conversion rules. In this paper, we present decision tree methods that outperform neural network techniques. Using the decision tree method, we have achieved an overall error rate of 45.5%, which is a 35% reduction relative to previous techniques. Our best system is a binary decision tree that uses a context length of 3 and employs information gain ratio as the splitting rule.
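
The splitting rule named above, information gain ratio, is the information gain of a candidate split divided by the entropy of the split itself. A minimal sketch follows. The toy letter-to-phoneme data and the names `entropy` and `gain_ratio` are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of `items`."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(labels, branches):
    """Gain ratio of a split; branches[i] is the branch that sample i falls into."""
    n = len(labels)
    groups = {}
    for label, branch in zip(labels, branches):
        groups.setdefault(branch, []).append(label)
    # Information gain: entropy before the split minus weighted entropy after it.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    # Split information penalizes splits that fragment the data.
    split_info = entropy(branches)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical task: phoneme for the letter "c", split by a binary question
# such as "is the next letter a front vowel?" (1 = yes, 0 = no).
phonemes = ["k", "k", "s", "s"]
informative = gain_ratio(phonemes, [0, 0, 1, 1])    # question separates k from s
uninformative = gain_ratio(phonemes, [0, 1, 0, 1])  # question tells us nothing
```

A binary tree would evaluate each candidate question over the letter context this way and pick the one with the highest ratio at each node.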

SL980384.PDF (From Author) SL980384.PDF (Rasterized)
