Neural Networks, Fuzzy and Evolutionary Methods 2

Modular Neural Networks for Low-Complex Phoneme Recognition

Authors:

Axel Glaeser, Ascom Systec Ltd. (Switzerland)

Page (NA) Paper number 133

Abstract:

We present a Modular Neural Network (MNN) for phoneme recognition within a hybrid system (neural networks and HMMs) for speaker-independent single-word recognition. In this approach, computational effort serves as an additional criterion for assessing system performance. The main idea of the proposed MNN is to distribute the complexity of the phoneme classification task over a set of modules. Each module is a single, highly specialized neural network. A modular architecture also increases the number of interfaces, and with it the opportunities for incorporating external acoustic-phonetic knowledge. Moreover, once a suitable topology for the MNN has been developed, each module can be optimized for its specific phoneme recognition task. This is done by detecting and pruning irrelevant input parameters, which leads to a more efficient system in terms of memory and computational requirements.

SL980133.PDF (From Author) SL980133.PDF (Rasterized)



Global Optimisation of Neural Network Models Via Sequential Sampling-Importance Resampling

Authors:

João F.G. de Freitas, Cambridge University (U.K.)
Sue E. Johnson, Cambridge University (U.K.)
Mahesan Niranjan, Cambridge University (U.K.)
Andrew H. Gee, Cambridge University (U.K.)

Page (NA) Paper number 213

Abstract:

We propose a novel strategy for training neural networks using sequential Monte Carlo algorithms. This global optimisation strategy allows us to learn the probability distribution of the network weights in a sequential framework. It is well suited to applications involving on-line, nonlinear or non-stationary signal processing. We show how the new algorithms can outperform extended Kalman filter (EKF) training.
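As a rough illustration of the sequential sampling-importance resampling idea named in the title, the following minimal sketch tracks a distribution over a single weight of a toy model y = tanh(w * x). The toy model, the random-walk step, the Gaussian noise level, the prior, and all function names are assumptions made for this example; they are not taken from the paper.

```python
import math
import random


def sir_train(data, n_particles=500, walk_std=0.05, noise_std=0.1, seed=0):
    """Sequentially estimate the weight w of the toy model y = tanh(w * x).

    Each particle is one candidate weight. All settings are illustrative
    assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    # Initial particles drawn from a broad, assumed Gaussian prior.
    particles = [rng.gauss(0.0, 2.0) for _ in range(n_particles)]
    estimates = []
    for x, y in data:
        # Prediction step: diffuse the weights with a small random walk.
        particles = [w + rng.gauss(0.0, walk_std) for w in particles]
        # Importance step: Gaussian log-likelihood of the new observation.
        log_w = [-0.5 * ((y - math.tanh(w * x)) / noise_std) ** 2
                 for w in particles]
        top = max(log_w)  # subtract the maximum for numerical stability
        weights = [math.exp(lw - top) for lw in log_w]
        total = sum(weights)
        weights = [v / total for v in weights]
        # Posterior-mean estimate of the weight after this observation.
        estimates.append(sum(p * w for p, w in zip(weights, particles)))
        # Resampling step: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


def make_data(true_w=1.5, n=200, noise_std=0.1, seed=1):
    """Generate noisy samples from the toy model (hypothetical test data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        data.append((x, math.tanh(true_w * x) + rng.gauss(0.0, noise_std)))
    return data


estimates = sir_train(make_data())
```

Because the particles are updated one observation at a time, the same loop could in principle follow a slowly drifting weight, which is the on-line, non-stationary setting the abstract mentions.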

SL980213.PDF (From Author) SL980213.PDF (Rasterized)



Efficient Computation of MMI Neural Networks for Large Vocabulary Speech Recognition Systems

Authors:

Jörg Rottland, Duisburg University (Germany)
Andre Lüdecke, Duisburg University (Germany)
Gerhard Rigoll, Duisburg University (Germany)

Page (NA) Paper number 331

Abstract:

This paper describes how to train Maximum Mutual Information Neural Networks (MMINNs) efficiently using a new topology. Large vocabulary speech recognition systems based on a hybrid MMI/connectionist HMM combination have shown good performance on several tasks (RM and WSJ). MMINNs are trained to maximize the mutual information between the index of the winning output neuron (Winner-Takes-All network) and the phonetic class of the corresponding acoustic frame. One major problem of MMI neural networks is the high computational effort needed for training. This effort is proportional to the input and output size of the neural network and to the number of training samples. This paper presents two approaches that reduce these long training times with very little or even no loss in recognition accuracy. This is achieved by using phonetic knowledge to build a network topology based on phonetic classes.
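The quantity being maximized can be made concrete with a small sketch that estimates the mutual information, in bits, between winning-neuron indices and phonetic class labels from paired frame-level observations. It only evaluates the criterion on given labels and is not the training procedure from the paper; the function name and the toy labels are assumptions for illustration.

```python
import math
from collections import Counter


def mutual_information(winners, classes):
    """Empirical mutual information (bits) between two label sequences.

    winners[i] is the index of the winning output neuron for frame i and
    classes[i] is the phonetic class of that frame (hypothetical inputs).
    """
    if len(winners) != len(classes):
        raise ValueError("sequences must have the same length")
    n = len(winners)
    joint = Counter(zip(winners, classes))  # counts of (winner, class) pairs
    count_w = Counter(winners)              # marginal counts of winners
    count_c = Counter(classes)              # marginal counts of classes
    mi = 0.0
    for (w, c), count in joint.items():
        p_joint = count / n
        # p(w, c) / (p(w) * p(c)) written in terms of raw counts.
        ratio = count * n / (count_w[w] * count_c[c])
        mi += p_joint * math.log2(ratio)
    return mi


# Winners that identify the class exactly versus winners unrelated to it.
informative = mutual_information([0, 0, 1, 1], ["a", "a", "b", "b"])
uninformative = mutual_information([0, 1, 0, 1], ["a", "a", "b", "b"])
```

A winner index that determines the class carries the full class entropy, while a winner index that is statistically independent of the class carries none.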

SL980331.PDF (From Author) SL980331.PDF (Rasterized)



Modular Connectionist Systems for Identifying Complex Arabic Phonetic Features

Authors:

Sid-Ahmed Selouani, IE/Houari Boumedienne University of Science and Technology (Algeria)
Jean Caelen, IMAG/UJF (France)

Page (NA) Paper number 358

Abstract:

This paper concerns the identification of Arabic macro-classes and phonetic features by systems using a hierarchy of neural networks. These systems are composed of sub-neural-networks (SNNs) carrying out binary discrimination sub-tasks. Two types of architecture are presented: a serial structure of experts and a parallel arrangement of experts. This mixture of experts is typically composed of time delay neural networks using a version of the autoregressive backpropagation algorithm (AR-TDNN). These hierarchical configurations are compared with a monolithic system using the standard backpropagation learning procedure. The test database consists of 60 VCV utterances and 50 phrases pronounced by 6 native Algerian speakers. The parallel configuration achieved a lower error rate than the other architectures (13% vs. 16% and 28%). The parallel mixture of experts is incorporated into a hybrid structure (HMM-SNN) in order to enhance the performance of standard HMMs. Identification results show that a 10% reduction in error rate is obtained by the hybrid system.

SL980358.PDF (From Author) SL980358.PDF (Rasterized)
