Authors:
Axel Glaeser, Ascom Systec Ltd. (Switzerland)
Page (NA) Paper number 133
Abstract:
We present a Modular Neural Network (MNN) for phoneme recognition within
the framework of a hybrid system (neural networks and HMMs) for speaker-independent
single word recognition. With this approach, we take the computational
effort into account, using it as an additional criterion for assessing
system performance. The main idea of the proposed MNN is to distribute
the complexity of the phoneme classification task over a set of modules.
Each of these modules is a single neural network which is characterized
by its high degree of specialization. A modular architecture increases
the number of interfaces, and with it the opportunities for incorporating
external acoustic-phonetic knowledge. Moreover, once a suitable topology
for the MNN has been developed, each of the modules
can be optimized for its specific phoneme recognition task. This is
done by detecting and pruning irrelevant input parameters, and it leads
to a more efficient system in terms of memory and computational requirements.
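The pruning step described above can be illustrated with a minimal sketch. The abstract does not say how irrelevant inputs are detected, so the saliency measure used here (the sum of absolute first-layer weights per input) and the names `input_saliency`, `prune_inputs` and `threshold` are assumptions chosen for illustration only, not the author's method.

```python
def input_saliency(first_layer_weights):
    """Return one saliency score per input: the sum of absolute weights
    leaving that input. Rows are hidden units, columns are inputs.
    (Assumed relevance measure; the abstract does not specify one.)"""
    n_inputs = len(first_layer_weights[0])
    return [sum(abs(row[i]) for row in first_layer_weights)
            for i in range(n_inputs)]


def prune_inputs(first_layer_weights, threshold):
    """Drop every input whose saliency falls below `threshold`.
    Returns the indices kept and the reduced weight matrix."""
    saliency = input_saliency(first_layer_weights)
    keep = [i for i, s in enumerate(saliency) if s >= threshold]
    pruned = [[row[i] for i in keep] for row in first_layer_weights]
    return keep, pruned


# Hypothetical module with 3 inputs and 2 hidden units.
weights = [[0.8, 0.01, -0.5],
           [-0.6, 0.02, 0.7]]
keep, pruned = prune_inputs(weights, threshold=0.1)
print(keep)    # indices of the inputs that survive pruning
print(pruned)  # first-layer weights restricted to those inputs
```

Removing an input column shrinks both the stored weights and the multiply-accumulate count of the first layer, which is where the memory and computation savings mentioned in the abstract would come from.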
Authors:
João F.G. de Freitas, Cambridge University (U.K.)
Sue E. Johnson, Cambridge University (U.K.)
Mahesan Niranjan, Cambridge University (U.K.)
Andrew H. Gee, Cambridge University (U.K.)
Page (NA) Paper number 213
Abstract:
We propose a novel strategy for training neural networks using sequential
Monte Carlo algorithms. This global optimisation strategy allows us
to learn the probability distribution of the network weights in a sequential
framework. It is well suited to applications involving on-line, nonlinear
or non-stationary signal processing. We show how the new algorithms
can outperform extended Kalman filter (EKF) training.
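As a rough illustration of sequential Monte Carlo training, the sketch below runs a plain bootstrap particle filter over the weights of a one-unit network. It is not the authors' algorithm: the tiny `tanh` model, the random-walk weight prior, the Gaussian noise levels `q_std` and `r_std`, and all function names are assumptions made for this example.

```python
import math
import random


def predict(w, x):
    # Stand-in "network": a single tanh unit with weight w[0] and bias w[1].
    return math.tanh(w[0] * x + w[1])


def smc_step(particles, x, y, rng, q_std=0.05, r_std=0.1):
    """One sequential update of the weight particles for the pair (x, y)."""
    # 1. Propagate each particle with a random-walk prior on the weights.
    moved = [[wi + rng.gauss(0.0, q_std) for wi in w] for w in particles]
    # 2. Weight each particle by the Gaussian log-likelihood of the target.
    logw = [-0.5 * ((y - predict(w, x)) / r_std) ** 2 for w in moved]
    top = max(logw)  # subtract the maximum to avoid underflow
    unnorm = [math.exp(l - top) for l in logw]
    total = sum(unnorm)
    probs = [u / total for u in unnorm]
    # 3. Resample (multinomial) in proportion to the importance weights.
    resampled = rng.choices(moved, weights=probs, k=len(moved))
    return [list(w) for w in resampled]


def train(data, n_particles=200, seed=0):
    """Process the data one sample at a time and return the final particles."""
    rng = random.Random(seed)
    particles = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
                 for _ in range(n_particles)]
    for x, y in data:
        particles = smc_step(particles, x, y, rng)
    return particles


def posterior_mean(particles):
    n = len(particles)
    return [sum(w[i] for w in particles) / n
            for i in range(len(particles[0]))]


# Synthetic data from hypothetical "true" weights (1.5, -0.5).
data = [(x / 10.0, math.tanh(1.5 * (x / 10.0) - 0.5)) for x in range(-20, 21)]
print(posterior_mean(train(data)))
```

The set of particles approximates a distribution over the weights rather than a single point estimate, and each sample is used once as it arrives, which is the on-line setting the abstract refers to.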
Authors:
Jörg Rottland, Duisburg University (Germany)
Andre Lüdecke, Duisburg University (Germany)
Gerhard Rigoll, Duisburg University (Germany)
Page (NA) Paper number 331
Abstract:
This paper describes how to train Maximum Mutual Information Neural
Networks (MMINNs) efficiently using a new topology. Large vocabulary
speech recognition systems based on a hybrid MMI/connectionist HMM
combination have shown good performance on several tasks (RM and WSJ).
MMINNs are trained to maximize the mutual information between the index
of the winning output neuron (Winner-Takes-All network) and the phonetic
class of the corresponding acoustic frame. One major problem of MMI neural
networks is the high computational effort needed for training.
This effort is proportional
to the input and output size of the neural network and to the number
of training samples. This paper presents two approaches that show
how these long training times can be reduced with very little or even
no loss in recognition accuracy. This is achieved by using phonetic
knowledge to build a network topology based on phonetic classes.
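The training criterion named above, the mutual information between the winning neuron index and the phonetic class, can be computed from a table of co-occurrence counts. The sketch below shows only this quantity, not the training procedure or the proposed topology; the function name `mutual_information` and the toy count tables are assumptions for illustration.

```python
import math


def mutual_information(counts):
    """Mutual information in bits between the row variable (winning neuron
    index) and the column variable (phonetic class), estimated from a table
    of frame counts: I = sum_{y,c} p(y,c) * log2(p(y,c) / (p(y) * p(c)))."""
    total = sum(sum(row) for row in counts)
    p_row = [sum(row) / total for row in counts]
    p_col = [sum(row[j] for row in counts) / total
             for j in range(len(counts[0]))]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n > 0:  # empty cells contribute nothing to the sum
                p = n / total
                mi += p * math.log2(p / (p_row[i] * p_col[j]))
    return mi


# Toy tables: 2 output neurons (rows) by 2 phonetic classes (columns).
informative = [[5, 0], [0, 5]]    # winner index determines the class
uninformative = [[2, 2], [2, 2]]  # winner index is independent of the class
print(mutual_information(informative))
print(mutual_information(uninformative))
```

Every frame contributes to this table, and the table has one cell per pair of output neuron and class, which is consistent with the abstract's remark that the training effort grows with the network size and the number of training samples.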
Authors:
Sid-Ahmed Selouani, IE/Houari Boumedienne University of Science and Technology (Algeria)
Jean Caelen, IMAG/UJF (France)
Page (NA) Paper number 358
Abstract:
This paper concerns the identification of Arabic macro-classes and
phonetic features by systems using a hierarchy of neural networks.
These systems are composed of sub-neural-networks (SNNs) that carry
out binary discrimination sub-tasks. Two types of architecture are
presented: a serial structure of experts and a parallel arrangement of
experts. This mixture of experts typically consists of time delay
neural networks trained with a version of the autoregressive backpropagation
algorithm (AR-TDNN). These hierarchical configurations are compared with a monolithic
system using the standard backpropagation learning procedure. The test
database consists of 60 VCV utterances and 50 phrases pronounced by
6 native Algerian speakers. The parallel configuration achieved a much
lower error rate than the other architectures (13% vs. 16% and 28%). The
parallel mixture of experts is incorporated in a hybrid structure (HMM-SNN)
in order to enhance the performance of standard HMMs. Identification
results show that the hybrid system reduces the error rate by 10%.