Learning Theory and Algorithms II

Chair: Jose C. Principe, University of Florida, USA


Minimum Detection Error Training for Acoustic Signal Monitoring

Authors:

Hideyuki Watanabe, ATR Human Information Processing Labs (Japan)
Yuji Matsumoto, ATR Human Information Processing Labs (Japan)
Shigeru Katagiri, ATR Human Information Processing Labs (Japan)

Volume 2, Page 1193, Paper number 1533

Abstract:

In this paper we propose a novel approach to the detection of irregular acoustic signals using Minimum Detection Error (MDE) training. MDE training is based on the Generalized Probabilistic Descent method for discriminative pattern recognizer design. We demonstrate its fundamental utility through experiments in which several acoustic events are detected in a noisy environment.
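The core idea of GPD-style discriminative training is to replace the non-differentiable detection-error count with a smoothed (sigmoid) loss and descend its gradient. A minimal sketch for a two-class linear detector follows; the class means, smoothing constant `alpha`, and learning rate are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
# Two separable 2-D classes (stand-ins for acoustic-event feature vectors)
X = np.vstack([rng.normal([-2, 0], 0.5, size=(100, 2)),
               rng.normal([+2, 0], 0.5, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

W = rng.normal(0, 0.1, size=(2, 2))   # one linear discriminant per class
b = np.zeros(2)
alpha, lr = 2.0, 0.05                 # smoothing constant and step size (assumed)
idx = np.arange(len(y))

def smoothed_error(W, b):
    g = X @ W.T + b                       # discriminant scores g_k(x)
    d = g[idx, 1 - y] - g[idx, y]         # misclassification measure: rival minus true
    return sigmoid(alpha * d).mean()      # smoothed 0-1 error count

before = smoothed_error(W, b)
for _ in range(200):                      # probabilistic-descent updates
    g = X @ W.T + b
    d = g[idx, 1 - y] - g[idx, y]
    s = sigmoid(alpha * d)
    coef = alpha * s * (1 - s)            # derivative of the sigmoid loss w.r.t. d
    for k in range(2):
        sign = np.where(y == k, -1.0, 1.0)   # d moves down for the true class, up for the rival
        W[k] -= lr * (coef * sign @ X) / len(y)
        b[k] -= lr * (coef * sign).mean()
after = smoothed_error(W, b)
```

As training proceeds, the smoothed error count `after` drops well below its initial value, mirroring how MDE training directly minimizes an estimate of the detection error rather than a surrogate likelihood.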

ic981533.pdf (From Postscript)


Fast Subspace Tracking and Neural Network Learning by a Novel Information Criterion

Authors:

Yongfeng Miao, University of Melbourne (Australia)
Yingbo Hua, University of Melbourne (Australia)

Volume 2, Page 1197, Paper number 1548

Abstract:

We introduce a novel information criterion (NIC) for searching for the optimum weights of a two-layer linear neural network (NN). The NIC exhibits a single global maximum, attained if and only if the weights span the (desired) principal subspace of a covariance matrix; all other stationary points of the NIC are (unstable) saddle points. We develop an adaptive algorithm based on the NIC for estimating and tracking the principal subspace of a vector sequence. The NIC algorithm provides fast on-line learning of the optimum weights for the two-layer linear NN and has several key advantages, such as faster convergence, which are illustrated through analysis and simulation.
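One natural reading of such a criterion leads to the gradient-ascent update ΔW ∝ RW(WᵀRW)⁻¹ − W, whose fixed points span the principal subspace. A batch sketch on a synthetic covariance (the matrix, step size, and iteration count are illustrative assumptions, and the paper's algorithm is the on-line version):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic covariance with a well-separated 2-D principal subspace in R^5
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))
R = Q @ np.diag([10.0, 8.0, 1.0, 0.5, 0.1]) @ Q.T

n, r = 5, 2
W = rng.normal(0, 0.1, size=(n, r))
eta = 0.05
for _ in range(500):
    M = W.T @ R @ W
    W += eta * (R @ W @ np.linalg.inv(M) - W)   # gradient ascent on the criterion

# Compare the learned span with the true principal subspace of R
U = np.linalg.eigh(R)[1][:, -r:]       # eigh returns ascending order; take top-2
P_w = W @ np.linalg.pinv(W)            # projector onto span(W)
P_u = U @ U.T
err = np.linalg.norm(P_w - P_u)
```

The projector mismatch `err` shrinks to numerical noise, consistent with the claim that the criterion's only attractor is the principal subspace.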

ic981548.pdf (From Postscript)


Adaptive Regularization of Neural Networks Using Conjugate Gradient

Authors:

Cyril Goutte, Technical University of Denmark (Denmark)
Jan Larsen, Technical University of Denmark (Denmark)

Volume 2, Page 1201, Paper number 1933

Abstract:

Recently we suggested a regularization scheme which iteratively adapts regularization parameters by minimizing validation error using simple gradient descent. In this contribution we present an improved algorithm based on the conjugate gradient technique. Numerical experiments with feed-forward neural networks successfully demonstrate improved generalization ability and lower computational cost.
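The underlying scheme can be sketched with a single ridge parameter and plain gradient steps (the paper treats many regularization parameters and replaces these steps with conjugate gradient); the data sizes, splits, and learning rate below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 60, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.5 * rng.normal(size=n)
Xt, yt = X[:40], y[:40]          # training split
Xv, yv = X[40:], y[40:]          # validation split

def fit(lam):
    # ridge solution w(lambda) on the training split
    A = Xt.T @ Xt + lam * np.eye(d)
    return np.linalg.solve(A, Xt.T @ yt), A

def val_error(w):
    r = Xv @ w - yv
    return 0.5 * r @ r / len(yv)

lam, lr = 1.0, 0.5
for _ in range(100):
    w, A = fit(lam)
    dw_dlam = -np.linalg.solve(A, w)                 # d w(lambda) / d lambda
    grad = (Xv @ w - yv) @ Xv @ dw_dlam / len(yv)    # chain rule on validation MSE
    lam = max(lam - lr * grad, 1e-6)                 # keep lambda positive
```

Each outer step refits the weights, differentiates the validation error through the fitted solution, and moves the regularization parameter downhill; a conjugate-gradient update would reuse these same gradients but choose better search directions.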

ic981933.pdf (From Postscript)


Design of Robust Neural Network Classifiers

Authors:

Jan Larsen, Technical University of Denmark (Denmark)
Lars Nonboe Andersen, Technical University of Denmark (Denmark)
Mads Hintz-Madsen, Technical University of Denmark (Denmark)
Lars Kai Hansen, Technical University of Denmark (Denmark)

Volume 2, Page 1205, Paper number 1946

Abstract:

This paper addresses a new framework for designing robust neural network classifiers. The network is optimized using the maximum a posteriori technique, i.e., the cost function is the sum of the log-likelihood and a regularization term (prior). In order to perform robust classification, we present a modified likelihood function which incorporates the potential risk of outliers in the data. This leads to the introduction of a new parameter, the outlier probability. Designing the neural classifier involves optimization of network weights as well as the outlier probability and regularization parameters. We suggest adapting the outlier probability and regularization parameters by minimizing the error on a validation set, and a simple gradient descent scheme is derived. In addition, the framework allows for constructing a simple outlier detector. Experiments with artificial data demonstrate the potential of the suggested framework.
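One way to realize such a modified likelihood is to mix the network's class posterior with a uniform label distribution, weighted by the outlier probability ε, so that a single mislabeled point cannot dominate the cost. A sketch with a two-sample toy batch (the probabilities and ε value are illustrative assumptions):

```python
import numpy as np

def robust_nll(probs, labels, eps):
    """Negative log-likelihood with outlier probability eps: with
    probability eps a sample is an outlier whose label carries no
    class information (uniform over the C classes)."""
    C = probs.shape[1]
    p_true = probs[np.arange(len(labels)), labels]
    return -np.log((1 - eps) * p_true + eps / C).mean()

probs = np.array([[0.98, 0.02],    # confident and correctly labelled
                  [0.01, 0.99]])   # confident, but labelled class 0 below: an outlier
labels = np.array([0, 0])

standard = robust_nll(probs, labels, eps=0.0)   # ordinary cross-entropy
robust = robust_nll(probs, labels, eps=0.05)
```

Because the per-sample loss is bounded above by -log(ε/C), the outlier's contribution is capped, and comparing the mixture term (1-ε)·p with ε/C gives a simple outlier detector.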

ic981946.pdf (From Postscript)


Extraction of Independent Components from Hybrid Mixture: KuicNet Learning Algorithm and Applications

Authors:

Sun-Yuan Kung, Princeton University (U.S.A.)
Cristina Mejuto, University of La Coruna (Spain)

Volume 2, Page 1209, Paper number 2114

Abstract:

A hybrid mixture is a mixture of supergaussian, gaussian, and subgaussian independent components (ICs). This paper addresses the extraction of ICs from a hybrid mixture. There are two approaches (single-output vs. all-outputs) to the design of contrast functions. We advocate the former approach due to its (1) simple and closed-form analysis, and (2) numerical convergence and computational savings. Via this approach, the positive kurtosis (resp. negative kurtosis) can be proved to be a valid contrast function for extracting supergaussian (resp. subgaussian) ICs from any nontrivial hybrid mixture. We also develop a network algorithm, the Kurtosis-based Independent Component Network (KuicNet), for recursively extracting ICs. Numerical and convergence properties are analyzed and several application examples are demonstrated.
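In the single-output spirit, the supergaussian IC of a hybrid mixture can be pulled out by gradient ascent on the kurtosis of one projection of whitened data. The sketch below uses a generic kurtosis-gradient update and a toy 2x2 mixture as an illustration, not KuicNet's own recursion:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 20000
s_super = rng.laplace(size=N)                          # supergaussian (positive kurtosis)
s_sub = rng.uniform(-np.sqrt(3), np.sqrt(3), size=N)   # subgaussian (negative kurtosis)
S = np.vstack([s_super, s_sub])
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
X = A @ S                                              # hybrid mixture at two sensors

# Whiten the mixture so that E[z z^T] = I
X = X - X.mean(axis=1, keepdims=True)
d_, E = np.linalg.eigh(np.cov(X))
Z = E @ np.diag(d_ ** -0.5) @ E.T @ X

# Gradient ascent on the kurtosis of y = w^T z (single-output contrast)
w = rng.normal(size=2)
w /= np.linalg.norm(w)
for _ in range(300):
    y_ = w @ Z
    grad = (Z * y_ ** 3).mean(axis=1) - 3.0 * w   # kurtosis gradient for whitened, unit-norm w
    w = w + 0.1 * grad
    w /= np.linalg.norm(w)

y_hat = w @ Z
kurt = (y_hat ** 4).mean() - 3.0 * (y_hat ** 2).mean() ** 2
corr = abs(np.corrcoef(y_hat, s_super)[0, 1])
```

Maximizing kurtosis recovers the supergaussian source; minimizing it (descending the same gradient) would instead extract the subgaussian one, matching the paired contrast claim in the abstract.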

ic982114.pdf (From Postscript)


Why Natural Gradient?

Authors:

Shun-ichi Amari, RIKEN Brain Science Institute (Japan)
Scott C. Douglas, University of Utah (U.S.A.)

Volume 2, Page 1213, Paper number 2134

Abstract:

Gradient adaptation is a useful technique for adjusting a set of parameters to minimize a cost function. While often easy to implement, the convergence speed of gradient adaptation can be slow when the slope of the cost function varies widely for small changes in the parameters. In this paper, we outline an alternative technique, termed natural gradient adaptation, that overcomes the poor convergence properties of gradient adaptation in many cases. The natural gradient is based on differential geometry and employs knowledge of the Riemannian structure of the parameter space to adjust the gradient search direction. Unlike Newton's method, natural gradient adaptation does not assume a locally quadratic cost function. Moreover, for maximum likelihood estimation tasks, natural gradient adaptation is asymptotically Fisher-efficient. A simple example illustrates the desirable properties of natural gradient adaptation.
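The contrast can be sketched on a toy maximum-likelihood task: fitting the mean and standard deviation of a Gaussian, where the Fisher information matrix is known in closed form, F = diag(1/σ², 2/σ²). The data and step sizes below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
data = rng.normal(5.0, 0.1, size=1000)   # small sigma: plain gradient on mu crawls

def grad_nll(mu, sigma):
    # gradient of the average negative log-likelihood of N(mu, sigma^2)
    r = data - mu
    g_mu = -r.mean() / sigma ** 2
    g_sigma = 1.0 / sigma - (r ** 2).mean() / sigma ** 3
    return np.array([g_mu, g_sigma])

def run(natural, steps=200, eta=0.01):
    mu, sigma = 0.0, 1.0
    for _ in range(steps):
        g = grad_nll(mu, sigma)
        if natural:
            # precondition by the inverse Fisher information diag(sigma^2, sigma^2 / 2)
            g = np.array([sigma ** 2, sigma ** 2 / 2]) * g
        mu -= eta * g[0]
        sigma = max(sigma - eta * g[1], 1e-3)   # keep sigma positive
    return mu, sigma

mu_ng, s_ng = run(natural=True)
mu_pg, s_pg = run(natural=False)
```

With the Fisher preconditioning, the update on the mean becomes step-size-invariant to σ and converges at a uniform rate, while the plain gradient slows down drastically as the slope of the cost varies with σ.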

ic982134.pdf (From Postscript)


Neural Network Inversion of Snow Parameters by Fusion of Snow Hydrology Prediction and SSM/I Microwave Satellite Measurements

Authors:

Yuankai Wang, University of Washington (U.S.A.)
Jenq-Neng Hwang, University of Washington (U.S.A.)
Chi-Te Chen, University of Washington (U.S.A.)
Leung Tsang, University of Washington (U.S.A.)
Bart Nijssen, University of Washington (U.S.A.)
Dennis P. Lettenmaier, University of Washington (U.S.A.)

Volume 2, Page 1217, Paper number 2246

Abstract:

Inverse remote sensing problems are generally ill-posed. In this paper, we propose an approach which integrates the dense media radiative transfer (DMRT) model, a snow hydrology model, neural networks, and SSM/I microwave measurements to infer snow depth. Four multilayer perceptrons (MLPs) were trained using data from the DMRT model. Provided with an initial guess from the snow hydrology prediction, the neural networks effectively invert the snow parameters from the SSM/I measurements. In addition, a prediction neural network is used to achieve adaptive learning rates and a good initial estimate of snow depth for inversion. Results show that our algorithm can effectively and accurately retrieve snow parameters from these highly nonlinear and many-to-one mappings.

ic982246.pdf (From Postscript)


A Piecewise Linear Recurrent Neural Network Structure and its Dynamics

Authors:

Xiao Liu, GlobeSpan Technologies Inc. (U.S.A.)
Tulay Adali, University of Maryland, Baltimore (U.S.A.)
Levent Demirekler, University of Maryland, Baltimore (U.S.A.)

Volume 2, Page 1221, Paper number 5139

Abstract:

We present a piecewise linear recurrent neural network (PL-RNN) structure that combines the canonical piecewise linear function with the autoregressive moving average (ARMA) model, such that an augmented input space is partitioned into regions with an ARMA model used in each. The piecewise linear structure allows for easy implementation and, in training, allows the use of standard linear adaptive filtering techniques based on gradient optimization, together with a description of convergence regions for the step size. We study the dynamics of the PL-RNN and show that it defines a contractive mapping and is bounded-input bounded-output stable. We apply the PL-RNN to channel equalization and show that it closely approximates the performance of the traditional RNN with sigmoidal activation functions.
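A minimal sketch of such a structure, using a hard-limiting saturation as a stand-in for the canonical piecewise linear function and toy ARMA coefficients (all values are illustrative assumptions); the saturation immediately makes the output bounded for any bounded input:

```python
import numpy as np

def pl_activation(v):
    # One-dimensional piecewise linear saturation: identity on [-1, 1],
    # clipped outside (a piecewise linear stand-in for a sigmoid)
    return np.clip(v, -1.0, 1.0)

def pl_rnn(x, a, b):
    """ARMA-style recurrence passed through the piecewise linear map:
    y[n] = f( sum_i a[i] * y[n-1-i] + sum_j b[j] * x[n-j] )."""
    p, q = len(a), len(b)
    y = np.zeros(len(x))
    for n in range(len(x)):
        ar = sum(a[i] * y[n - 1 - i] for i in range(p) if n - 1 - i >= 0)
        ma = sum(b[j] * x[n - j] for j in range(q) if n - j >= 0)
        y[n] = pl_activation(ar + ma)
    return y

rng = np.random.default_rng(5)
x = rng.uniform(-10.0, 10.0, size=500)   # bounded (if large) input
a = [0.5, -0.2]                          # sum of |a_i| < 1: contractive AR part
b = [0.3, 0.1]
y = pl_rnn(x, a, b)
```

Since the activation clips every output to [-1, 1], bounded-input bounded-output stability holds by construction, and keeping the AR coefficients inside the unit-sum region makes the linear part of the map contractive as well.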

ic985139.pdf (Scanned)
