Georges Linares, University of Avignon (France)
Pascal Nocera, University of Avignon (France)
Henri Meloni, University of Avignon (France)
This paper describes a new neural architecture for unsupervised classification of mixed transient signals. The method is based on neural techniques for blind source separation and on subspace methods. The feed-forward neural network dynamically builds and refreshes a classification of acoustic events by detecting novelties and by creating and deleting classes. A self-organization process rotates the class prototypes in order to minimize the statistical dependence of the class activities. Simulated multi-dimensional signals and mixed acoustic signals recorded in a real noisy environment were used to test the model. The results on the classification and detection properties of the model are encouraging, despite poor modeling of structured sounds.
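A minimal sketch of the novelty-driven class management described above may help fix ideas; the similarity measure, threshold, and update rule below are illustrative assumptions, and the class-deletion and prototype-rotation steps that minimize class-activity dependence are not reproduced:

```python
import numpy as np

def classify_online(frames, threshold=0.8, lr=0.05):
    """Sketch: prototype classifier that creates classes on novelty."""
    prototypes = []                              # one prototype per class
    labels = []
    for x in frames:
        x = x / (np.linalg.norm(x) + 1e-12)      # work with unit vectors
        sims = [p @ x for p in prototypes]       # cosine similarities
        k = int(np.argmax(sims)) if sims else -1
        if k < 0 or sims[k] < threshold:
            prototypes.append(x.copy())          # novelty: create a class
            k = len(prototypes) - 1
        else:
            p = prototypes[k] + lr * (x - prototypes[k])   # refresh class
            prototypes[k] = p / np.linalg.norm(p)
        labels.append(k)
    return labels, prototypes
```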
John-Paul Hosom, OGI (U.S.A.)
Ronald Cole, OGI (U.S.A.)
In exploring new ways of looking at speech data, we have developed an alternative segmentation method for training a neural-network-based digit-recognition system. Whereas previous methods segment the data into monophones, biphones, or triphones and train on each sub-phone unit in several broad-category contexts, our new method uses modified diphones to train on the regions of greatest spectral change as well as the regions of greatest stability. Although we account for regions of spectral stability, we do not require their presence in our word models. Empirical evidence for the advantage of this new method is seen in the 13% reduction in word-level error achieved on a test set of the OGI Numbers corpus, compared to a baseline system that used context-independent monophones and context-dependent biphones and triphones.
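As a rough illustration of the unit inventory (the unit names below are hypothetical, not those of the paper), a word's phone sequence can be decomposed into transition units covering the regions of spectral change, with optional steady-state units for the stable regions:

```python
def word_units(phones):
    """Transition units 'a-b' plus optional steady-state units '(a)'."""
    units = []
    for a, b in zip(phones, phones[1:]):
        units.append(f"({a})")        # stable region: optional in the model
        units.append(f"{a}-{b}")      # region of greatest spectral change
    units.append(f"({phones[-1]})")
    return units

# e.g. the digit "nine", phones /n ay n/:
print(word_units(["n", "ay", "n"]))
# -> ['(n)', 'n-ay', '(ay)', 'ay-n', '(n)']
```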
Andreas Kosmala, Duisburg University (Germany)
Jörg Rottland, Duisburg University (Germany)
Gerhard Rigoll, Duisburg University (Germany)
This paper presents an extensive investigation of the use of trigraphs for on-line cursive handwriting recognition based on Hidden Markov Models (HMMs). Trigraphs are context-dependent HMMs representing a single written character in its left and right context, similar to triphones in speech recognition. Given the great success of triphones in continuous speech recognition ([1]-[3]), it has long been a challenging and open question whether the introduction of trigraphs could lead to substantially improved handwriting recognition systems. The results of this investigation are indeed extremely encouraging: the introduction of suitable trigraphs led to a 50% relative error reduction for a writer-dependent 1000-word handwriting recognition system, and to a 35% relative error reduction for the same system with an extended 30000-word vocabulary for cursive handwriting recognition.
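For illustration, a written word can be expanded into trigraph model names in the same way a phone string is expanded into triphones; the `l-c+r' notation below is borrowed from speech recognition conventions, not taken from the paper:

```python
def trigraphs(word, boundary="#"):
    """Expand a word into character-in-context (trigraph) model names."""
    chars = [boundary] + list(word) + [boundary]
    return [f"{l}-{c}+{r}"
            for l, c, r in zip(chars, chars[1:], chars[2:])]

print(trigraphs("hand"))
# -> ['#-h+a', 'h-a+n', 'a-n+d', 'n-d+#']
```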
Hans-Jürgen Winkler, MMK, Technical University of Munich (Germany)
Manfred Lang, MMK, Technical University of Munich (Germany)
This paper is concerned with the symbol segmentation and recognition task for on-line sampled handwritten mathematical expressions, the first processing stage of an overall system for understanding arithmetic formulas. Our system uses a statistical approach that tolerates ambiguities within the decision stages and resolves them either automatically, using additional knowledge acquired in the subsequent processing stages, or by interaction with the user. The recognition results obtained for different writers and expressions demonstrate the performance of our approach.
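The deferred-resolution idea can be sketched as follows; the scores, the context model, and the function names are purely illustrative:

```python
def recognize(stroke_group):
    # placeholder: return scored symbol alternatives instead of a decision
    return [("x", 0.48), ("X", 0.46), ("*", 0.06)]

def resolve(alternatives, context_score):
    # later stage: combine recognition scores with contextual knowledge
    return max(alternatives, key=lambda alt: alt[1] * context_score(alt[0]))

# toy context model: a lower-case letter is more plausible in this context
symbol, score = resolve(recognize(None),
                        lambda s: 0.9 if s.islower() else 0.5)
print(symbol)   # -> 'x'
```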
Bernhard Sick, University of Passau (Germany)
One of the most important tasks of automatic tool-monitoring systems for CNC lathes is the supervision of tool wear. Given the state of wear and the current working process (e.g. rough or finish turning), it is possible to exchange a tool (or only the insert) just in time, which offers significant economic advantages. This paper presents a new method to estimate two wear parameters by means of artificial neural networks (multilayer perceptrons or time-delay neural networks). The inputs of the networks are process-specific parameters (such as the feed rate or the depth of cut) and characteristic coefficients extracted from signals measured with a multi-sensor system in the tool holder.
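A minimal sketch of the estimation setup, assuming the scikit-learn library; the feature layout, network topology, and data below are illustrative placeholders rather than the paper's actual configuration:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# columns: process parameters (feed rate, depth of cut, ...) followed by
# characteristic coefficients extracted from the sensor signals
X = rng.normal(size=(200, 8))
y = rng.normal(size=(200, 2))     # the two wear parameters to be estimated

net = MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=0)
net.fit(X, y)
print(net.predict(X[:1]))         # wear estimate for one new cut
```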
Hans J.G.A. Dolfing, Philips Research (The Netherlands)
Reinhold Haeb-Umbach, Philips Research (Germany)
This paper addresses the problem of on-line, writer-independent, unconstrained handwriting recognition. Using Hidden Markov Models (HMMs), which are successfully employed in speech recognition tasks, we focus on representations that address scalability, recognition performance, and compactness. `Delayed' features are introduced which integrate more global, handwriting-specific knowledge into the HMM representation. These features lead to a larger error-rate reduction than the `delta' features known from speech recognition, while requiring fewer additional components. Scalability is addressed with a size-independent representation. Compactness is achieved with Linear Discriminant Analysis (LDA). The representations are discussed, and results are given for a mixed-style word recognition task with vocabularies of 200 words (up to 99% words correct) and 20,000 words (up to 88.8% words correct).
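The contrast can be sketched as follows; the `delta' computation is the standard one from speech recognition, while the `delayed' features here are simplified to a plain fixed-delay stacking, which is only a stand-in for the paper's definition:

```python
import numpy as np

def delta_features(F, d=1):
    """Standard delta: frame-wise difference F[t+d] - F[t-d], edges clamped."""
    Fp = np.pad(F, ((d, d), (0, 0)), mode="edge")
    return Fp[2 * d:] - Fp[:-2 * d]

def delayed_features(F, d=5):
    """Concatenate each frame with the frame d steps earlier (clamped)."""
    Fd = np.roll(F, d, axis=0)
    Fd[:d] = F[0]
    return np.hstack([F, Fd])

F = np.random.rand(100, 6)           # 100 frames of 6 base features
print(delta_features(F).shape)       # (100, 6)
print(delayed_features(F).shape)     # (100, 12)
```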
Yu Hen Hu, University of Wisconsin (U.S.A.)
Jong-Min Park, University of Wisconsin (U.S.A.)
Thomas Knoblock, University of Wisconsin (U.S.A.)
Novel methods that combine the outputs of multiple pattern classifiers to enhance overall classification performance are presented. Specific attention is given to combination rules that are independent of the input feature vectors. Potentials and pitfalls of this so-called stacked generalization method are discussed, and experiments using several machine learning databases are reported.
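A minimal stacked-generalization sketch, assuming scikit-learn: the level-1 combiner is trained only on the level-0 classifiers' cross-validated outputs, so the combination rule is independent of the input feature vectors; the base models and data set are placeholders, not those used in the paper:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
level0 = [KNeighborsClassifier(3), DecisionTreeClassifier(random_state=0)]

# Cross-validated level-0 outputs avoid training the combiner on
# predictions the base models made for their own training data.
Z = np.hstack([cross_val_predict(m, X, y, cv=5, method="predict_proba")
               for m in level0])

combiner = LogisticRegression(max_iter=1000).fit(Z, y)   # level-1 model
print("stacked accuracy: %.3f" % combiner.score(Z, y))
```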
Lane M.D. Owsley, University of Washington. Dept. of EE (U.S.A.)
Les E. Atlas, University of Washington. Dept. of EE (U.S.A.)
Lane M.D. Knoblock, Boeing Commercial (U.S.A.)
Gary D. Bernard, Boeing Commercial (U.S.A.)
Our research on on-line monitoring of industrial milling tools has focused on the occurrence of certain wide-band transient events. Time-frequency representations of these events appear to reveal a variety of classes of transients, and a time structure to these classes that would be well modeled using hidden Markov models. However, the identities of these classes are not known, and obtaining a labeled training set based on a priori information is not possible, for reasons both theoretical and practical. Existing unsupervised clustering algorithms are only appropriate for single-vector patterns. We introduce an approach to unsupervised clustering of vector series based on the hidden Markov model. This system is justified as a generalization of a common single-vector approach and is applied to a set of vector patterns from a milling data set. The results presented illustrate the value of this approach in the milling application.
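The clustering loop can be sketched as a k-means-style reassignment among HMMs, assuming the hmmlearn package; this is a generic illustration of the idea, not the authors' implementation:

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

def hmm_cluster(seqs, n_clusters=3, n_states=4, n_rounds=5, seed=0):
    """Assign each vector sequence to the HMM that scores it best."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_clusters, size=len(seqs))
    for _ in range(n_rounds):
        models = []
        for k in range(n_clusters):
            members = [s for s, l in zip(seqs, labels) if l == k]
            if not members:
                continue            # a cluster that empties out is dropped
            m = GaussianHMM(n_components=n_states, n_iter=20)
            m.fit(np.vstack(members), lengths=[len(s) for s in members])
            models.append((k, m))
        labels = np.array([max(models, key=lambda km: km[1].score(s))[0]
                           for s in seqs])      # likelihood-based reassignment
    return labels
```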
Stanislaw Osowski, Warsaw University of Technology (Poland)
Andrzej Majkowski, Warsaw University of Technology (Poland)
Andrzej Cichocki, FRP RIKEN (Japan)
The paper presents a principal component analysis (PCA) approach to the reduction of noise contaminating data. The PCA performs the role of lossy compression and decompression: it provides a means of coding the data and then recovering it with some loss, dependent on the realized compression ratio. In this process, part of the information contained in the data is discarded. When the loss tolerance is matched to the noise strength, the discarded information consists mainly of the noise, and the decompressed signal is largely deprived of it. This method of noise filtering has been verified on examples of one-dimensional and two-dimensional data, and the results of the numerical experiments are included in the paper.
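A minimal sketch of the compress/decompress filtering: project the data onto the top-k principal components and reconstruct, so that the discarded residual is mostly noise when k matches the signal's effective rank (the data and the choice of k below are illustrative):

```python
import numpy as np

def pca_denoise(X, k):
    """Lossy PCA compression/decompression: keep the top-k components."""
    mu = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu + (U[:, :k] * s[:k]) @ Vt[:k]

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 200)
clean = np.outer(np.sin(2 * np.pi * 5 * t), np.ones(10))   # rank-1 signal
noisy = clean + 0.3 * rng.normal(size=clean.shape)
denoised = pca_denoise(noisy, k=1)
# the reconstruction error w.r.t. the clean signal typically drops:
print(np.linalg.norm(denoised - clean), np.linalg.norm(noisy - clean))
```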
Jose C. Principe, University of Florida (U.S.A.)
Dongxin Xu, University of Florida (U.S.A.)
Chuan Wang, University of Florida (U.S.A.)
On-line learning rules for both Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) with the Fisher criterion are analyzed under the same framework, and a generalized Oja's rule covering both is derived. For the LDA problem, the relationship between the Fisher criterion and the criterion of minimizing the Mean Square Error (MSE) is discussed. Experiments show that the convergence of the generalized Oja's rule, as an adaptive method for the Fisher criterion, is much faster than that of the gradient descent method for the MSE criterion.
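For reference, the classic single-unit Oja rule for PCA is sketched below; the paper's generalized rule, which also covers the Fisher/LDA case, is not reproduced:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 3)) @ np.diag([3.0, 1.0, 0.3])  # anisotropic data
w = rng.normal(size=3)
eta = 0.001
for x in X:
    y = w @ x
    w += eta * y * (x - y * w)   # Hebbian term with implicit normalization

print(w / np.linalg.norm(w))     # approaches the dominant eigenvector
```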
Clifford Sze-Tsan Choy, The Hong Kong Polytechnic University (Hong Kong)
Wan-Chi Siu, The Hong Kong Polytechnic University (Hong Kong)
In this paper, we propose the Distortion Sensitive Competitive Learning (DSCL) algorithm for codebook design in image vector quantization. The algorithm is based on the equidistortion principle for asymptotically optimal vector quantizers, due to Gersho (1979) and, more recently, Ueda and Nakano (1994). The DSCL is simple and efficient in that a single weight-vector update is performed per training vector, and its processing speed in sequential or multiprocessor environments can be further improved by applying a modified partial distance elimination (MPDE) method. Simulations indicate that the DSCL outperforms some recently proposed neural algorithms, including the ``Neural-Gas'' of Martinetz et al. (1993) and the DEFCL of Butler and Jiang (1996). Combined with the MPDE, the DSCL is up to 45 times faster than the ``Neural-Gas'' on a sequential machine, yet arrives at better codebooks within the same number of iterations.
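The two ingredients can be sketched generically: a competitive-learning codebook update with one weight-vector update per training vector, and partial distance elimination in the nearest-codeword search; the distortion-sensitive weighting that distinguishes the actual DSCL is not reproduced here:

```python
import numpy as np

def nearest_pde(x, codebook):
    """Nearest codeword with partial distance elimination (PDE)."""
    best, best_d = 0, np.inf
    for i, c in enumerate(codebook):
        d = 0.0
        for xj, cj in zip(x, c):       # accumulate dimension by dimension
            d += (xj - cj) ** 2
            if d >= best_d:            # abandon this codeword early
                break
        else:
            best, best_d = i, d
    return best

def train_codebook(vectors, codebook, lr=0.05, epochs=5):
    for _ in range(epochs):
        for x in vectors:
            i = nearest_pde(x, codebook)
            codebook[i] += lr * (x - codebook[i])   # one update per vector
    return codebook

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 16))            # e.g. 4x4 image blocks
cb = train_codebook(data, data[:32].copy())   # 32-codeword codebook
```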
Mehmet Ertugrul Çelebi, ITU EE Dept., Istanbul (Turkey)
Cüneyt Güzeliş, ITU EE Dept., Istanbul (Turkey)
In this paper, a 3-D Cellular Neural Network (CNN) is applied to the restoration of degraded images. It is known that regularized or Maximum a Posteriori estimation based image restoration problems can be formulated as the minimization of the Lyapunov function of the discrete-time Hopfield network. Recently, this Lyapunov-function-based design method has been extended to the continuous-time Hopfield network and to the continuous-time CNN operating either in a binary or in a real-valued steady-state output mode. This paper considers the 3-D CNN in the binary mode, which needs only eight binary (nonredundant) neurons for each image pixel, thus reducing the computational overhead, and introduces a hardware annealing approach to overcome the bad-local-minima problem caused by the binary mode of operation and the nonredundant representation.
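The nonredundant binary representation amounts to encoding each 8-bit gray value with eight binary neurons, one per bit:

```python
def pixel_to_neurons(v):
    return [(v >> k) & 1 for k in range(8)]          # bits b_0 .. b_7

def neurons_to_pixel(bits):
    return sum(b << k for k, b in enumerate(bits))   # v = sum_k b_k * 2**k

v = 173
assert neurons_to_pixel(pixel_to_neurons(v)) == v    # lossless 8-neuron code
```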
Pascal Fleury, EPFL (Switzerland)
Olivier Egger, EPFL (Switzerland)
Current developments in digital image coding tend to involve more and more complex algorithms, and therefore require an increasing amount of computation. To improve overall system performance, some schemes apply different coding algorithms to separate parts of an image according to the content of each subimage. Such schemes are referred to as dynamic coding schemes. Applying the best-suited coding algorithm to each part of an image leads to improved coding quality, but implies an algorithm selection phase. Current selection methods require coding and decoding the image with all of the candidate algorithms and computing the reconstructed images in order to choose the best method. Other schemes prune the search in the algorithm space. Both approaches suffer from a heavy computational load, and the complexity increases even further if the parameters of a given algorithm have to be adjusted during the search. This paper describes a way to predict the coding quality of a region of the input image for any given coding method. The system can then select the best-suited coding algorithm for each region according to the predicted quality. This prediction scheme has low complexity and also enables the adjustment of algorithm-specific parameters during the search.
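The selection idea can be sketched as follows, assuming scikit-learn; the region features, the candidate coder names, and the training data are hypothetical placeholders, since the paper's actual predictor is not specified here:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def region_features(region):
    """Cheap content descriptors of an image region."""
    gy, gx = np.gradient(region.astype(float))
    return [region.mean(), region.std(),
            np.abs(gy).mean(), np.abs(gx).mean()]

# Hypothetical training data: features of sample regions and the quality
# (e.g. PSNR) each coder achieved on them, measured once offline.
rng = np.random.default_rng(0)
F = rng.normal(size=(50, 4))
psnr = {"dct": rng.normal(30, 2, 50), "subband": rng.normal(31, 2, 50)}
predictors = {name: LinearRegression().fit(F, q) for name, q in psnr.items()}

def select_coder(region):
    """Pick the coder with the highest predicted quality for this region."""
    f = np.asarray(region_features(region)).reshape(1, -1)
    return max(predictors, key=lambda n: predictors[n].predict(f)[0])

print(select_coder(rng.integers(0, 256, size=(16, 16))))
```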