Chair: Tulay Adali, University of Maryland, USA
Chanin Nilubol, Georgia Institute of Technology (U.S.A.)
Quoc H. Pham, Georgia Institute of Technology (U.S.A.)
Russell M. Mersereau, Georgia Institute of Technology (U.S.A.)
Mark J. T. Smith, Georgia Institute of Technology (U.S.A.)
Mark A. Clements, Georgia Institute of Technology (U.S.A.)
This paper discusses the application of Hidden Markov Models (HMMs) to the Translational and Rotational Invariant Automatic Target Recognition (TRIATR) problem associated with SAR imagery. The approach is based on a cascade of three stages: preprocessing, feature extraction and selection, and classification. Preprocessing and feature extraction and selection involve successive applications of extraction operations to measurements of the Radon transform of target chips. The features, which are invariant to changes in rotation, position, and shifts, although not to changes in scale, are optimized through the use of feature selection techniques. The classification stage successively takes as its inputs the multidimensional multiple observation sequences and parameterizes them statistically using continuous density models to capture target and background appearance variability, resulting in the TRIATR-HMMs. Experimental results demonstrate a recognition rate as high as 99% over both the training set and the testing set.
Zhongkang Lu, The Hong Kong Polytechnic University (Hong Kong)
Zheru Chi, The Hong Kong Polytechnic University (Hong Kong)
Pengfei Shi, Shanghai Jiaotong University (China)
Most algorithms for segmenting connected handwritten digit strings are based on the analysis of the foreground pixel distributions and the features on the upper/lower contours of the image. In this paper, a new approach is presented to segment connected handwritten two-digit strings based on the thinning of background regions. The algorithm first locates several feature points on the background skeleton of the digit image. Possible segmentation paths are then constructed by matching these feature points. With geometric property measures, these segmentation paths are ranked using fuzzy rules generated from a decision-tree approach. Finally, the top-ranked segmentation paths are tested one by one by an optimized nearest neighbour classifier until one of these candidates is accepted based on an acceptance criterion. Experimental results on NIST Special Database 3 show that our approach can achieve a correct classification rate of 92.4% with only 4.7% of digit strings rejected, which compares favorably with the other techniques tested.
I-Jong Lin, Princeton University (U.S.A.)
Sun-Yuan Kung, Princeton University (U.S.A.)
This paper introduces a learning algorithm for a neural structure, Directed Acyclic Graphs (DAGs), that is structurally based, i.e., reduction and manipulation of internal structure are directly linked to learning. This paper extends the concepts of DAG template matching to a neural structure with capabilities for generalization. DAG-Learning is derived from concepts in Finite State Transducers, Hidden Markov Models, and Dynamic Time Warping to form an algorithmic framework within which many adaptive signal techniques such as Vector Quantization, K-Means, Approximation Networks, etc., may be extended to temporal recognition. The paper provides a concept of path-based learning to allow comparison among Hidden Markov Models (HMMs), Finite State Transducers (FSTs) and DAG-Learning. The paper also outlines the DAG-Learning process and provides results from the DAG-Learning algorithm over a test set of isolated cursive handwriting characters.
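As background for one of the concepts this abstract names, the following is a minimal sketch of classic Dynamic Time Warping between two one-dimensional sequences. It is not the DAG-Learning algorithm itself, which the abstract does not specify; the function name `dtw_distance` and the absolute-difference local cost are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Accumulated cost of the best monotonic alignment of sequences a and b.

    Illustrative textbook DTW only; the local cost |a_i - b_j| is an assumption.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated sample can be absorbed by the warping path, which is the kind of temporal elasticity that template-matching approaches to handwriting rely on.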
J.G.A. Dolfing, Philips GmbH Forschungslaboratorien (Germany)
This paper addresses the problem of on-line, writer-independent, unconstrained handwriting recognition. Based on hidden Markov models (HMM), we focus on the construction and use of word models which are robust towards contextual character shape variations and variations due to ligatures and diacriticals with the objective of an improved word error rate. We compare the performance and complexity of contextual hidden Markov models with a `pause' model for ligatures. While the common contextual models lead to a word error rate reduction of 12.7%-38% at the cost of almost six times more character models, the pause model improves the word error rate by 15%-25% and adds only a single model to the recognition system. The results for a mixed-style word recognition task on two test sets with vocabularies of 200 (up to 98% correct words) and 20,000 words (up to 88.6% correct words) are given.
Adriana Dumitras, University of British Columbia, Vancouver (Canada)
Faouzi Kossentini, University of British Columbia, Vancouver (Canada)
In this paper, we present a video chrominance subsampling method using feedforward neural networks. Experimental results show that our method outperforms spatial subsampling obtained via lowpass filtering and decimation both objectively and subjectively. Other advantages of our algorithm are computational efficiency and low memory requirements. Moreover, no pre- or post-processing is required by our method.
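For context, the conventional baseline mentioned in this abstract (lowpass filtering followed by decimation) can be sketched as below. This is not the authors' neural network method. The 2x2 box filter, the factor-of-two decimation in each direction, and the function name `subsample_chroma_2x2` are all assumptions chosen to keep the illustration small.

```python
def subsample_chroma_2x2(plane):
    """Halve a chroma plane in both directions.

    Each 2x2 block is averaged (a simple box lowpass filter, assumed here)
    and replaced by a single sample (decimation by two).
    `plane` is a list of equal-length rows of numeric samples.
    """
    height, width = len(plane), len(plane[0])
    if height % 2 or width % 2:
        raise ValueError("plane dimensions must be even")
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append(total / 4.0)
        out.append(row)
    return out
```

A practical encoder would typically use a longer lowpass kernel than this box average; the sketch only shows the filter-then-decimate structure against which the abstract reports its comparison.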
Haruo Kobayashi, Gunma University (Japan)
Takashi Matsumoto, Waseda University (Japan)
There are two dynamics issues in vision chips: (i) the temporal dynamics issue due to the parasitic capacitors in a CMOS chip, and (ii) the spatial dynamics issue due to the regular array of processing elements in a chip. These issues have already been discussed for the resistor network with only associated parasitic capacitances. In this paper, however, we also consider parasitic inductances as well as parasitic capacitances for a more precise network dynamics model. We show that in some cases the temporal stability condition for the network with parasitic inductances and capacitances is equivalent to that for the network with only parasitic capacitances, but in general they are not equivalent. We also show that the spatial stability conditions are equivalent in both cases.
Joan-Maria Mas Ribés, U.C.L.-Laboratoire de Télécommunications (Belgium)
Benoît Simon, Global One sa (Belgium)
Benoît Macq, U.C.L.-Laboratoire de Télécommunications (Belgium)
Iterated Transformation Theory (ITT) coding, also known as Fractal Coding, in its original form allows fast decoding but suffers from long encoding times. During the encoding step, a large number of best-matching block searches have to be performed, which makes the process computationally expensive. We present in this paper a new method that significantly reduces the computational load of ITT-based image coding. Both domain and range blocks of the image are transformed into the frequency domain. Domain blocks are then used to train a two-dimensional Kohonen Neural Network (KNN), forming a code book similar to that of Vector Quantization coding. The property of KNN (and Self-Organizing Feature Maps in general) of preserving the input space topology allows a neighborhood search to be performed to find the piecewise transformation between domain and range blocks.
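To illustrate why the encoding step is expensive, here is a minimal sketch of the exhaustive search that conventional fractal coding performs for each range block: every candidate domain block is fitted with a least-squares contrast and brightness, and the one with the smallest error is kept. This is the brute-force baseline, not the frequency-domain Kohonen search proposed in the abstract. The names `fit_affine` and `best_domain`, the flattened-list block representation, and the assumption that domain blocks are already downsampled to the range block size are all hypothetical.

```python
def fit_affine(domain, rng):
    """Least-squares scale s and offset o so that s*domain + o approximates rng.

    Both blocks are flattened lists of equal length (an assumption here).
    Returns (s, o, squared_error).
    """
    n = len(domain)
    sum_d, sum_r = sum(domain), sum(rng)
    sum_dd = sum(d * d for d in domain)
    sum_dr = sum(d * r for d, r in zip(domain, rng))
    denom = n * sum_dd - sum_d * sum_d
    # A flat domain block carries no contrast, so only the offset is fitted.
    s = 0.0 if denom == 0 else (n * sum_dr - sum_d * sum_r) / denom
    o = (sum_r - s * sum_d) / n
    err = sum((s * d + o - r) ** 2 for d, r in zip(domain, rng))
    return s, o, err


def best_domain(rng, domains):
    """Exhaustively search all domain blocks; return (index, s, o, error)."""
    best = None
    for idx, dom in enumerate(domains):
        s, o, err = fit_affine(dom, rng)
        if best is None or err < best[3]:
            best = (idx, s, o, err)
    return best
```

Because `best_domain` visits every domain block for every range block, the cost grows with the product of the two block counts, which is the search that a topology-preserving code book is meant to shorten.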
A. S. Y. Wong, City University of Hong Kong (Hong Kong)
K. W. Wong, City University of Hong Kong (Hong Kong)
C. S. Leung, University of Wollongong (Australia)
To combine principal and minor components analysis, a parallel extraction method based on the recursive least squares algorithm is proposed to extract the principal components of the input vectors. After the extraction, the error covariance matrix obtained in the learning process is used to perform minor components analysis. The minor components found are then pruned so as to achieve a higher compression ratio. Simulation results show that both the convergence speed and the compression ratio are improved, which in turn indicates that our method effectively combines the extraction of the principal components and the pruning of the minor components.
Vijay P Mani, University of Wisconsin-Madison (U.S.A.)
Yu Hen Hu, University of Wisconsin-Madison (U.S.A.)
Surekha Palreddy, University of Wisconsin-Madison (U.S.A.)
In this paper, we investigate a modular architecture for ECG beat classification. The feature space is divided into distinct regions and individual classifiers are developed for each region. We compare different combination strategies and feature space partition strategies. We also describe a novel batch modular learning method that can be used to incrementally improve the performance of the modular network.
Hyo-Kyung Sung, Kyungpook National University (Korea)
Heung-Moon Choi, Kyungpook National University (Korea)
An efficient nonlinear restoration of spatially varying blurred images with noise is presented using a self-organizing neural network (SONN). The proposed method can effectively restore the blurred images by using the region classification and learning property of SONN adapted for the blur sensitivity of the receptive field. Receptive fields are adaptively overlapped to eliminate the block effect within the restored images. The proposed method eliminates the need to calculate the gradient, gradient step size, or Hessian of the error surface, which affect the performance of the least squares method or of constrained optimization. Simulation results for the space-variant blurred pepper image show a performance improvement of about 4.86 dB or 3.57 dB, as compared to that of the Richardson-Lucy algorithm or that of conventional neural networks, respectively.