Image Sequence Analysis

Prediction and Search Techniques for RD-Optimized Motion Estimation in a Very Low Bit Rate Video Coding Framework

Authors:

Yuen-Wen Lee, University of British Columbia (Canada)
Faouzi Kossentini, University of British Columbia (Canada)
Rabab Ward, University of British Columbia (Canada)
Mark J.T. Smith, Georgia Institute of Technology (U.S.A.)

Volume 4, Page 2861

Abstract:

Prediction and search techniques are introduced for efficient rate-distortion optimized motion estimation in a very low bit rate video coding framework. For prediction, three types of predictors are considered: mean, weighted mean, and median. Prediction allows us to constrain the motion vector search to a small diamond-shaped area whose center is the predicted motion vector. The size of the search area is further constrained by employing a probabilistic model. We evaluate two models, both of which permit the contraction or the expansion of the search area as a function of the local statistics of the motion flow. The proposed techniques are analyzed in the context of a very low bit rate DCT-based video coding framework, where a rate-distortion criterion is used for motion estimation as well as for 8 x 8 block coding mode selection. A particular resulting very low bit rate video coder is shown experimentally to outperform the H.263 TMN5 simulation model in terms of encoding speed and compression performance, simultaneously.
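
The prediction and search steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The choice of neighbours (left, above, above-right), the L1-ball definition of the diamond, and the names `median_predictor`, `diamond_candidates`, `rd_search`, `distortion`, `rate`, and `lam` are all assumptions made for illustration. The probabilistic sizing of the search area is not modelled; the radius is simply passed in.

```python
from statistics import median


def median_predictor(left, above, above_right):
    """Component-wise median of three neighbouring motion vectors (x, y).

    The neighbour set is an assumption; the abstract only names the
    predictor types (mean, weighted mean, median).
    """
    xs, ys = zip(left, above, above_right)
    return (median(xs), median(ys))


def diamond_candidates(center, radius):
    """All integer vectors within L1 distance `radius` of `center`."""
    cx, cy = center
    return [
        (cx + dx, cy + dy)
        for dx in range(-radius, radius + 1)
        for dy in range(-(radius - abs(dx)), radius - abs(dx) + 1)
    ]


def rd_search(center, radius, distortion, rate, lam):
    """Return the candidate minimising the Lagrangian cost J = D + lam * R.

    `distortion(mv)` and `rate(mv, center)` are caller-supplied
    (hypothetical) callables returning numbers.
    """
    return min(
        diamond_candidates(center, radius),
        key=lambda mv: distortion(mv) + lam * rate(mv, center),
    )
```

A diamond of radius r contains 2r² + 2r + 1 candidates, so shrinking the radius reduces the number of cost evaluations quadratically, which is where the encoding speed-up of a constrained search would come from.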

ic972861.pdf

Network Driven Motion Estimation For Portable Video Terminals

Authors:

Wendi B. Rabiner, MIT (U.S.A.)
Anantha P. Chandrakasan, MIT (U.S.A.)

Volume 4, Page 2865

Abstract:

Motion estimation has been shown to help significantly in the compression of video sequences. However, since most motion estimation algorithms require a large amount of computation, it is undesirable to use them in power constrained applications, such as battery operated wireless video terminals. This paper presents an approach to reducing the power dissipation of wireless video terminals in a networked environment by exploiting the predictability of object motion. Since the location of an object in the current frame can be predicted from its location in previous frames, it is possible to optimally partition the motion estimation computation between battery operated portable devices and high powered compute servers on the wired network. This can achieve a reduction in the number of operations performed at the encoder for motion estimation by over two orders of magnitude while introducing minimal degradation to the decoded video compared with full search encoder-based motion estimation.

ic972865.pdf

An Optical Flow Based Motion Compensation Algorithm for Very Low Bit-Rate Video Coding

Authors:

Shu Lin, New Jersey Institute of Technology, NJ (U.S.A.)
Yun Q. Shi, New Jersey Institute of Technology, NJ (U.S.A.)
Ya-Qin Zhang, David Sarnoff, NJ (U.S.A.)

Volume 4, Page 2869

Abstract:

In this paper, we propose an efficient compression algorithm for very low bit-rate video applications. The algorithm is based on (1) optical-flow motion estimation to achieve more accurate motion prediction fields; (2) DCT coding of the motion vectors from the optical-flow estimation to further reduce the motion overhead; and (3) a region-adaptive thresholding technique to match the optical-flow motion prediction and minimize the residual errors. Unlike the classic block-matching-based discrete cosine transform (DCT) video coding schemes in MPEG-1/2 and H.261/3, the proposed algorithm uses optical flow for motion compensation, and the DCT is applied to the optical flow field instead of the prediction errors. Thresholding techniques are used to treat different regions so as to complement the optical flow technique and to code residual data efficiently. While maintaining peak signal-to-noise ratio (PSNR) and computational complexity comparable to those of ITU-T H.263/TMN5, the reconstructed video frames of the proposed coder are free of annoying blocking artifacts and hence visually much more pleasant.
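
Several abstracts in this session compare coders by PSNR. For reference, the sketch below computes it using the standard definition, 10·log10(peak² / MSE). The function name `psnr`, the flat-sequence frame representation, and the default 8-bit peak of 255 are assumptions for illustration and are not taken from any of the papers.

```python
import math


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equal-length sequences of pixel values."""
    if len(original) != len(decoded) or len(original) == 0:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)
```

Because PSNR depends only on the mean squared error, two coders can reach similar PSNR while differing visibly in artifact structure such as blocking.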

ic972869.pdf

Multi-Resolution Motion Estimation

Authors:

Gregory J. Conklin, Cornell University, Ithaca, NY (U.S.A.)
Sheila S. Hemami, Cornell University, Ithaca, NY (U.S.A.)

Volume 4, Page 2873

Abstract:

Spatial multi-resolution video sequences provide video at multiple frame sizes, allowing extraction of only the resolution or bit rate required by the user. This paper proposes fine-to-coarse motion estimation (ME) for multi-resolution video coding. While coarse-to-fine ME, used in previously proposed coding schemes, can provide a better estimate at the coarsest resolution, it is outperformed by fine-to-coarse ME at finer resolutions due to the inability of coarse-to-fine ME to accurately track motion at finer resolutions. At the finest resolution, fine-to-coarse ME provides a PSNR improvement of up to 1 dB, for the sequences tested, and better visual quality at all resolutions. In addition, fine-to-coarse ME provides more accurate and thus more compressible motion estimates.

ic972873.pdf

Harbour Image Sequences Analysis for Control and Monitoring

Authors:

Paolo Gamba, University of Pavia (Italy)
Alessandro Mecocci, University of Siena (Italy)

Volume 4, Page 2877

Abstract:

A system devoted to ship traffic control in a harbour environment is proposed, based on an optic flow approach and monocular image sequences. In each frame the scene is segmented, moving and still objects are found, and the movement of each ship is completely tracked. Partial and/or total occlusions are correctly handled by means of suitable grouping algorithms. Quantitative motion estimation is obtained by an iterative procedure that extracts precise 3D information from the monocular sequence. The implemented version of the system is able to monitor and control a harbour environment with low computational effort.

ic972877.pdf

Motion Compensated Video Compression Using Adaptive Transformations

Authors:

Zafer Diab, GRPR (Canada)
Paul Cohen, GRPR (Canada)

Volume 4, Page 2881

Abstract:

Block-based motion compensation fails to maintain an acceptable level of prediction error, which makes the transmission of this error impossible for very low bit-rate coding owing to the small bit allocation. The reason is that the motion model assumed in block-based techniques cannot approximate real-world motion precisely. To develop an effective motion compensation method for very low bit-rate video coding, we address the issue of adopting a more sophisticated motion model than the block-based one. The motion model discussed here is based on the representation of optical flow in its principal components domain. The performance of motion compensation based on this model is compared with MPEG using the PSNR measure and qualitative experiments. Both of these criteria show a compression gain on the order of 30 percent.

ic972881.pdf

An Efficient Implementation of Affine Transformation Using One-Dimensional FFT's

Authors:

Erwin Pang, University of Toronto (Canada)
Dimitrios Hatzinakos, University of Toronto (Canada)

Volume 4, Page 2885

Abstract:

In this paper, we propose a new decomposition scheme and an efficient interpolation algorithm for affine transformation of a digital image. We try to reconstruct the affine-transformed image by resampling it with the highest possible quality, lowest complexity and throughput rate. Based on the proposed decomposition, the transform is completed by a sequence of 3-pass translations and a scaling operation where each of them is one-dimensional in nature. This method preserves quality and guarantees simplicity. We place the emphasis on the feasibility of a parallel implementation that can benefit from pipeline technologies. Further, an efficient FFT-based implementation of this new algorithm is suggested. Experimental evidence of the effectiveness and robustness of the proposed method is reported. The problem is relevant to video transmission, image registration, and computer graphics manipulation.

ic972885.pdf

Periodic Pan Compensation For Reduced Complexity Video Compression

Authors:

Charles D. Creusere, NAWC (U.S.A.)

Volume 4, Page 2889

Abstract:

To reduce the complexity of a video encoder, we introduce a new approach to hybrid DPCM-transform video compression in which pan compensation is performed outside the feedback loop. While the basic idea is conceptually similar to the pan compensation algorithm proposed by Taubman and Zakhor for their 3D subband coder, our method is different in that it continually tracks and updates the image in the feedback loop in the same way as a conventional hybrid coder. Using both residual energy and reconstruction error as metrics, we show that pan compensation implemented outside the feedback loop compares very favorably to similar compensation implemented within the conventional hybrid-transform framework. Furthermore, if the spatial coder used to compress the residual images outputs an embedded bit stream, then the complete system is spatially scalable.

ic972889.pdf

Feature Extraction Methods for Consistent Spatio-Temporal Image Sequence Classification Using Hidden Markov Models

Authors:

Peter Morguet, TUM (Germany)
Manfred Lang, TUM (Germany)

Volume 4, Page 2893

Abstract:

In this paper, a general and efficient approach for representing and classifying image sequences by Hidden Markov Models (HMMs) is presented. A consistent modeling of spatial and temporal information is achieved by extracting different low-level image features. These implicitly convert the image intensities into probability density values, while preserving the geometry of the image. The resulting so-called image density functions are contained in the states of the HMM. First results of applying the approach to the classification of dynamic hand gestures demonstrate the performance of the modeling.

ic972893.pdf

Enhancement of Video Data Using Motion-Compensated Postprocessing Techniques

Authors:

Péter Csillag, KFKI-MSZKI (Hungary)
Lilla Böröczky, IBM (U.S.A.)

Volume 4, Page 2897

Abstract:

In many video coding schemes, especially at low bitrates, spatial and temporal subsampling of the image sequences is considered. This is realized by leaving out rows and columns from the images, and skipping whole frames at the transmitter. To get the best possible quality image sequence at the receiver side, the skipped portion of the video should be reconstructed using advanced motion-compensated (MC) postprocessing techniques. Our paper mainly focuses on the restoration / generation of unknown frames of the sequence at time instances, where the original scene has not been sampled, or which were skipped from the original sequence in the transmitter. This enhancement of the temporal resolution is performed using our advanced MC interpolation algorithm, utilizing an accelerated motion model and motion-based segmentation with proper handling of covered and uncovered areas. The algorithm can be used to avoid jerkiness and blurring of the restored image sequences.

ic972897.pdf

New Improved Feature Extraction Methods for Real-Time High Performance Image Sequence Recognition

Authors:

Gerhard Rigoll, Duisburg University (Germany)
Andreas Kosmala, Duisburg University (Germany)

Volume 4, Page 2901

Abstract:

This paper describes new feature extraction methods which can be used very effectively in combination with statistical methods for image sequence recognition. Although these feature extraction methods can be used for a wide variety of image sequence processing applications, the target application presented in this paper is gesture recognition. The novel feature extraction methods have been integrated into an HMM-based gesture recognition system and led to substantial improvements for this system. The new features not only describe the gesture characteristics much better than the old features, but also lead to a dramatic reduction in the dimensionality of the feature vector used for representing each frame of the image sequence. This made it possible to use the novel features in combination with a new architecture for statistical image sequence recognition. The result of this investigation is a high performance gesture recognition system with significantly improved recognition rates and real-time capabilities.

ic972901.pdf

Adaptive stripe based patch matching for depth estimation

Authors:

Fatih Murat Porikli, Polytechnic University (U.S.A.)
Yao Wang, Polytechnic University (U.S.A.)
Cassandra Swain, Polytechnic University (U.S.A.)

Volume 4, Page 2905

Abstract:

In this contribution, a novel stereo matching technique for depth estimation in stereoscopic image pairs is presented. The input image pair is preprocessed in the intensity domain to obtain edge maps together with an adaptive mesh whose individual elements approximate linearly modeled regions. Then, an iterative, stripe-based quadrilateral patch matching technique is employed to estimate the depth map from the image pair in a hierarchical manner. Finally, the resultant map is postprocessed to smooth the depth map at the patch borders. The quality of the test results demonstrates the effectiveness of the technique.

ic972905.pdf

Further developments on `headcam': joint estimation of camera rotation+gain group of transformations for wearable bi-foveated cameras

Authors:

Steve Mann, MIT (U.S.A.)

Volume 4, Page 2909

Abstract:

An eyeglass-mounted camera system with wearable multimedia computer (`WearCam') was recently proposed. In particular, `WearCam' contains two miniature cameras: one (wide-angle in landscape orientation) provides the overall contextual information from the wearer's perspective, while the other (telephoto, in portrait orientation) provides close-up details, such as faces. This `bi-foveated' scheme was found to work well within the context of a recently proposed model of image motion characterized by a projective (homographic) coordinate transformation together with a gain transformation. Applications of `WearCam' include personal safety (crime prevention and personal documentary), perceptual intelligence/situational awareness (in the context of personal wearable multimedia), and homographic modeling (wearable, tetherless computer-mediated reality). A pencigraphic image representation is presented where the photometric response function of the camera is determined to within a constant, and the registered images are assembled into a photometric environment map, yielding an estimate, to within a constant scale factor, of the number of photons of light coming from each angle, toward the wearer.

ic972909.pdf

Adaptive Constrained Least Squares Restoration for Removal of Blocking Artifacts in Low Bit Rate Video Coding

Authors:

Andre Kaup, Siemens AG (Germany)

Volume 4, Page 2913

Abstract:

For high compression ratios, current video coding standards produce noticeable blocking and ringing noise due to a rigid block structure and coarse quantization. We propose a new method for reducing these coding artifacts based on spatially adaptive constrained least squares restoration. The proposal is numerically simple and yields visually convincing results for intra as well as inter coded images. As a post-processing technique, it is compatible with all existing image and video coding standards.
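
For orientation, the textbook (non-adaptive) constrained least squares restoration problem can be written as below. This is the standard formulation only, not the spatially adaptive variant proposed in the paper. The symbols are assumptions for illustration: $y$ is the degraded image, $H$ the degradation operator, $C$ a high-pass regularization operator, and $\alpha$ the regularization parameter.

```latex
\hat{x} \;=\; \arg\min_{x}\; \lVert y - Hx \rVert^{2} + \alpha \lVert Cx \rVert^{2}
\qquad\Longrightarrow\qquad
\hat{x} \;=\; \left( H^{\mathsf{T}} H + \alpha\, C^{\mathsf{T}} C \right)^{-1} H^{\mathsf{T}} y
```

A spatially adaptive scheme would presumably let the regularization weight vary across the image, for example smoothing more strongly at block boundaries, but the abstract does not specify how.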