ICASSP99 Multimedia Analysis and Retrieval

Multimedia Analysis and Retrieval
Time-Series Active Search for
Quick Retrieval of Audio and Video
Authors: Kunio Kashino, Gavin Smith, Hiroshi Murase
Page (NA)
Paper number 1792
Abstract: This paper proposes a search method that can quickly detect and locate a known sound (video) in a long audio (video) stream. The method is based on active search. Active search reduces the number of candidate matches between reference and input signals by approximately 10 to 100 times compared to exhaustive search, while guaranteeing the same retrieval accuracy. We proposed a quick search method in our previous paper, and here we focus on improvement of the accuracy. Thus the feature used has been extended to the audio power spectrum and temporal division of the histogram windows has been introduced to incorporate time information. Tests carried out under practical circumstances clearly show the accuracy improvement. The proposed method is still so fast that it can correctly retrieve a 15-s commercial in a 6-h recording of TV broadcasting within 2 s, once the features are calculated.
IC991792.PDF (From Author) IC991792.PDF (Rasterized)

Content-Based Video Indexing of TV Broadcast News Using Hidden Markov Models
Authors: Stefan Eickeler, Stefan Müller
Page (NA)
Paper number 1683
Abstract: This paper presents a new approach to content-based video indexing using Hidden Markov Models (HMMs). In this approach one feature vector is calculated for each image of the video sequence. These feature vectors are modeled and classified using HMMs. This approach has many advantages compared to other video indexing approaches. The system has automatic learning capabilities. It is trained by presenting manually indexed video sequences. To improve the system we use a video model that allows the classification of complex video sequences. The presented approach works three times faster than real-time. We tested our system on TV broadcast news. The rate of 97.3% correctly classified frames shows the efficiency of our system.
IC991683.PDF (From Author) IC991683.PDF (Rasterized)

Hierarchical Classification of Audio Data for Archiving and Retrieving
Authors: Tong Zhang, C.-C. Jay Kuo
Page (NA)
Paper number 1600
Abstract: A hierarchical system for audio classification and retrieval based on audio content analysis is presented in this paper. The system consists of three stages. The first stage is called the coarse-level audio segmentation and classification, where audio recordings are segmented and classified into speech, music, several types of environmental sounds, and silence, based on morphological and statistical analysis of temporal curves of short-time features of audio signals. In the second stage, environmental sounds are further classified into finer classes such as applause, rain, birds' sound, etc. This fine-level classification is based on time-frequency analysis of audio signals and use of the hidden Markov model (HMM) for classification. In the third stage, the query-by-example audio retrieval is implemented where similar sounds can be found according to an input sample audio. It is shown that the proposed system has achieved an accuracy higher than 90% for coarse-level audio classification. Examples of audio fine classification and audio retrieval are also provided.
IC991600.PDF (From Author) IC991600.PDF (Rasterized)

A Fast Audio Classification from MPEG Coded Data
Authors: Yasuyuki Nakajima, Yang Lu, Masaru Sugano, Akio Yoneyama, Hiromasa Yanagihara, Akira Kurematsu
Page (NA)
Paper number 2299
Abstract: Audio information classification becomes a very important task for such purposes as automatic keyword spotting and other content-based audio-visual query systems. In this paper, we describe a fast and accurate audio data classification method on MPEG coded data domain. First, silent segments are detected using a robust approach for different recording conditions.
Then the non-silent segments are classified into three types, music, speech, and applause, using temporal density, bandwidth and center frequency of subband energy. To be as robust as possible across a variety of audio sources, we use a Bayes discriminant function for multivariate Gaussian distribution instead of manually adjusting a threshold for each discriminator. In the experiment, each one-second segment of MPEG audio data is classified and about 90% of audio and speech segments have been successfully detected. As for the detection speed, less than 20% of the processing power of MPEG audio decoding is required.
IC992299.PDF (From Author) IC992299.PDF (Rasterized)

Image Retrieval Based on Energy Histograms of The Low Frequency DCT Coefficients
Authors: Jose A Lay, Ling Guan
Page (NA)
Paper number 1221
Abstract: With the increasing popularity of the use of compressed images, an intuitive approach for lowering computational complexity towards a practically efficient image retrieval system is to propose a scheme that is able to perform retrieval computation directly in the compressed domain. In this paper, we investigate the use of energy histograms of the low frequency DCT coefficients as features for the retrieval of DCT compressed images. We propose a feature set that is able to identify similarities on changes of image-representation due to several lossless DCT transformations. We then use the features to construct an image retrieval system based on the real-time image retrieval model. We observe that the proposed features are sufficient for performing high level retrieval on medium size image databases. By introducing transpositional symmetry, the features can be brought to accommodate several lossless DCT transformations such as horizontal and vertical mirroring, rotating, transposing, and transversing.
IC991221.PDF (From Author) IC991221.PDF (Rasterized)

Texture Features for DCT-Coded Image Retrieval and Classification
Authors: Yu-Len Huang, Department of Computer Science and Information Engineering, National Chung Cheng University, Taiwan, R.O.C. (Taiwan); Ruey-Feng Chang, Department of Computer Science and Information Engineering, National Chung Cheng University, Taiwan, R.O.C. (Taiwan)
Page (NA)
Paper number 1241
Abstract: The multiresolution wavelet transform has been shown to be an effective technique and achieved very good performance for texture analysis. However, a large number of images are compressed by the methods based on discrete cosine transform (DCT). Hence, the image decompression of inverse DCT is needed to obtain the texture features based on the wavelet transform for the DCT-coded image. This paper proposes the use of the multiresolution reordered features for texture analysis. The proposed features are directly generated by using the DCT coefficients from the DCT-coded image. Comparisons with the subband-energy features extracted from the wavelet transform and the conventional DCT, using the Brodatz texture database, indicate that the proposed method provides the best texture pattern retrieval accuracy and obtains a much better correct classification rate. The proposed DCT based features are expected to be very useful and efficient for texture pattern retrieval and classification in large DCT-coded image databases. Detailed simulation results can be found on the web page: http://www.cs.ccu.edu.tw/~hyl/mrdct/.
IC991241.PDF (From Author) IC991241.PDF (Rasterized)

An Efficient Low-Dimensional Color Indexing Scheme for Region-Based Image Retrieval
Authors: Yining Deng, B. S. Manjunath
Page (NA)
Paper number 2075
Abstract: In this work, an efficient low-dimensional color indexing scheme for region-based image retrieval is presented.
The colors in each image region are first quantized so that only a small number of cluster centroids are needed to represent the region color information. The proposed color feature descriptor consists of these quantized colors and their percentages in the region. A similarity distance measure is defined and shown to be equivalent to the quadratic color histogram distance measure. The quantized colors are indexed in the 3-D color space so that high-dimensional indexing can be avoided. During the search process, each quantized color in the query is used as a separate cue to find matches containing that color. The matches from all the query colors are then joined to obtain the final retrievals. Experimental results show that the proposed scheme is fast and accurate compared to the color histogram approach.
IC992075.PDF (From Author) IC992075.PDF (Rasterized)

Vector-Wavelet Based Scalable Indexing And Retrieval Systems For Large Color Image Archives
Authors: Elif Albuz, Erturk D Kocalar, Ashfaq A Khokhar
Page (NA)
Paper number 2208
Abstract: This paper presents an efficient content based indexing and retrieval mechanism based on vector wavelet coefficients of color images. We use highly decorrelated wavelet coefficient planes to acquire a search efficient feature space. The feature space is subsequently indexed using properties of all the images in the database. Therefore the feature key of an image corresponds not only to the content of the image itself but also to how much the image differs from the other images stored in the database. The search time depends only on the number of images similar to the query image but not on the size of the entire database. The system is scalable and provides fast retrievals. We show that in a database of 1000 images, query search takes less than 50 msec on a 266 MHz Pentium processor, compared to several seconds of retrieval time in the earlier systems proposed in the literature.
IC992208.PDF (From Author) IC992208.PDF (Rasterized)