Multimedia Analysis and Retrieval

Time-Series Active Search for Quick Retrieval of Audio and Video

Authors:

Kunio Kashino
Gavin Smith
Hiroshi Murase

Page (NA) Paper number 1792

Abstract:

This paper proposes a search method that can quickly detect and locate a known sound (or video) in a long audio (or video) stream. The method is based on active search, which reduces the number of candidate matches between reference and input signals by a factor of approximately 10 to 100 compared to exhaustive search, while guaranteeing the same retrieval accuracy. We proposed a quick search method in our previous paper; here we focus on improving its accuracy. To that end, the feature used has been extended to the audio power spectrum, and temporal division of the histogram windows has been introduced to incorporate time information. Tests carried out under practical circumstances clearly show the accuracy improvement. The proposed method remains fast enough to correctly retrieve a 15-s commercial in a 6-h recording of TV broadcasting within 2 s, once the features are calculated.
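
The skipping idea behind active search can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the signal has already been reduced to one integer code per frame (for example by vector quantization), and the names `active_search`, `intersection`, and `min_overlap` are hypothetical. It relies on the fact that sliding a window by one frame can raise the histogram intersection count by at most one, so a window whose overlap falls short of the threshold by k frames rules out the next k - 1 positions.

```python
from collections import Counter


def intersection(ref_hist, win_hist):
    """Histogram intersection as a raw count of frames that can be paired up."""
    return sum(min(count, win_hist[code]) for code, count in ref_hist.items())


def active_search(stream, reference, min_overlap):
    """Return (hit positions, number of windows evaluated).

    stream, reference: sequences of integer frame codes (an assumption here).
    min_overlap: smallest intersection count that is accepted as a match.
    """
    d = len(reference)
    ref_hist = Counter(reference)
    hits, evaluated = [], 0
    t = 0
    while t + d <= len(stream):
        overlap = intersection(ref_hist, Counter(stream[t:t + d]))
        evaluated += 1
        if overlap >= min_overlap:
            hits.append(t)
            t += 1
        else:
            # The overlap grows by at most 1 per one-frame shift, so the next
            # (min_overlap - overlap - 1) positions cannot reach the threshold.
            t += min_overlap - overlap
    return hits, evaluated
```

Because the skip width comes from an upper bound on the similarity, the sketch returns the same hits as evaluating every window, while evaluating far fewer windows when most of the stream is dissimilar to the reference.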

IC991792.PDF (From Author) IC991792.PDF (Rasterized)


Content-Based Video Indexing of TV Broadcast News Using Hidden Markov Models

Authors:

Stefan Eickeler
Stefan Müller

Page (NA) Paper number 1683

Abstract:

This paper presents a new approach to content-based video indexing using Hidden Markov Models (HMMs). In this approach, one feature vector is calculated for each image of the video sequence. These feature vectors are modeled and classified using HMMs. This approach has many advantages compared to other video indexing approaches. The system has automatic learning capabilities: it is trained by presenting manually indexed video sequences. To improve the system, we use a video model that allows the classification of complex video sequences. The presented approach works three times faster than real time. We tested our system on TV broadcast news. A rate of 97.3% correctly classified frames demonstrates the effectiveness of our system.
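
Classifying a sequence with HMMs usually means scoring it under one model per class and picking the best score. The sketch below shows that step with the scaled forward algorithm. It is a simplification under stated assumptions: the observations are discrete symbols rather than the continuous per-image feature vectors the abstract describes, and the function names and the model layout `(start, trans, emit)` are hypothetical.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Log-probability of a non-empty symbol sequence under a discrete HMM.

    start[i]: initial probability of state i
    trans[i][j]: probability of moving from state i to state j
    emit[i][k]: probability that state i emits symbol k
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob


def classify(obs, models):
    """Pick the class whose HMM gives the sequence the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))
```

Here `models` maps a class label to its `(start, trans, emit)` tuple; in a trained system these parameters would be estimated from the manually indexed sequences.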

IC991683.PDF (From Author) IC991683.PDF (Rasterized)


Hierarchical Classification of Audio Data for Archiving and Retrieving

Authors:

Tong Zhang
C.-C. Jay Kuo

Page (NA) Paper number 1600

Abstract:

A hierarchical system for audio classification and retrieval based on audio content analysis is presented in this paper. The system consists of three stages. The first stage is coarse-level audio segmentation and classification, where audio recordings are segmented and classified into speech, music, several types of environmental sounds, and silence, based on morphological and statistical analysis of temporal curves of short-time features of audio signals. In the second stage, environmental sounds are further classified into finer classes such as applause, rain, and bird sounds. This fine-level classification is based on time-frequency analysis of audio signals and the use of the hidden Markov model (HMM) for classification. In the third stage, query-by-example audio retrieval is implemented, where similar sounds can be found from an input audio sample. It is shown that the proposed system achieves an accuracy higher than 90% for coarse-level audio classification. Examples of fine-level audio classification and audio retrieval are also provided.

IC991600.PDF (From Author) IC991600.PDF (Rasterized)


A Fast Audio Classification from MPEG Coded Data

Authors:

Yasuyuki Nakajima
Yang Lu
Masaru Sugano
Akio Yoneyama
Hiromasa Yanagihara
Akira Kurematsu

Page (NA) Paper number 2299

Abstract:

Audio information classification has become a very important task for such purposes as automatic keyword spotting and other content-based audio-visual query systems. In this paper, we describe a fast and accurate audio data classification method in the MPEG coded data domain. First, silent segments are detected using an approach that is robust to different recording conditions. Then the non-silent segments are classified into three types (music, speech, and applause) using the temporal density, bandwidth, and center frequency of subband energy. To be as robust as possible across a variety of audio sources, we use a Bayes discriminant function for a multivariate Gaussian distribution instead of manually adjusting a threshold for each discriminator. In the experiment, every one-second segment of MPEG audio data is classified, and about 90% of audio and speech segments have been successfully detected. As for the detection speed, less than 20% of the processing power needed for MPEG audio decoding is required.
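
The Bayes discriminant for Gaussian class models can be sketched as below. This is an illustration only: it assumes a diagonal covariance matrix per class to keep the code short (the abstract does not say which covariance structure was used), and the function names and the `(mean, var, prior)` layout are hypothetical. Each class score is the log of the Gaussian density times the prior, with the constant term common to all classes dropped.

```python
import math


def discriminant(x, mean, var, prior):
    """g(x) = -1/2 * sum((x - mu)^2 / var) - 1/2 * sum(log var) + log(prior).

    Diagonal-covariance form of the Gaussian Bayes discriminant (assumption).
    """
    quad = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    log_det = sum(math.log(vi) for vi in var)
    return -0.5 * quad - 0.5 * log_det + math.log(prior)


def classify(x, classes):
    """Return the label with the largest discriminant value.

    classes: dict mapping a label to a (mean, var, prior) tuple.
    """
    return max(classes, key=lambda name: discriminant(x, *classes[name]))
```

With a feature vector such as (temporal density, bandwidth, center frequency), the class parameters would be estimated from labeled training segments, which replaces hand-tuned thresholds with a single comparison of scores.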

IC992299.PDF (From Author) IC992299.PDF (Rasterized)


Image Retrieval Based on Energy Histograms of The Low Frequency DCT Coefficients

Authors:

Jose A Lay
Ling Guan

Page (NA) Paper number 1221

Abstract:

With the increasing popularity of compressed images, an intuitive way to lower computational complexity and obtain a practically efficient image retrieval system is to perform the retrieval computation directly in the compressed domain. In this paper, we investigate the use of energy histograms of the low frequency DCT coefficients as features for the retrieval of DCT compressed images. We propose a feature set that is able to identify similarities despite changes in image representation caused by several lossless DCT transformations. We then use the features to construct an image retrieval system based on the real-time image retrieval model. We observe that the proposed features are sufficient for performing high level retrieval on medium-size image databases. By introducing transpositional symmetry, the features can be made to accommodate several lossless DCT transformations such as horizontal and vertical mirroring, rotating, transposing, and transversing.
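
Why energy features with transpositional symmetry tolerate these transformations can be checked numerically. The sketch below is not the authors' feature set: the block size, the number of low-frequency coefficients kept, and the binning in `energy_histogram` are all assumptions, and every name is hypothetical. It uses two properties of the 2-D DCT: mirroring a block only flips the sign of some coefficients, and transposing a block transposes its coefficient matrix. Squaring removes the signs, and pooling positions (u, v) and (v, u) removes the effect of the transpose.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def c(k):
        return math.sqrt((1 if k == 0 else 2) / n)

    return [
        [
            c(u) * c(v) * sum(
                block[x][y]
                * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                for x in range(n) for y in range(n)
            )
            for v in range(n)
        ]
        for u in range(n)
    ]


def low_freq_energies(block, size=2):
    """Squared low-frequency coefficients, pooled over (u, v) and (v, u)."""
    coeffs = dct2(block)
    energies = {}
    for u in range(size):
        for v in range(size):
            key = (min(u, v), max(u, v))
            energies[key] = energies.get(key, 0.0) + coeffs[u][v] ** 2
    return energies


def energy_histogram(blocks, bin_width, n_bins):
    """Histogram of total low-frequency energy per block (assumed binning)."""
    hist = [0] * n_bins
    for block in blocks:
        total = sum(low_freq_energies(block).values())
        hist[min(int(total // bin_width), n_bins - 1)] += 1
    return hist
```

Since a 90-degree rotation is a transpose followed by a mirror, the pooled energies are unchanged by rotation as well, up to floating-point rounding.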

IC991221.PDF (From Author) IC991221.PDF (Rasterized)


Texture Features for DCT-Coded Image Retrieval and Classification

Authors:

Yu-Len Huang, Department of Computer Science and Information Engineering, National Chung Cheng University, Taiwan, R.O.C. (Taiwan)
Ruey-Feng Chang, Department of Computer Science and Information Engineering, National Chung Cheng University, Taiwan, R.O.C. (Taiwan)

Page (NA) Paper number 1241

Abstract:

The multiresolution wavelet transform has been shown to be an effective technique that achieves very good performance for texture analysis. However, a large number of images are compressed by methods based on the discrete cosine transform (DCT). Hence, decompression by the inverse DCT is needed before wavelet-based texture features can be obtained from a DCT-coded image. This paper proposes the use of multiresolution reordered features for texture analysis. The proposed features are generated directly from the DCT coefficients of the DCT-coded image. Comparisons with subband-energy features extracted from the wavelet transform and from the conventional DCT, using the Brodatz texture database, indicate that the proposed method provides the best texture pattern retrieval accuracy and a much better correct classification rate. The proposed DCT-based features are expected to be very useful and efficient for texture pattern retrieval and classification in large DCT-coded image databases. Detailed simulation results can be found on the web page: http://www.cs.ccu.edu.tw/~hyl/mrdct/.

IC991241.PDF (From Author) IC991241.PDF (Rasterized)


An Efficient Low-Dimensional Color Indexing Scheme for Region-Based Image Retrieval

Authors:

Yining Deng
B. S. Manjunath

Page (NA) Paper number 2075

Abstract:

In this work, an efficient low-dimensional color indexing scheme for region-based image retrieval is presented. The colors in each image region are first quantized so that only a small number of cluster centroids are needed to represent the region color information. The proposed color feature descriptor consists of these quantized colors and their percentages in the region. A similarity distance measure is defined and shown to be equivalent to the quadratic color histogram distance measure. The quantized colors are indexed in the 3-D color space so that high-dimensional indexing can be avoided. During the search process, each quantized color in the query is used as a separate cue to find matches containing that color. The matches from all the query colors are then joined to obtain the final retrievals. Experimental results show that the proposed scheme is fast and accurate compared to the color histogram approach.
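
A quadratic-form distance between two such descriptors can be sketched as follows. This is an illustration under assumptions, not the paper's definition: each descriptor is taken to be a list of `(color, percentage)` pairs, the color similarity is assumed to fall off linearly with Euclidean distance and reach zero at a hypothetical cutoff `d_max`, and the function names are invented here. Expanding the quadratic histogram distance over only the colors that are present gives three sums over the descriptor entries.

```python
import math


def similarity(c1, c2, d_max):
    """Assumed color similarity: 1 for identical colors, 0 beyond d_max."""
    return max(0.0, 1.0 - math.dist(c1, c2) / d_max)


def descriptor_distance(query, target, d_max=50.0):
    """Squared quadratic-form distance between two region descriptors.

    query, target: lists of ((c1, c2, c3), percentage) pairs, with the
    percentages of each descriptor summing to 1.
    """

    def cross(a, b):
        return sum(
            pa * pb * similarity(ca, cb, d_max)
            for ca, pa in a
            for cb, pb in b
        )

    return cross(query, query) + cross(target, target) - 2.0 * cross(query, target)
```

Because each region keeps only a few quantized colors, the sums run over a handful of terms instead of every bin of a full color histogram.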

IC992075.PDF (From Author) IC992075.PDF (Rasterized)


Vector-Wavelet Based Scalable Indexing And Retrieval Systems For Large Color Image Archives

Authors:

Elif Albuz
Erturk D Kocalar
Ashfaq A Khokhar

Page (NA) Paper number 2208

Abstract:

This paper presents an efficient content-based indexing and retrieval mechanism based on vector wavelet coefficients of color images. We use highly decorrelated wavelet coefficient planes to acquire a search-efficient feature space. The feature space is subsequently indexed using properties of all the images in the database. Therefore, the feature key of an image corresponds not only to the content of the image itself but also to how much the image differs from the other images stored in the database. The search time depends only on the number of images similar to the query image, not on the size of the entire database. The system is scalable and provides fast retrievals. We show that in a database of 1000 images, a query search takes less than 50 msec on a 266 MHz Pentium processor, compared to several seconds of retrieval time in earlier systems proposed in the literature.

IC992208.PDF (From Author) IC992208.PDF (Rasterized)
