ICASSP '98 Main Page


Abstract -  SP2   


 
SP2.1

   
An Error Correction Approach Based on the MAP Algorithm Combined with Hidden Markov Models
T. Yonezaki, K. Yoshida, T. Yagi  (Matsushita Communication Ind. Co. Ltd., Japan)
An error correction approach based on hidden Markov models (HMMs) is proposed. The occurrence probability of a code sequence, as given by the HMMs, is used as the measure for the maximum a posteriori probability (MAP) algorithm. The MAP algorithm assumes that the source is a discrete-time finite-state Markov process, and an HMM, which models a Markov source, is well suited to speech data. This combination should therefore be useful for a speech coding system. The proposed approach is applied to the code sequence that quantizes line spectrum frequency (LSF) parameters. When the code sequence is sent over a binary symmetric channel (BSC), the proposed approach with 16-state HMMs improves the code error rate and the degradation in cepstrum distortion by about 27% and 39%, respectively, for 3% random errors.
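As a rough illustration of how a source model can correct channel errors, the sketch below runs a Viterbi-style MAP sequence search over a BSC. It is a simplification and not the authors' method: a first-order Markov chain over code indices stands in for the HMMs, and the 2-bit indices, the transition matrix `trans`, the uniform `prior`, and the `STAY` probability are all hypothetical values chosen for the example. Only the 3% bit error rate comes from the abstract.

```python
import math

def hamming(a, b):
    """Number of differing bits between two integer code indices."""
    return bin(a ^ b).count("1")

def map_decode(received, trans, prior, n_bits, eps):
    """Most probable transmitted index sequence given the received indices.

    Assumes a first-order Markov source (trans, prior) and a binary
    symmetric channel that flips each bit independently with probability eps.
    """
    n_sym = len(prior)

    def log_lik(y, x):
        d = hamming(y, x)
        return d * math.log(eps) + (n_bits - d) * math.log(1.0 - eps)

    score = [math.log(prior[x]) + log_lik(received[0], x) for x in range(n_sym)]
    back = []
    for y in received[1:]:
        new_score, ptr = [], []
        for x in range(n_sym):
            best_prev = max(range(n_sym),
                            key=lambda p: score[p] + math.log(trans[p][x]))
            new_score.append(score[best_prev]
                             + math.log(trans[best_prev][x]) + log_lik(y, x))
            ptr.append(best_prev)
        score = new_score
        back.append(ptr)

    # Trace back from the best final state.
    x = max(range(n_sym), key=lambda s: score[s])
    path = [x]
    for ptr in reversed(back):
        x = ptr[x]
        path.append(x)
    return path[::-1]

# Hypothetical toy source: 2-bit indices that tend to repeat from frame to frame.
N_BITS = 2
N_SYM = 1 << N_BITS
STAY = 0.85
trans = [[STAY if i == j else (1.0 - STAY) / (N_SYM - 1) for j in range(N_SYM)]
         for i in range(N_SYM)]
prior = [1.0 / N_SYM] * N_SYM

received = [1, 1, 3, 1, 1]  # the third index has one flipped bit
decoded = map_decode(received, trans, prior, N_BITS, eps=0.03)
print(decoded)
```

In this toy case the isolated bit error is outweighed by the source model's preference for repeated indices, so the decoder restores the run of identical indices.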
 
SP2.2

   
Quantization of the Spectral Envelope for Sinusoidal Coders
T. Eriksson, H. Kang, Y. Stylianou  (AT&T Labs - Research, USA)
In an effort to efficiently code the spectral envelope of speech signals for wideband speech coding based on sinusoidal models, a robust computation of discrete cepstrum coefficients and their quantization is investigated. A parameterization of the spectral envelope has been proposed which is based on discrete cepstral coefficients using regularization techniques. This paper presents an efficient quantization scheme for these coefficients in order to use them in applications like speech coding. We present results which show a 35% reduction in bitrate when compared to simple scalar quantization. To verify the efficiency of the proposed quantization schemes, informal listening tests were performed in the context of a sinusoidal coder.
 
SP2.3

   
Robust Speech Mode Based LSF Vector Quantization for Low Bit Rate Coders
S. Nandkumar, K. Swaminathan, U. Bhaskar  (Hughes Network Systems, USA)
Robust vector quantization of LSF parameters at a low bit rate is essential for voice coders operating below 5 Kbps. A novel aspect of the proposed technique is the use of decorrelated residual LSF vectors from speech mode based backward prediction along with a multi-stage VQ design. Rates as low as 12 bits per 20 ms speech frame for the stationary voiced speech mode and 22 bits/frame for unvoiced and non-stationary voiced frames are shown to result in efficient quantization. In our classification scheme, spectrally stationary voiced frames constitute around 30% of active speech frames resulting in a minimum average bit rate of 19 bits/frame. Objective VQ performance is compared with cellular standard coders such as IS-641 and IS-127. The proposed VQ has been integrated into a speech mode based 4.8 Kbps coder resulting in subjective performance close to that of the 7.4 Kbps IS-641 coder.
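The 19 bits/frame average can be checked from the quoted figures. The short sketch below assumes that exactly 30% of active frames use the 12-bit mode and the remaining 70% use the 22-bit mode; the function and variable names are illustrative and do not come from the paper.

```python
def average_bits_per_frame(p_stationary, bits_stationary, bits_other):
    """Expected LSF bits per frame for a two-mode allocation."""
    return p_stationary * bits_stationary + (1.0 - p_stationary) * bits_other

# Assumed shares: 30% stationary voiced frames (12 bits), 70% other frames (22 bits).
avg_bits = average_bits_per_frame(0.30, 12, 22)

# With 20 ms frames, this is the bit rate spent on LSF quantization alone.
lsf_rate_bps = avg_bits / 0.020

print(round(avg_bits, 1), round(lsf_rate_bps))
```

Under these assumptions the LSF parameters take about 950 bits per second of the 4.8 Kbps budget.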
 
SP2.4

   
A New General Distance Measure For Quantization of LSF and Their Transformed Coefficients
H. Vu, L. Lois  (Technical University of Budapest, Hungary)
In this paper, we develop a new general distance measure that not only can be used in vector quantization (VQ) of the line spectrum frequency (LSF) parameters but also performs well in the LSF transformed domain. The new distance is based on the spectral sensitivity of the LSFs and their transformed coefficients. In addition, a fixed scaling factor is used to decrease the sensitivity to spectral error at higher frequencies. Experimental results show that the proposed distance measure leads to VQ performance as good as or better than that of other methods in the field of LSF coding. The use of this distance as the weighting function of the LSFs' transformed parameters is also suggested.
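The abstract does not give the form of the distance, so the following is only a generic sketch of a weighted squared-error distance inside a full-search VQ. The weights, the 0.5 scaling applied to the highest-frequency component, the 4-dimensional vectors, and the two-entry codebook are hypothetical and are not the sensitivity-based values from the paper.

```python
def weighted_distance(x, y, weights):
    """Weighted squared Euclidean distance between two LSF vectors."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))

def nearest_codeword(x, codebook, weights):
    """Index of the codeword that minimizes the weighted distance (full search)."""
    return min(range(len(codebook)),
               key=lambda i: weighted_distance(x, codebook[i], weights))

# Hypothetical LSF vector and codebook, in Hz.
x = [300.0, 800.0, 1500.0, 2600.0]
codebook = [
    [300.0, 800.0, 1500.0, 2750.0],  # error only in the highest LSF
    [400.0, 900.0, 1500.0, 2600.0],  # errors in the two lowest LSFs
]

uniform = [1.0, 1.0, 1.0, 1.0]
scaled = [1.0, 1.0, 1.0, 0.5]  # assumed fixed scaling of the top component

print(nearest_codeword(x, codebook, uniform))
print(nearest_codeword(x, codebook, scaled))
```

With uniform weights the second codeword is selected, while down-weighting the highest LSF shifts the choice to the first codeword, which is the kind of effect a frequency-dependent scaling factor is meant to produce.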
 
SP2.5

   
Variable Model Order LPC Quantization
P. Ojala, A. Lakaniemi  (Nokia Research Center, Finland)
This paper presents a new method to apply variable bit-rate predictive quantization of the variable model order LPC parameters. In addition, the method is employed to interpolate the parameters within the analysis frame. The LPC model order selection algorithm of this work is based on the characteristics of the input signal and on the performance of the LPC model. Hence, the variable bit-rate LPC quantization is source controlled. The number of quantized parameters needs to be identical in successive frames to be able to apply the predictive quantization and to interpolate parameters inside the frame. Therefore, the order of the LPC model of the previous frame needs to be expanded or reduced to be the same as the current frame LPC model. The advantage of variable model order LPC quantization is the lowered average bit-rate compared to fixed rate while the speech quality remains the same.
 
SP2.6

   
Optimal Transform for Segmented Parametric Speech Coding
D. Mudugamuwa, A. Bradley  (Royal Melbourne Institute of Technology, Australia)
In voice coding applications where there is no constraint on the encoding delay, such as store and forward message systems or voice storage, segment coding techniques can be used to achieve a reduction in data rate without compromising the level of distortion. For low data rate linear predictive coding schemes, increasing the encoding delay allows one to exploit any long term temporal stationarities on an interframe basis, thus reducing the transmission bandwidth or storage needs of the speech signal. Transform coding has previously been applied in low data rate speech coding to exploit both the interframe and the intraframe correlation. This paper investigates the potential for optimising the transform for segmented parametric representation of speech.
 
SP2.7

   
Two Novel Lossless Algorithms to Exploit Index Redundancy in VQ Speech Compression
S. Sridharan  (Queensland University of Technology, Australia);   J. Leis  (University of Southern Queensland, Australia)
We address the problem of speech compression at very low rates, with the short-term spectrum compressed to less than 20 bits per frame. Current techniques apply structured vector quantization (VQ) to the short-term synthesis filter coefficients to achieve rates of the order of 24 to 26 bits per frame. In this paper we show that temporal correlations in the VQ index stream can be introduced by dynamic codebook ordering, and that these correlations can be exploited by lossless coding approaches to reduce the number of bits per frame of the VQ scheme. The use of lossless coding ensures that no additional distortion is introduced, unlike other interframe techniques. We then detail two constructive algorithms which are able to exploit this redundancy. The first method is a delayed-decision approach, which dynamically adapts the VQ codebook to allow for efficient entropy coding of the index stream. The second is based on a vector sub-codebook approach, and does not incur any additional delay. Experimental results are presented for both methods to validate the approach.
 
SP2.8

   
Multi Codebook Vector Quantization of LPC Parameters
C. Xydeas, T. Chapman  (University of Manchester, UK)
This paper presents a novel and efficient variable bit rate LPC quantization approach. The proposed MCVQ framework allows a Dynamic Programming based minimum quantization distortion partitioning and quantization process to be performed on input LSP vector tracks in time. Variable duration segments of LSP vector tracks are classified into one of a finite number of language related events. Specific codebooks, designed optimally for each event type, are then employed to vector quantize the individual LSP vectors of a given segment.
 
SP2.9

   
Personal Speech Coding
W. Jia, W. Chan  (Illinois Institute of Technology, USA)
In existing speech coding systems, all quantizer codebooks are designed to suit the statistical and perceptual characteristics of speech signals of a population of speakers. However, an individual's speech signal does not exhibit, even over a long time, the entire range of characteristics of the population. With the advent of personal communication systems, personal information might become available and be used to improve the rate-distortion performance of speech coders. In this paper we assess the potential gain of personal speech coding by designing codebooks for individual speakers. Spectral quantization, excitation and pitch lag codebooks of existing CELP coders are redesigned. The gains appear to be modest, suggesting that we need to use a different coding framework. Amongst the components, the spectral quantizer seems to be most amenable to personalization.
 
SP2.10

   
A Genetic Approach to the Design of General-Tree-Structured Vector Quantizers for Speech Coding
L. Tseng, S. Yang  (National Chung Hsing University, Taiwan, ROC)
Full-search vector quantization suffers from the time spent searching the whole codebook sequentially. Recently, several tree-structured vector quantizers have been proposed. However, almost all of the trees used are binary trees, and hence the training samples contained in each node are forced to be divided into two clusters artificially. We present a general-tree-structured vector quantizer that is based on a genetic clustering algorithm. This genetic clustering algorithm can divide the training samples contained in each node into more natural clusters. A distortion threshold is used to guarantee the quality of coding. In addition, Huffman coding is used to achieve the optimal bit rate after the general-tree-structured coder has been constructed. An experiment on speech coding was conducted. A comparison of the performance of this vector quantizer with that of two other tree-structured vector quantizers is also given.
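The Huffman coding step can be illustrated independently of the tree-structured quantizer itself. The sketch below derives Huffman code lengths from codeword usage counts and compares the average rate with a fixed-length index; the counts are made up for the example and the function name is an assumption, not part of the paper.

```python
import heapq

def huffman_code_lengths(counts):
    """Return {symbol: code length in bits} for a dict of positive counts."""
    if len(counts) == 1:
        return {sym: 1 for sym in counts}
    # Heap entries: (total count, unique tie-breaker, symbols in this subtree).
    heap = [(c, i, [sym]) for i, (sym, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in counts}
    tie = len(heap)
    while len(heap) > 1:
        c1, _, s1 = heapq.heappop(heap)
        c2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every symbol beneath them.
        for sym in s1 + s2:
            lengths[sym] += 1
        heapq.heappush(heap, (c1 + c2, tie, s1 + s2))
        tie += 1
    return lengths

# Hypothetical usage counts for four codewords observed on training data.
counts = {0: 50, 1: 25, 2: 15, 3: 10}
lengths = huffman_code_lengths(counts)
avg_bits = sum(counts[s] * lengths[s] for s in counts) / sum(counts.values())

print(lengths, avg_bits)
```

For these counts the average is 1.75 bits per index instead of the 2 bits needed by a fixed-length code, which shows how unequal codeword usage translates into a lower bit rate.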
 
