Audio Coding & Transducer

Audio Coding Using Sinusoidal Excitation Representation

Authors:

Wen-Whei Chang, National Chiao-Tung University (Taiwan)
De-Yu Wang, National Chiao-Tung University (Taiwan)
Li-Wei Wang, National Chiao-Tung University (Taiwan)

Volume 1, Page 311

Abstract:

Most LPC-based audio coders employ simplistic noise-shaping operations to perform psychoacoustic control of quantization noise. In this paper, we report on new approaches to exploiting perceptual masking in the design of adaptive quantization of LPC excitation parameters. Due to its localized spectral sensitivity, sinusoidal excitation representation is preferred to spectrally flat signals for use in excitation modeling. Simulation results indicate that the proposed multisinusoid excited coder can deliver high quality audio reproduction at the rate of 72 kb/s.

ic970311.pdf

Optimum Bit Allocation and Decomposition for High Quality Audio Coding

Authors:

Xiang Wei, University of Central Lancashire (U.K.)
Martyn J. Shaw, University of Central Lancashire (U.K.)
Martin R. Varley, University of Central Lancashire (U.K.)

Volume 1, Page 315

Abstract:

Current audio compression schemes are capable of reducing the per channel bit rate of high quality audio signals from 16 bits per sample to around 2-4 bits per sample. In these schemes, knowledge of psychoacoustics is utilised and a uniform or nonuniform frequency decomposition method is used. In this paper we derive the optimum bit allocation to achieve the highest perceptual quality under a fixed bit rate, for an arbitrarily decomposed, critically sampled, filter bank. The resultant optimum bit allocation gives rise to a shaped reconstruction noise floor approximately parallel to the masking threshold level. Perceptual coding gain is defined and should be maximized for an optimum decomposition performed by the filter bank. Optimum band splitting is discussed and it is pointed out that decomposition in the manner of critical band splitting does not lead to optimal performance.
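The idea of a noise floor that runs parallel to the masking threshold can be illustrated with a minimal sketch. It is not the derivation from the paper: it assumes equal-width subbands and the textbook high-resolution model in which the quantization noise power in band k is `variances[k] * 2**(-2 * bits[k])`. The function names, the inputs, and the choice to allow fractional (and possibly negative) bit counts are all assumptions made for illustration.

```python
import math

def allocate_bits(variances, masks, avg_bits):
    """Real-valued bits per sample for each equal-width subband.

    Assumes noise power in band k is variances[k] * 2**(-2 * bits[k]).
    Equalizing noise/mask across bands under a fixed mean rate gives
    bits[k] = avg_bits + smr[k] - mean(smr), with smr in units of bits.
    """
    smr_bits = [0.5 * math.log2(v / m) for v, m in zip(variances, masks)]
    mean_smr = sum(smr_bits) / len(smr_bits)
    # Bands with a low signal-to-mask ratio may receive negative values here;
    # a practical coder would clamp them to zero and redistribute the rest.
    return [avg_bits + s - mean_smr for s in smr_bits]

def noise_to_mask(variances, masks, bits):
    """Noise-to-mask ratio per band under the same noise model."""
    return [v * 2.0 ** (-2.0 * b) / m
            for v, m, b in zip(variances, masks, bits)]

if __name__ == "__main__":
    variances = [4.0, 1.0, 0.25, 0.01]   # hypothetical subband signal powers
    masks = [0.1, 0.05, 0.05, 0.02]      # hypothetical masking thresholds
    bits = allocate_bits(variances, masks, avg_bits=3.0)
    print(bits)
    print(noise_to_mask(variances, masks, bits))
```

Under these assumptions every band ends up with the same noise-to-mask ratio, so the noise floor is a shifted copy of the masking threshold.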

ic970315.pdf

The D5 Lattice Quantization For A 64 KBit/S Low-Delay Subband Audio Coder With A 15 KHz Bandwidth

Authors:

Karine Hay, ENST-Br, Dept. SC. (France)
S. Saoudi, ENST-Br, Dept. SC. (France)
L. Mainard, CCETT, Service RCS/SDA (France)

Volume 1, Page 319

Abstract:

A new method for coding generic audio signals at 64 kbit/s in the 20-15000 Hz bandwidth with a low delay is presented. It combines subband coding, a Low Delay CELP algorithm and cascaded filterbanks. Our earlier work showed that, when using an equal bit rate in each subband, the resulting audio quality was not appropriate. We propose here a new technique based on lattice quantization to avoid the search complexity of statistical vector quantization. It allows an adaptive bit rate allocation in each subband. Experimental results assessing the validity of the proposed method are also presented.

ic970319.pdf

An Experimental Audio Codec Based on Warped Linear Prediction of Complex Valued Signals

Authors:

Aki Härmä, Helsinki University of Technology (Finland)
Unto K. Laine, Helsinki University of Technology (Finland)
Matti Karjalainen, Helsinki University of Technology (Finland)

Volume 1, Page 323

Abstract:

Bark-scale warped linear prediction (WLP) is a very promising core for a monophonic perceptual audio codec. In this paper the WLP scheme is extended to process complex valued signals (CWLP). Three different methods of converting a stereo signal to one complex valued signal are introduced. The philosophy behind the coding scheme is to integrate some aspects of modern wideband audio coding (e.g. perceptuality and stereo signal processing) into one computational element in order to find a more holistic and economic way of processing.

ic970323.pdf

High Quality Low Complexity Scalable Wavelet Audio Coding

Authors:

William Kurt Dobson, U.S. Robotics (U.S.A.)
Jiankan Jack Yang, U.S. Robotics (U.S.A.)
Kevin J. Smart, U.S. Robotics (U.S.A.)
Feng Kathy Guo, U.S. Robotics (U.S.A.)

Volume 1, Page 327

Abstract:

This paper presents an audio coder for real-time multimedia applications. To achieve high quality at low bit rates, the audio coder uses a wavelet packet decomposition to transform the audio data into the wavelet domain, and a psychoacoustic model is used to minimize quantization noise. The wavelet packet decomposition tree structures were chosen to closely mimic the critical bands in a psychoacoustic model. Instead of determining the masking thresholds in the Fourier domain, the wavelet coefficients are used to drive the psychoacoustic model directly. Most of the standard industrial sampling frequencies are supported by this coder. An efficient bit rate control scheme was designed such that the audio coder operates at virtually any desired bit rate level. The audio coder achieves near perceptually lossless quality at or below 80 kb/s for most audio sources. Real-time encoding/decoding is possible using only a fraction of a Pentium or faster CPU.

ic970327.pdf

2076_a.wav "The Jack" by AC/DC prior to encoding-decoding
2076_b.wav "The Jack" by AC/DC following encoding-decoding (45 kb/s)
2076_c.wav "The Jack" by AC/DC following encoding-decoding (35 kb/s)
2076_d.wav "Eruption" by Van Halen prior to encoding-decoding
2076_e.wav "Eruption" by Van Halen following encoding-decoding (55 kb/s)
2076_f.wav "Eruption" by Van Halen following encoding-decoding (30 kb/s)

An Efficient Tonal Component Coding Algorithm For MPEG-2 Audio NBC

Authors:

Yuichiro Takamizawa, NEC Corporation (Japan)
Masahiro Iwadare, NEC Corporation (Japan)
Akihiko Sugiyama, NEC Corporation (Japan)

Volume 1, Page 331

Abstract:

This paper proposes a tonal component coding algorithm for a codec that employs a transform followed by Huffman coding, such as MPEG-2 Audio NBC (Non-Backward Compatible). After the input audio signal is mapped onto a frequency domain, the proposed algorithm withdraws local maximum components that degrade coding efficiency. By this withdrawal, the flatness of the spectrum increases and the efficiency in Huffman coding is improved. The withdrawn components are encoded separately as side information. When the frequency resolution of the time/frequency mapping is high, this algorithm works more effectively since local maximum samples appear more frequently with such a mapping. Simulation results show that this algorithm achieves as much as 11% bit reduction per frame and improves the coding efficiency in 41% of all the audio frames.
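The withdrawal step can be pictured with a small sketch. The abstract does not say how local maxima are detected or what replaces a withdrawn component, so the threshold `ratio`, the neighbour-average fill-in, and the function names below are all hypothetical choices made for illustration, not the algorithm from the paper.

```python
def withdraw_peaks(coeffs, ratio=4.0):
    """Return (flattened coefficients, side info) for isolated peaks.

    A coefficient counts as a peak when its magnitude exceeds `ratio`
    times the larger of its two neighbours (a hypothetical criterion).
    """
    flat = list(coeffs)
    side_info = []
    for i in range(1, len(coeffs) - 1):
        left, right = abs(coeffs[i - 1]), abs(coeffs[i + 1])
        if abs(coeffs[i]) > ratio * max(left, right):
            # Keep the original value so the decoder can put it back.
            side_info.append((i, coeffs[i]))
            # Hypothetical fill-in: the average of the two neighbours.
            flat[i] = 0.5 * (coeffs[i - 1] + coeffs[i + 1])
    return flat, side_info

def restore_peaks(flat, side_info):
    """Reinsert the withdrawn components at their original positions."""
    out = list(flat)
    for i, value in side_info:
        out[i] = value
    return out

if __name__ == "__main__":
    coeffs = [1.0, 1.2, 9.0, 0.8, 1.1, -1.0]  # made-up transform coefficients
    flat, side = withdraw_peaks(coeffs)
    print(flat, side)
    print(restore_peaks(flat, side) == coeffs)
```

In this sketch the flattened coefficients span a much smaller range than the originals, which is the property the abstract credits with improving Huffman coding efficiency; the cost is the side information carried for each withdrawn component.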

ic970331.pdf

Spectral Amplitude Warping (SAW) for Noise Spectrum Shaping in Audio Coding

Authors:

Roch Lefebvre, University of Sherbrooke (Canada)
Claude Laflamme, University of Sherbrooke (Canada)

Volume 1, Page 335

Abstract:

In this paper, we present a new approach to shaping the coding noise in speech and audio coders. The approach, called Spectral Amplitude Warping (SAW), consists essentially of pre- and post-processing stages that apply a non-linear transformation to the short-term spectrum of the signal before and after encoding. Since it is possible to view SAW as a separate entity from the coder, the noise shaping capability of an existing coder can be improved without modifying the coder itself. Using SAW as a pre- and post-process to the G.722 wideband speech coding standard, it was found in an informal listening test that the quality of the 64 kb/s operating mode can be achieved at only 48 kb/s. The price to be paid is an additional delay.
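The pre- and post-processing pair can be sketched as follows. The abstract does not specify the non-linear transformation, so the power-law warping used here, the exponent `alpha`, and the function names are assumptions chosen only to show the structure: the magnitude of each spectral coefficient is compressed before encoding and expanded afterwards, while the phase is left untouched.

```python
import cmath

def warp(spectrum, alpha):
    """Raise each magnitude to the power alpha, keeping the phase.

    The power law is a hypothetical warping function; alpha < 1 compresses
    the spectral dynamic range.
    """
    out = []
    for x in spectrum:
        mag, phase = cmath.polar(x)
        out.append(cmath.rect(mag ** alpha, phase))
    return out

def unwarp(spectrum, alpha):
    """Inverse of warp for the same alpha (alpha must be positive)."""
    return warp(spectrum, 1.0 / alpha)

if __name__ == "__main__":
    frame = [3 + 4j, -1j, 0.5, 0j]        # made-up short-term spectrum
    pre = warp(frame, 0.5)                # would be fed to the existing coder
    post = unwarp(pre, 0.5)               # applied to the decoder output
    print(pre)
    print(post)
```

With `alpha` below 1 in this sketch, the expansion amplifies coding errors more where the spectrum is strong than where it is weak, so the noise tends to follow the signal spectrum. The coder between the two stages is unchanged, which matches the abstract's point that SAW can be treated as a separate entity.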

ic970335.pdf

A fast noise-scaling algorithm for uniform quantization in audio coding schemes

Authors:

Carlos A. Serantes, Universidad de Vigo (Spain)
Antonio S. Pena, Universidad de Vigo (Spain)
Nuria González-Prelcic, Universidad de Vigo (Spain)

Volume 1, Page 339

Abstract:

A new bit assignment algorithm is presented. Its goals are to assign bits to all subbands simultaneously within a few iterations, to use memory to speed up convergence, and to accommodate a deformable error curve. The basis of the algorithm is discussed, along with other considerations that are likely to arise in practice. Finally, an example of its performance is given.

ic970339.pdf

Pyramid Vector Coding for high quality audio compression

Authors:

Daniele Cadel, Cefriel (Italy)
Giorgio Parladori, Alcatel Telecom (Italy)

Volume 1, Page 343

Abstract:

The target of this work is high quality audio coding at low bit rates. It will be shown how Pyramid Vector Coding (PVC) can conveniently replace the classical Huffman coding technique in audio compression systems, while also giving an advantage in the bit allocation procedure. The compression performance can be further improved by fixing an upper limit on the value of the vector components.
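One reason a pyramid code simplifies bit allocation is that its size, and therefore its fixed-length index cost, is known in advance. The sketch below counts the integer vectors on a pyramid (those whose absolute values sum to `k`) with the standard counting recurrence for pyramid vector quantizers. It is a general illustration rather than the scheme from the paper, the function names are hypothetical, and it does not model the upper limit on component values that the abstract mentions.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def pyramid_points(dim, k):
    """Number of integer vectors of length dim whose absolute values sum to k."""
    if k == 0:
        return 1          # only the all-zero vector
    if dim == 0:
        return 0          # no coordinates left to carry the remaining pulses
    return (pyramid_points(dim - 1, k)
            + pyramid_points(dim - 1, k - 1)
            + pyramid_points(dim, k - 1))

def bits_per_vector(dim, k):
    """Fixed-length index size in bits for one vector on the pyramid."""
    return (pyramid_points(dim, k) - 1).bit_length()

if __name__ == "__main__":
    for dim, k in [(2, 2), (3, 2), (8, 4)]:
        print(dim, k, pyramid_points(dim, k), bits_per_vector(dim, k))
```

Because the index length depends only on the vector dimension and the pulse count, a rate can be chosen per band without an entropy-coding pass to measure the actual cost.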

ic970343.pdf

Subband Audio Coding with Synthesis Filters Minimizing a Perceptual Criterion

Authors:

Karine Gosse, ENST Paris (France)
François Moreau de Saint-Martin, CCETT (France)
Xavier Durot, CCETT (France)
Pierre Duhamel, ENST Paris (France)
Jean-Bernard Rault, CCETT (France)

Volume 1, Page 347

Abstract:

The design of filter banks for source coding purposes classically relies on the perfect reconstruction (PR) property. However, several recent studies have shown that taking the quantization noise into account in the design can yield a noticeable reduction of the mean square reconstruction error. The purpose of this study is to show that perceptual improvement can also be obtained in the particular audio coding context by relaxing the PR constraint. In this context, the mean square error is no longer relevant, and we define a new perceptual distortion criterion, the MPE (Mean Perceptual Error), which makes use of a simplified ear model. Then, synthesis filters are optimized so as to minimize this MPE. Finally, this MMPE (Minimum MPE) filter bank is included in an audio coding scheme. Compared with the corresponding PR filter bank-based scheme by means of the POM (Perceptual Objective Measure), it shows improved audio quality.

ic970347.pdf

New Results in Low Bitrate Audio Coding Using a Combined Harmonic-Wavelet Representation

Authors:

Simon Boland, Queensland University of Technology (Australia)
Mohamed Deriche, Queensland University of Technology (Australia)

Volume 1, Page 351

Abstract:

In this paper, we propose a new combined harmonic-wavelet representation for audio in which a harmonic analysis-synthesis scheme is first used to approximate each audio frame as a sum of several sinusoids. Then, the difference between the original signal and the reconstructed harmonic signal is analyzed using a wavelet filtering scheme. After each step (harmonic analysis and wavelet filtering), parameters are quantized and encoded. Compared to previously proposed methods, our audio coder uses different harmonic analysis-synthesis and wavelet filtering schemes. We use the Total Least Squares (TLS)-Prony algorithm for the harmonic analysis scheme, and an M-band wavelet transform for analyzing the residual. Altogether, our proposed coder is capable of delivering excellent audio signal quality at encoder bitrates of 60-70 kb/s.

ic970351.pdf

Adaptive Inverse Control of Weakly Nonlinear Systems

Authors:

Wolfgang J. Klippel, Dresden (Germany)

Volume 1, Page 355

Abstract:

A weakly nonlinear plant can be linearized and will track an input signal if the plant is preceded by a nonlinear controller which approximates the inverse of the plant's transfer function. Present techniques for adjusting the controller adaptively to the plant require an additional nonlinear adaptive filter to perform a separate system identification. Straightforward update algorithms cannot directly update the filter parameters in the controller because the transfer function of the plant might cause instabilities in the adaptive process. This problem is overcome by applying additional linear filtering to the nonlinear state vector and/or error signal. Novel filtered-A and filtered-E modifications of the stochastic gradient based methods are presented which are capable of updating generic as well as special block-oriented nonlinear filter architectures.