Session W1D Speech Enhancement I

Chairperson Hynek Hermansky Oregon Graduate Inst. of Science and Tech., USA


Residual Noise Suppression Using Psychoacoustic Criteria

Authors: Tim Haulick, Klaus Linhard and Peter Schrogmeier

Daimler Benz AG, Research and Technology, Wilhelm-Runge-Str. 11 D-89081 Ulm, Germany e-mail: haulick@dbag.ulm.daimlerbenz.com

Volume 3 pages 1395 - 1398

ABSTRACT

Speech enhancement techniques using spectral subtraction have the drawback of generating residual noise with a musical character, so-called musical noise. We developed a new post-processing method for suppressing this musical residual noise. In this method, the auditory masking threshold is calculated twice, once before the spectral subtraction and once again afterwards. This ensures that all audible spectral signal components above the thresholds are detected. Audible components which are only present at the output are candidates for musical noise. Depending on their spectral bandwidth and time duration, they may be processed additionally. Using this post-processing, the distortion of the speech signal is not noticeable and musical noise is not audible even at low signal-to-noise ratios of about 0 dB.
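The detection step described in the abstract can be illustrated with a minimal sketch. It assumes the masking thresholds before and after subtraction are already available as per-bin values (their computation is not shown), and the function names and the `max_width` bandwidth limit are hypothetical choices for illustration, not the authors' parameters. Only the bandwidth criterion is sketched; the time-duration criterion is omitted.

```python
def audible(power, threshold):
    """Flag each spectral bin whose power exceeds its masking threshold."""
    return [p > t for p, t in zip(power, threshold)]


def musical_noise_candidates(p_in, thr_in, p_out, thr_out, max_width=2):
    """Return (first_bin, last_bin) runs that are audible only at the output.

    p_in / thr_in:   power spectrum and masking threshold before subtraction
    p_out / thr_out: power spectrum and masking threshold after subtraction
    max_width:       assumed bandwidth limit (in bins) for a musical tone
    """
    aud_in = audible(p_in, thr_in)
    aud_out = audible(p_out, thr_out)
    # Components audible at the output but not at the input.
    new = [o and not i for i, o in zip(aud_in, aud_out)]

    runs, start = [], None
    for k, flag in enumerate(new + [False]):  # sentinel closes a trailing run
        if flag and start is None:
            start = k
        elif not flag and start is not None:
            if k - start <= max_width:  # keep only narrow-band components
                runs.append((start, k - 1))
            start = None
    return runs
```

A caller would then attenuate the returned bins further, for example down to the output masking threshold; that choice is left open here.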

A0053.pdf


PROCESSING LINEAR PREDICTION RESIDUAL FOR SPEECH ENHANCEMENT

Authors: B. Yegnanarayana (1), Carlos Avendano (2), Hynek Hermansky (2), and P. Satyanarayana Murthy (1)

(1) Department of Computer Science and Engineering Indian Institute of Technology, Madras 600 036, India (2)Department of Electrical Engineering Oregon Graduate Institute of Science & Technology Portland, Oregon, USA

Volume 3 pages 1399 - 1402

ABSTRACT

In this paper we propose a method for enhancement of speech in the presence of additive noise. The objective is to selectively enhance the high SNR regions in the noisy speech in the temporal and spectral domains, without causing significant distortion in the resulting enhanced speech. This is proposed to be done at three different levels: (a) At the gross level, by identifying the regions of speech and noise in the temporal domain, (b) At the finer level, by identifying the regions of high and low SNR portions in the noisy speech, and (c) At the short-time spectrum level, by enhancing the spectral peaks over spectral valleys. Processing of noisy speech for enhancement involves mostly weighting the LP residual samples. The weighted residual samples are used to excite the time-varying LP filter to produce enhanced speech.
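The last two sentences of the abstract describe an analysis, weighting, and resynthesis chain that can be sketched for a single frame. The code below is a minimal illustration under stated assumptions: the LP coefficients come from the autocorrelation method with the Levinson-Durbin recursion, the per-sample weights are supplied by the caller (how the paper derives them from SNR estimates is not reproduced), and all function names are hypothetical.

```python
def lpc(frame, order):
    """LP coefficients a[1..order] for the predictor x[n] ~ sum_k a[k] * x[n-k],
    computed with the Levinson-Durbin recursion on the frame autocorrelation."""
    n = len(frame)
    r = [sum(frame[i] * frame[i + k] for i in range(n - k)) for k in range(order + 1)]
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(frame, coeffs):
    """Inverse filtering: e[n] = x[n] - sum_k a[k] * x[n-k], zero initial state."""
    return [x - sum(c * frame[n - 1 - j] for j, c in enumerate(coeffs) if n - 1 - j >= 0)
            for n, x in enumerate(frame)]


def synthesize(residual, coeffs):
    """All-pole synthesis: y[n] = e[n] + sum_k a[k] * y[n-k], zero initial state."""
    out = []
    for n, e in enumerate(residual):
        pred = sum(c * out[n - 1 - j] for j, c in enumerate(coeffs) if n - 1 - j >= 0)
        out.append(e + pred)
    return out


def enhance_frame(frame, weights, order=4):
    """Weight the LP residual sample by sample, then re-excite the LP filter."""
    coeffs = lpc(frame, order)
    residual = lp_residual(frame, coeffs)
    weighted = [w * e for w, e in zip(weights, residual)]
    return synthesize(weighted, coeffs)
```

With all weights equal to 1 the frame is reconstructed unchanged, which is a convenient sanity check; weights below 1 in low-SNR regions attenuate the excitation there.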

A0247.pdf


COMBINED ACOUSTIC ECHO CONTROL AND NOISE REDUCTION FOR MOBILE COMMUNICATIONS

Authors: Stefan Gustafsson and Rainer Martin

Institute of Communication Systems and Data Processing Aachen University of Technology D-52056 Aachen, Germany Tel: +49 241 806976; fax: +49 241 8888186 e-mail: gus@ind.rwth-aachen.de

Volume 3 pages 1403 - 1406

ABSTRACT

In this paper an acoustic echo compensator with an additional frequency domain adaptive filter for combined residual echo and noise reduction is proposed. The algorithm delivers high echo attenuation as well as high near end speech quality over a wide range of signal-to-noise conditions. The system makes use of a standard time domain echo compensator of low order, after which the proposed adaptive filter, which is motivated by means of a minimum mean square error approach, is placed in the sending path. In contrast to other combined systems [1, 2, 3], our method uses an explicit estimate of the power spectral density of the residual echo after echo compensation. The separate estimations of the power spectral densities of the residual echo and the background noise, respectively, are then flexibly combined, such that in the processed signal a low level of intentionally left background noise will effectively mask the residual echo.
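As a rough illustration of combining separate power spectral density estimates into one frequency-domain gain, the sketch below uses a generic Wiener-type rule with a lower gain limit. This is an assumed, simplified rule, not the weighting proposed in the paper; the function name and the `noise_floor` parameter are hypothetical.

```python
def combined_gain(psd_in, psd_echo, psd_noise, noise_floor=0.1):
    """Per-bin gain from separate residual-echo and noise PSD estimates.

    psd_in:      PSD of the signal after the time-domain echo compensator
    psd_echo:    estimated PSD of the residual echo
    psd_noise:   estimated PSD of the background noise
    noise_floor: assumed lower gain limit, so that some background noise
                 is intentionally left in the output
    """
    gains = []
    for y, e, n in zip(psd_in, psd_echo, psd_noise):
        if y <= 0.0:
            gains.append(noise_floor)
            continue
        g = 1.0 - (e + n) / y  # Wiener-type attenuation of echo plus noise
        gains.append(max(g, noise_floor))
    return gains
```

The lower limit is what keeps a small amount of background noise in the sending path; in the abstract's terms, that remaining noise is meant to mask the residual echo.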

A0255.pdf


A Nonstationary Autoregressive HMM and Its Application to Speech Enhancement

Authors: Ki Yong Lee* and Jae Yeol Rheem**

*Dept. of Electronics Engr., Changwon National University Changwon, Kyungnam-Do 641-773, Korea Tel. +82-551-79-7527, Fax:+82-551-81-5070, E-mail: kylee@sarim.changwon.ac.kr ** Dept. of Electronics Engr., Korea Institute of Technology and Education, Chonan 330-600, Korea

Volume 3 pages 1407 - 1410

ABSTRACT

Since speech sounds such as fricatives, glides, liquids, diphthongs, and transition regions between phones exhibit the most notable nonstationary nature, we propose a nonstationary autoregressive (AR) HMM with a state-dependent polynomial function for modeling the nonstationary nature of speech. The parameters of the nonstationary AR model depend on the states of the Markov chain. The model is designed to handle the speech signal at the frame level, operating on the signal itself rather than on feature vectors. We also propose a new speech enhancement method based on the nonstationary AR HMM and the IMM algorithm under white noise conditions. The proposed enhancement is the weighted sum of parallel Kalman filters, combined through the interacting rule of the IMM algorithm. The simulation results show that the proposed method offers performance gains relative to the previous results [7] with slightly increased complexity.

A0408.pdf


SPECTRAL SUBTRACTION AND MEAN NORMALIZATION IN THE CONTEXT OF WEIGHTED MATCHING ALGORITHMS

Authors: Nestor Becerra Yoma*, Fergus R. McInnes, Mervyn A. Jack

Centre for Communication Interface Research, University of Edinburgh 80 South Bridge, Edinburgh EH1 1HN, U.K. E-Mail: nestor@ccir.ed.ac.uk

Volume 3 pages 1411 - 1414

ABSTRACT

Additive and convolutional noises are the main problems to be solved in order to make speech recognition successful in real applications. A model for additive noise is used to deduce a spectral subtraction (SS) estimation and to show that the channel transfer function can be effectively removed after the additive noise has been cancelled by SS. Then, SS and mean normalization are tested in combination with a weighting procedure to reduce the influence of the rectifying function. All the experiments were done in the context of weighted matching algorithms, and the approaches proved effective in cancelling both additive noise and the transmission channel function.
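The order of operations in the abstract (subtract the additive noise first, then remove the channel) can be illustrated on filterbank energies. The sketch assumes the common power-domain model in which the observed energy is the channel gain times the clean energy plus the noise energy; the function names, the flooring rule used as the rectifying function, and the `floor` value are hypothetical, and the weighting procedure itself is not shown.

```python
import math


def spectral_subtract(energies, noise, floor=1e-3):
    """Subtract a per-channel noise estimate from each frame of energies.

    The max(...) acts as the rectifying function: results that would be
    negative or very small are replaced by a fraction of the noisy energy.
    """
    return [[max(x - n, floor * x) for x, n in zip(frame, noise)]
            for frame in energies]


def log_energies(energies):
    """Natural log of each filterbank energy (energies must be positive)."""
    return [[math.log(x) for x in frame] for frame in energies]


def mean_normalize(log_frames):
    """Subtract the per-channel mean over time from log energies.

    A fixed channel gain multiplies the energies, so it becomes an additive
    constant in the log domain and is removed by this step.
    """
    count = len(log_frames)
    means = [sum(col) / count for col in zip(*log_frames)]
    return [[v - m for v, m in zip(frame, means)] for frame in log_frames]
```

Once the additive term has been subtracted, what remains is the clean energy scaled by the channel gain, which is why mean normalization in the log domain is applied after SS rather than before it.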

A0417.pdf


IMPROVING THE INTELLIGIBILITY OF NOISY SPEECH USING AN AUDIBLE NOISE SUPPRESSION TECHNIQUE

Authors: D. E. Tsoukalas, J. Mourjopoulos, and G. Kokkinakis

Wire Communications Laboratory Electrical & Computer Engineering Dept. University of Patras, 261 10, Greece Tel: +30 61 991722, FAX: +30 61 991855, E-mail: tsoukala@wcl.ee.upatras.gr

Volume 3 pages 1415 - 1418

ABSTRACT

This paper presents some novel results concerning the problem of enhancing speech degraded by wideband additive noise. The enhancement scheme proposed in this work is based on the utilisation of the Auditory Masking mechanism as a measure for the definition and subsequent suppression of the frequency audible noise components. Accordingly, the enhancement technique minimises only those noise components responsible for audible signal degradations, so that the underlying speech signal quality is only minimally degraded. Extensive subjective and objective tests have shown that, after enhancement, the intelligibility of the processed signal can be improved even at very low S/N ratios.

A0520.pdf