ABSTRACT
Speech enhancement techniques using spectral subtraction have the drawback of generating residual noise with a musical character, so-called musical noise. We developed a new post-processing method for suppressing this musical residual noise. In this method, the auditory masking threshold is calculated twice, once before the spectral subtraction and once again afterwards. This ensures that all audible spectral signal components above the thresholds are detected. Audible components which are only present at the output are candidates for musical noise. Depending on their spectral bandwidth and time duration, they may be processed additionally. Using this post-processing, the distortion of the speech signal is not noticeable and musical noise is not audible even at low signal-to-noise ratios of about 0 dB.
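The detection step described above can be sketched as follows. This is a minimal illustration, assuming that per-bin magnitudes and masking thresholds are already available as plain lists. The function names and the simple magnitude subtraction rule are assumptions made for illustration only. The computation of the masking threshold and the bandwidth and duration test are not shown.

```python
def spectral_subtract(noisy_mag, noise_mag):
    """Plain magnitude spectral subtraction with half-wave rectification.

    Assumption: one noise magnitude estimate per frequency bin.
    """
    return [max(y - n, 0.0) for y, n in zip(noisy_mag, noise_mag)]


def musical_noise_candidates(in_mag, in_thr, out_mag, out_thr):
    """Return the bin indices that are audible only at the output.

    in_mag / in_thr:   magnitudes and masking threshold before subtraction
    out_mag / out_thr: magnitudes and masking threshold after subtraction
    A bin counts as audible when its magnitude exceeds the threshold.
    """
    candidates = []
    for k, (x, tx, y, ty) in enumerate(zip(in_mag, in_thr, out_mag, out_thr)):
        audible_in = x > tx
        audible_out = y > ty
        if audible_out and not audible_in:
            candidates.append(k)
    return candidates
```

A bin can become audible only at the output because the masking threshold generally drops once the noise that acted as a masker has been subtracted.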
ABSTRACT
In this paper we propose a method for enhancing speech in the presence of additive noise. The objective is to selectively enhance the high-SNR regions of the noisy speech in the temporal and spectral domains without causing significant distortion in the enhanced speech. We propose to do this at three levels: (a) at the gross level, by identifying the regions of speech and noise in the temporal domain; (b) at the finer level, by identifying the high- and low-SNR portions of the noisy speech; and (c) at the short-time spectrum level, by enhancing the spectral peaks over the spectral valleys. Enhancement mostly involves weighting the LP residual samples. The weighted residual samples are then used to excite the time-varying LP filter to produce the enhanced speech.
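The final synthesis step can be illustrated with a short sketch. It assumes the LP residual, the per-sample weights, and the LP coefficients of one frame are already known. The function name and the sign convention A(z) = 1 + sum_k a_k z^-k are assumptions for illustration, and how the weights are derived from the speech/noise and SNR analysis is not shown.

```python
def synthesize_from_weighted_residual(residual, weights, lp_coeffs):
    """Excite an all-pole LP filter with the weighted residual.

    Implements s[n] = w[n] * e[n] - sum_k a_k * s[n - k],
    where lp_coeffs = [a_1, ..., a_p] and A(z) = 1 + sum_k a_k z^-k.
    Assumption: the coefficients are fixed over this block of samples.
    """
    out = []
    for n, (e, w) in enumerate(zip(residual, weights)):
        s = w * e
        for k, a in enumerate(lp_coeffs, start=1):
            if n - k >= 0:
                s -= a * out[n - k]
        out.append(s)
    return out
```

For a time-varying filter, the same routine would be called frame by frame with updated coefficients, carrying the filter memory across frame boundaries.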
ABSTRACT
In this paper an acoustic echo compensator with an additional frequency domain adaptive filter for combined residual echo and noise reduction is proposed. The algorithm delivers high echo attenuation as well as high near end speech quality over a wide range of signal-to-noise conditions. The system makes use of a standard time domain echo compensator of low order, after which the proposed adaptive filter, which is motivated by means of a minimum mean square error approach, is placed in the sending path. In contrast to other combined systems [1, 2, 3], our method uses an explicit estimate of the power spectral density of the residual echo after echo compensation. The separate estimations of the power spectral densities of the residual echo and the background noise, respectively, are then flexibly combined, such that in the processed signal a low level of intentionally left background noise will effectively mask the residual echo.
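The idea of combining two separate power spectral density estimates into one spectral gain can be sketched as below. This is a generic Wiener-style rule with a gain floor, not the filter derived in the paper. The function name, the two weights, and the floor value are all hypothetical parameters chosen for illustration.

```python
def wiener_style_gain(p_in, p_echo, p_noise,
                      echo_weight=1.0, noise_weight=1.0, gain_floor=0.1):
    """Per-bin gain from separate residual-echo and noise PSD estimates.

    p_in:    PSD of the signal after the time domain echo compensator
    p_echo:  estimated PSD of the residual echo
    p_noise: estimated PSD of the background noise
    The weights are hypothetical: a noise_weight below 1 leaves some
    background noise in the output, and gain_floor limits the attenuation.
    """
    if p_in <= 0.0:
        return gain_floor
    disturbance = echo_weight * p_echo + noise_weight * p_noise
    return max(1.0 - disturbance / p_in, gain_floor)
```

Choosing a smaller weight for the noise than for the echo is one simple way to leave a low level of background noise while attenuating the residual echo more strongly.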
ABSTRACT
Since speech sounds such as fricatives, glides, liquids, diphthongs, and the transition regions between phones are the most notably nonstationary, we propose a nonstationary autoregressive (AR) HMM with a state-dependent polynomial function for modeling speech. The parameters of the nonstationary AR model depend on the states of the Markov chain. The model is designed to operate on the speech signal itself at the frame level, rather than on feature vectors. We also propose a new speech enhancement method based on the nonstationary AR HMM and the IMM algorithm under white noise conditions. The enhanced signal is a weighted sum of the outputs of parallel Kalman filters that interact according to the IMM algorithm. Simulation results show that the proposed method offers performance gains relative to previous results [7] at slightly increased complexity.
ABSTRACT
Additive and convolutional noise are the main problems to be solved in order to make speech recognition successful in real applications. A model for additive noise is used to derive a spectral subtraction (SS) estimation and to show that the channel transfer function can be effectively removed after the additive noise has been cancelled by SS. Then, SS and mean normalization are tested in combination with a weighting procedure to reduce the influence of the rectifying function. All the experiments were done in the context of weighted matching algorithms, and the approaches proved effective in cancelling both additive noise and the transmission channel function.
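The reason mean normalization can remove the channel once the additive noise is gone can be shown with a small sketch. A fixed channel multiplies each frequency bin by a constant, which becomes an additive constant in the log domain and cancels when the per-bin mean over time is subtracted. The function name and the data layout (a list of per-frame power spectra) are assumptions for illustration, and the weighting procedure is not shown.

```python
import math


def log_mean_normalize(frames):
    """Subtract the per-bin time average of the log power spectrum.

    frames: list of frames, each a list of strictly positive power values
            (assumed to be the output of spectral subtraction).
    Returns the mean-normalized log spectra in the same layout.
    """
    n_frames = len(frames)
    n_bins = len(frames[0])
    logs = [[math.log(p) for p in frame] for frame in frames]
    means = [sum(f[k] for f in logs) / n_frames for k in range(n_bins)]
    return [[f[k] - means[k] for k in range(n_bins)] for f in logs]
```

Applying a time-invariant gain to every frame leaves the normalized output unchanged, which is the sense in which the channel transfer function is removed.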
ABSTRACT
This paper presents some novel results concerning the problem of enhancing speech degraded by wideband additive noise. The proposed enhancement scheme uses the auditory masking mechanism as a criterion for identifying and subsequently suppressing the audible noise components in the frequency domain. Accordingly, the enhancement technique minimises only those noise components responsible for audible signal degradation, so that the quality of the underlying speech signal is only minimally degraded. Extensive subjective and objective tests have shown that, after enhancement, the intelligibility of the processed signal can be improved even at very low S/N ratios.