5-2 Zero Crossing Rate

[english][all]

(請注意:中文版本並未隨英文版本同步更新!)

「過零率」(Zero Crossing Rate,簡稱 ZCR)是在每個音框中,音訊通過零點的次數,具有下列特性:

To avoid DC bias, usually we need to perform mean subtraction on each frame. Here is an straightforward example of ZCR:

Example 1: zcr01.mwaveFile='csNthu.wav'; frameSize=256; overlap=0; au=myAudioRead(waveFile); y=au.signal; fs=au.fs; frameMat=enframe(y, frameSize, overlap); frameNum=size(frameMat, 2); for i=1:frameNum frameMat(:,i)=frameMat(:,i)-mean(frameMat(:,i)); % mean justification end zcr=sum(frameMat(1:end-1, :).*frameMat(2:end, :)<0); sampleTime=(1:length(y))/fs; frameTime=((0:frameNum-1)*(frameSize-overlap)+0.5*frameSize)/fs; subplot(2,1,1); plot(sampleTime, y); ylabel('Amplitude'); title(waveFile); subplot(2,1,2); plot(frameTime, zcr, '.-'); xlabel('Time (sec)'); ylabel('Count'); title('ZCR');

We can use the function "frame2zcr" to simplify the above example:

Example 2: zcr02.mwaveFile='csNthu.wav'; frameSize=256; overlap=0; au=myAudioRead(waveFile); y=au.signal; fs=au.fs; frameMat=enframe(y, frameSize, overlap); frameNum=size(frameMat, 2); zcr=frame2zcr(frameMat); sampleTime=(1:length(y))/fs; frameTime=frame2sampleIndex(1:frameNum, frameSize, overlap)/fs; subplot(2,1,1); plot(sampleTime, y); ylabel('Amplitude'); title(waveFile); subplot(2,1,2); plot(frameTime, zcr, '.-'); xlabel('Time (sec)'); ylabel('Count'); title('ZCR');

Example 3: zcrWithShift.mwaveFile='csNthu.wav'; frameSize=256; overlap=0; au=myAudioRead(waveFile); y=au.signal; fs=au.fs; frameMat=enframe(y, frameSize, overlap); frameNum=size(frameMat,2); volume=frame2volume(frameMat); [minVolume, index]=min(volume); shiftAmount=2*max(abs(frameMat(:,index))); % shiftAmount is equal to twice the max. abs. sample value within the frame of min. volume method=1; zcr1=frame2zcr(frameMat, method); zcr2=frame2zcr(frameMat, method, shiftAmount); sampleTime=(1:length(y))/fs; frameTime=frame2sampleIndex(1:frameNum, frameSize, overlap)/fs; subplot(2,1,1); plot(sampleTime, y); ylabel('Amplitude'); title(waveFile); subplot(2,1,2); plot(frameTime, zcr1, '.-', frameTime, zcr2, '.-'); xlabel('Time (sec)'); ylabel('Count'); title('ZCR'); legend('ZCR without shift', 'ZCR with shift');

一般而言,在計算過零率時,需注意下列事項:

  1. 由於有些訊號若恰好位於零點,此時過零率的計算就有兩種,出現的效果也會不同。因此必須多加觀察,才能選用最好的作法。
  2. 大部分都是使用音訊的原始整數值來進行,才不會因為使用浮點數訊號,在減去直流偏移(DC Bias)時,造成過零率的增加。

在以下範例中,我們使用兩種不同的方法來計算過零率:

Example 4: zcrOn8bit.mwaveFile='csNthu8b.wav'; frameSize=256; overlap=0; au=myAudioRead(waveFile); y=au.signal; fs=au.fs; nbits=au.nbits; y=y*2^nbits/2; frameMat=enframe(y, frameSize, overlap); frameNum=size(frameMat, 2); for i=1:frameNum frameMat(:,i)=frameMat(:,i)-round(mean(frameMat(:,i))); % Zero justification end zcr1=sum(frameMat(1:end-1, :).*frameMat(2:end, :)<0); % Method 1 zcr2=sum(frameMat(1:end-1, :).*frameMat(2:end, :)<=0); % Method 2 sampleTime=(1:length(y))/fs; frameNum=size(frameMat, 2); frameTime=((0:frameNum-1)*(frameSize-overlap)+0.5*frameSize)/fs; subplot(2,1,1); plot(sampleTime, y); ylabel(waveFile); subplot(2,1,2); plot(frameTime, zcr1, '.-', frameTime, zcr2, '.-'); title('ZCR'); xlabel('Time (sec)'); legend('Method 1', 'Method 2');

在上述的範例中,我們使用了兩種方式來計算過零率,得到的效果雖然不同,但趨勢是一致的。(另外有一種情況,當錄音環境很安靜時,靜音的訊號值都在零點或零點附近附近跳動時,此時是否計算位於零點的過零率,就會造成很大的差別。)如果取樣頻率提高,得到的結果也會不同:

Example 5: zcrOn16bit.mwaveFile='csNthu.wav'; frameSize=256; overlap=0; au=myAudioRead(waveFile); y=au.signal; fs=au.fs; nbits=au.nbits; y=y*2^nbits/2; frameMat=enframe(y, frameSize, overlap); frameNum=size(frameMat, 2); for i=1:frameNum frameMat(:,i)=frameMat(:,i)-round(mean(frameMat(:,i))); % Zero justification end zcr1=sum(frameMat(1:end-1, :).*frameMat(2:end, :)<0); % Method 1 zcr2=sum(frameMat(1:end-1, :).*frameMat(2:end, :)<=0); % Method 2 sampleTime=(1:length(y))/fs; frameTime=((0:frameNum-1)*(frameSize-overlap)+0.5*frameSize)/fs; subplot(2,1,1); plot(sampleTime, y); ylabel(waveFile); subplot(2,1,2); plot(frameTime, zcr1, '.-', frameTime, zcr2, '.-'); title('ZCR'); xlabel('Time (sec)'); legend('Method 1', 'Method 2');

若要偵測聲音的開始和結束,通常稱為「端點偵測」(Endpoint Detection)或「語音偵測」(Speech Detection),最簡單的方法就是使用音量和過零率來判別,相關細節會在後續章節說明。


Audio Signal Processing and Recognition (音訊處理與辨識)