14-7 LCS and Edit Distance

[english][all]

(請注意:中文版本並未隨英文版本同步更新!)

若是以音符為單位來進行比對,我們就必須先將輸入向量切成音符。假設音高向量可以表示成 pv(i), i=1~n,那麼最簡單的切音符方法,可以說明如下:

for i=1:n
	if |pv(i)-pv(i-1)|>q & 音符夠長
		切出一個音符
	else
		將 pv(i) 加入目前的音符
	end
end

以下就是一個簡單的範例,來切出音符:

Example 1: noteSegment01.mpitchVector=[49.21841 48.33041 48.33041 47.76274 48.04425 48.04425 47.76274 48.33041 48.04425 48.04425 48.04425 47.48574 47.48574 47.48574 47.48574 47.48574 47.76274 48.04425 48.04425 47.76274 47.76274 47.76274 47.21309 47.21309 47.21309 47.21309 47.21309 51.48682 51.48682 51.48682 51.48682 51.83658 51.83658 51.83658 51.83658 51.83658 54.92247 54.09792 54.09792 54.09792 54.09792 54.09792 54.09792 54.92247 54.92247 54.92247 54.92247 54.50529 54.50529 54.92247 54.50529 54.50529 54.09792 54.50529 55.34996 54.92247 54.92247 54.50529 54.50529 54.50529 54.50529 54.50529 54.50529 54.50529 54.50529 54.50529 54.50529 53.31086 53.31086 53.31086 53.31086 56.23796 55.78827 55.78827 56.23796 56.69965 56.69965 56.69965 57.17399 56.69965 56.69965 56.69965 56.69965 56.69965 56.69965 56.69965 57.17399 57.17399 57.17399 56.69965 56.69965 56.69965 56.69965 56.69965 56.69965 56.69965 59.76274 59.76274 59.76274 59.21309 59.21309 59.21309 58.68037 58.68037 58.68037 58.68037 54.09792 54.09792 54.09792 54.50529 54.50529 54.09792 54.09792 54.50529 54.92247 54.50529 54.50529 54.09792 53.69992 53.69992 53.69992 53.69992 53.69992 53.69992 53.69992 53.69992 53.69992 53.69992 53.69992 53.69992 53.69992 53.69992 53.69992 53.69992 53.69992 53.69992 53.69992 53.69992 50.47805 51.48682 51.83658 52.19354 52.55803 52.19354 52.55803 51.83658 51.83658 52.55803 52.93035 52.93035 52.93035 52.93035 52.93035 51.14399 51.14399 54.50529 53.31086 52.55803 52.19354 52.19354 52.19354 52.55803 52.93035 54.09792 54.50529 54.92247 55.78827 56.23796 56.23796 55.78827 55.34996 54.09792 54.09792 54.09792 51.48682 50.15444 50.15444 50.80782 51.14399 51.14399 51.14399 51.14399 52.19354 52.19354 51.83658 51.83658 51.83658 51.48682 51.48682 51.48682 51.83658 51.83658 51.48682 51.48682 51.48682 51.48682 51.48682 50.80782 50.80782 52.55803 51.48682 51.14399 50.80782 51.14399 51.48682 51.48682 51.48682 50.80782 50.80782 50.80782 48.91732 48.62138 48.33041 48.33041 48.33041 48.04425 48.91732 48.91732 48.91732 49.21841 49.21841 48.91732 48.62138 48.33041 48.33041 48.33041 49.83678 48.62138 48.62138 48.62138 48.62138 48.62138 48.91732 49.52484 49.52484 48.91732 48.62138 48.33041]; opt=pv2note('defaultOpt'); opt.method='simple'; opt.minPitchError=0.8; opt.minNoteDuration=0.1; showPlot=1; note=pv2note(pitchVector, opt, showPlot); %fprintf('Hit return to hear the pitch vector...'); pause; fprintf('\n'); pvPlay(pitchVector, opt.frameRate); fs=8000; fprintf('Saving pitchVector.wav...\n'); wavwrite(pv2wave(pitchVector, opt.frameRate), fs, 8, 'pitchVector.wav'); %fprintf('Hit return to hear the segmented note...'); pause; fprintf('\n'); notePlay(note, 1); fprintf('Saving noteFromPv.wav...\n'); wavwrite(note2wave(note, 1, fs), fs, 8, 'noteFromPv.wav');Saving pitchVector.wav... [Warning: WAVWRITE will be removed in a future release. Use AUDIOWRITE instead.] [> In <a href="matlab:matlab.internal.language.introspective.errorDocCallback('wavwrite', 'E:\MATLAB\R2015a\toolbox\matlab\audiovideo\wavwrite.m', 48)" style="font-weight:bold">wavwrite</a> (<a href="matlab: opentoline('E:\MATLAB\R2015a\toolbox\matlab\audiovideo\wavwrite.m',48,0)">line 48</a>) In <a href="matlab:matlab.internal.language.introspective.errorDocCallback('noteSegment01', 'D:\users\jang\books\audioSignalProcessing\example\noteSegment01.m', 10)" style="font-weight:bold">noteSegment01</a> (<a href="matlab: opentoline('D:\users\jang\books\audioSignalProcessing\example\noteSegment01.m',10,0)">line 10</a>) In <a href="matlab:matlab.internal.language.introspective.errorDocCallback('goWriteOutputFile>dummyFunction', 'D:\users\jang\books\goWriteOutputFile.m', 85)" style="font-weight:bold">goWriteOutputFile>dummyFunction</a> (<a href="matlab: opentoline('D:\users\jang\books\goWriteOutputFile.m',85,0)">line 85</a>) In <a href="matlab:matlab.internal.language.introspective.errorDocCallback('goWriteOutputFile', 'D:\users\jang\books\goWriteOutputFile.m', 55)" style="font-weight:bold">goWriteOutputFile</a> (<a href="matlab: opentoline('D:\users\jang\books\goWriteOutputFile.m',55,0)">line 55</a>)] Saving noteFromPv.wav... [Warning: WAVWRITE will be removed in a future release. Use AUDIOWRITE instead.] [> In <a href="matlab:matlab.internal.language.introspective.errorDocCallback('wavwrite', 'E:\MATLAB\R2015a\toolbox\matlab\audiovideo\wavwrite.m', 48)" style="font-weight:bold">wavwrite</a> (<a href="matlab: opentoline('E:\MATLAB\R2015a\toolbox\matlab\audiovideo\wavwrite.m',48,0)">line 48</a>) In <a href="matlab:matlab.internal.language.introspective.errorDocCallback('noteSegment01', 'D:\users\jang\books\audioSignalProcessing\example\noteSegment01.m', 13)" style="font-weight:bold">noteSegment01</a> (<a href="matlab: opentoline('D:\users\jang\books\audioSignalProcessing\example\noteSegment01.m',13,0)">line 13</a>) In <a href="matlab:matlab.internal.language.introspective.errorDocCallback('goWriteOutputFile>dummyFunction', 'D:\users\jang\books\goWriteOutputFile.m', 85)" style="font-weight:bold">goWriteOutputFile>dummyFunction</a> (<a href="matlab: opentoline('D:\users\jang\books\goWriteOutputFile.m',85,0)">line 85</a>) In <a href="matlab:matlab.internal.language.introspective.errorDocCallback('goWriteOutputFile', 'D:\users\jang\books\goWriteOutputFile.m', 55)" style="font-weight:bold">goWriteOutputFile</a> (<a href="matlab: opentoline('D:\users\jang\books\goWriteOutputFile.m',55,0)">line 55</a>)]

上述方法是最簡單的方法,還有很多改進的空間,可能的改進的方向如下:
  1. 若遇到休止(音量很低之處),也要切出一個音符。
  2. 若遇到氣音,也要切出一個音符。
  3. 若歌唱者的音很準,可將每個音符的音高四捨五入到整數值。(或在上下平移後,找出一個平移量,讓每個音符在四捨五入後,和原音符的誤差總和為最小。)
  4. 可用 DP 的方式,來找出最佳的音符切割方式,使切出的音符和原來的音高向量有最小的誤差總和。

一旦切出音符後,就要進行比對,有幾種方式可以進行。

  1. 第一種切音符的比對方法,是只考慮音符的升降,而不考慮絕對數值及時間資訊。例如,如果切出的音符音高是 [69, 60, 58, 62, 62, 67, 69],那麼對應的比對格式是 [D, D, U, S, U, U],這些字母代表相鄰音符的關係,例如 D 是 Down,S 是 Same,U 是 Up。一旦將輸入音高向量轉成這種字串向量後,我們就可以使用「最長共同子字串」(longest common subsequence, LCS)或「編輯距離」(edit distance, ED)來進行比對。
  2. 第二種切音符的比對方法,則是指考慮音符的音高資訊,而不考慮音長資訊。例如,如果切出的音符音高是 [69, 60, 58, 62, 62, 67, 69],那麼我們可以直接使用類似 DTW 的方式來進行比對,只是在比對前還必須經過音高校正(key transposition)。
  3. 第三種切音符的比對方法,則是同時考慮音符的音高和音長資訊,此時我們還是可以使用類似 DTW 的方式來進行比對,只是要將音高和音長分別考慮來求取距離,同時也必須考慮音高校正和音長校正。

Audio Signal Processing and Recognition (音訊處理與辨識)