17-5 Digit Recognition: Changing MFCC Dimensions and Gaussian Component Numbers

In this section, we vary both the number of Gaussian components and the dimension of the acoustic feature vectors, and observe how the recognition rates change.

In the next example, we use 13-dimensional MFCC and plot the recognition rates of inside and outside tests as functions of the number of Gaussians:

Example 1: htk/chineseDigitRecog/training/htkMixture01.m

    htkPrm=htkParamSet;
    maxMixNum=8;
    for i=1:maxMixNum
        htkPrm.mixtureNum=i;
        fprintf('====== %d/%d\n', i, maxMixNum);
        [trainRR(i), testRR(i)]=htkTrainTest(htkPrm);
    end
    plot(1:maxMixNum, trainRR, 'o-', 1:maxMixNum, testRR, 'o-');
    xlabel('No. of mixtures'); ylabel('Recog. rate (%)');
    legend('Inside test', 'Outside test');

Output:

    ====== 1/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 2/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 3/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 4/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 5/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 6/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 7/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 8/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off

In the above example, the training and test procedures are wrapped in a single m-file function, htkTrainTest.m. Each call to this function repeats feature extraction. If the feature type is fixed, we can save computation time by modifying the function to skip this step after the first call.
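The saving described above can be sketched as a simple cache. The snippet below is only an illustration under stated assumptions: it does not show the internals of htkTrainTest.m, the zero matrix stands in for real feature extraction, and the names cache, extractCount, and frameNum are hypothetical.

```matlab
% Hypothetical sketch: run feature extraction once per feature type and reuse it.
% Nothing here is taken from htkTrainTest.m; the extraction step is a placeholder.
cache = containers.Map();          % handle object; char keys, values of any type
extractCount = 0;                  % how many times extraction actually ran
feaType = 'MFCC_E_D_Z';            % feature type used for the 26-D case in this section
feaDim = 26;
frameNum = 100;                    % hypothetical number of frames
maxMixNum = 8;
for i = 1:maxMixNum
    if ~isKey(cache, feaType)
        % Placeholder for the real (slow) feature extraction
        cache(feaType) = zeros(feaDim, frameNum);
        extractCount = extractCount + 1;
    end
    feature = cache(feaType);      % reused for every mixture number
    % Training and test with i mixtures would use "feature" here.
end
fprintf('Feature extraction ran %d time(s) for %d mixture settings.\n', extractCount, maxMixNum);
```

Keying the cache by feature type means that a change of feature type triggers a fresh extraction, while repeated runs with the same type reuse the stored result.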

We can also repeat the experiment with MFCC of dimensions 13, 26, and 39, and plot the recognition rates of inside and outside tests as functions of the number of Gaussian components, as shown in the following example:

Example 2: htk/chineseDigitRecog/training/htkMixtureMfcc01.m

    % Get the RR when feature dim. and mixture no. are changing
    htkPrm=htkParamSet;
    maxMixNum=8;
    for i=1:maxMixNum
        htkPrm.mixtureNum=i;
        fprintf('====== %d/%d\n', i, maxMixNum);
        [trainRR(i,1), testRR(i,1)]=htkTrainTest(htkPrm);
    end
    htkPrm.feaCfgFile='mfcc26.cfg';
    htkPrm.feaType='MFCC_E_D_Z';
    htkPrm.feaDim=26;
    htkPrm.streamWidth=[26];
    for i=1:maxMixNum
        htkPrm.mixtureNum=i;
        fprintf('====== %d/%d\n', i, maxMixNum);
        [trainRR(i,2), testRR(i,2)]=htkTrainTest(htkPrm);
    end
    htkPrm.feaCfgFile='mfcc39.cfg';
    htkPrm.feaType='MFCC_E_D_A_Z';
    htkPrm.feaDim=39;
    htkPrm.streamWidth=[39];
    for i=1:maxMixNum
        htkPrm.mixtureNum=i;
        fprintf('====== %d/%d\n', i, maxMixNum);
        [trainRR(i,3), testRR(i,3)]=htkTrainTest(htkPrm);
    end
    plot( 1:maxMixNum, trainRR(:,1), '^-b', 1:maxMixNum, testRR(:,1), 'o-b', ...
          1:maxMixNum, trainRR(:,2), '^-g', 1:maxMixNum, testRR(:,2), 'o-g', ...
          1:maxMixNum, trainRR(:,3), '^-r', 1:maxMixNum, testRR(:,3), 'o-r');
    xlabel('No. of mixtures'); ylabel('Recog. rate (%)');
    legend('13D, Inside test', '13D, Outside Test', '26D, Inside Test', '26D, Outside test', '39D, Inside test', '39D, Outside test', 'Location', 'BestOutside');

Output:

    ====== 1/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 2/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 3/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 4/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 5/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 6/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 7/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 8/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 1/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 2/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 3/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 4/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 5/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 6/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 7/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 8/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 1/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 2/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 3/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 4/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 5/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 6/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 7/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off
    ====== 8/8
    Pruning-Off Pruning-Off Pruning-Off Pruning-Off Pruning-Off

The above example takes longer to run since it involves 24 rounds of training and test (3 feature dimensions × 8 numbers of Gaussians = 24 cases).
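The count of 24 cases can be checked with a dry run over the same grid of settings. This is a minimal sketch, not part of the original scripts: the call to htkTrainTest is left as a comment because it requires HTK and the corpus, and the variable names feaDim, feaType, and caseCount are assumptions made for illustration. The result matrices use the same layout as in Example 2 (rows for mixture numbers, columns for feature dimensions).

```matlab
% Dry run over the settings of Example 2: count the cases without any training.
feaDim = [13 26 39];
% Feature types for 26-D and 39-D are taken from Example 2; the 13-D case
% keeps whatever htkParamSet returns by default, so it is only labeled here.
feaType = {'(htkParamSet default)', 'MFCC_E_D_Z', 'MFCC_E_D_A_Z'};
maxMixNum = 8;
trainRR = nan(maxMixNum, numel(feaDim));   % preallocated, one column per dimension
testRR = nan(maxMixNum, numel(feaDim));
caseCount = 0;
for j = 1:numel(feaDim)
    for i = 1:maxMixNum
        caseCount = caseCount + 1;
        fprintf('Case %2d: %dD (%s), %d mixture(s)\n', caseCount, feaDim(j), feaType{j}, i);
        % [trainRR(i,j), testRR(i,j)] = htkTrainTest(htkPrm);  % real call, not run here
    end
end
fprintf('Total cases: %d\n', caseCount);
```

Since every case is a full round of training and test, the total running time grows in proportion to the number of cases listed by this loop.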
Audio Signal Processing and Recognition