17-4 Digit Recognition: Changing Acoustic Models

In the previous sections, we used one Chinese syllable as one acoustic model. In this section, we split each syllable into phones and use each phone as an acoustic model. This decomposition is called monophone, to distinguish it from the right-context dependent (RCD) biphone. The syllable-to-phone mapping is recorded in digitMonophone.pam, shown below:

Source file (htk/chineseDigitRecog/training/digitMonophone.pam):
ba	b a
er	er
jiou	j i o u
ling	l i ng
liou	l i o u
qi	q i
san	s a n
si	s i
sil	sil
wu	w u
i	i
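
To make the mapping concrete, the following is a minimal sketch of how a syllable sequence can be expanded into monophones using the table above. The table entries are copied from digitMonophone.pam; the variable names and the sample utterance are illustrative assumptions, and this is not how the HTK scripts themselves perform the conversion.

```matlab
% Syllable-to-phone table, copied from digitMonophone.pam
% (first token: syllable, remaining tokens: its phones).
pamLines = {'ba b a', 'er er', 'jiou j i o u', 'ling l i ng', 'liou l i o u', ...
            'qi q i', 'san s a n', 'si s i', 'sil sil', 'wu w u', 'i i'};

syllables = cell(size(pamLines));
phones = cell(size(pamLines));
for k = 1:numel(pamLines)
    tok = strsplit(strtrim(pamLines{k}));   % split on whitespace
    syllables{k} = tok{1};                  % syllable name
    phones{k} = tok(2:end);                 % phones of this syllable
end

% Hypothetical utterance (an assumption, not taken from the corpus)
utterance = {'sil', 'san', 'ling', 'sil'};

% Replace each syllable by its phone sequence
phoneSeq = {};
for k = 1:numel(utterance)
    idx = find(strcmp(syllables, utterance{k}), 1);
    phoneSeq = [phoneSeq, phones{idx}];
end
disp(strjoin(phoneSeq, ' '));   % prints: sil s a n l i ng sil
```

Different syllables share phones (for example, "l", "i" and "s" each appear in more than one syllable), so the number of models to train is set by the number of distinct phones rather than the number of syllables.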

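Therefore, we only need to replace digitSyl.pam in the examples of the previous sections with digitMonophone.pam in order to run training and measure the recognition rates, as shown in the following example: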

Example 1: htk/chineseDigitRecog/training/goMonophone13.m
htkPrm=htkParamSet;
htkPrm.pamFile='digitMonophone.pam';
htkPrm.phoneMlfFile='digitMonophone.mlf';
htkPrm.mnlFile='digitMonophone.mnl';
disp(htkPrm)
[trainRR, testRR]=htkTrainTest(htkPrm);
fprintf('Inside test = %g%%, outside test = %g%%\n', trainRR, testRR);

Output:
         pamFile: 'digitMonophone.pam'
      feaCfgFile: 'mfcc.cfg'
         waveDir: '..\waveFile'
      sylMlfFile: 'digitSyl.mlf'
    phoneMlfFile: 'digitMonophone.mlf'
         mnlFile: 'digitMonophone.mnl'
     grammarFile: 'digit.grammar'
         feaType: 'MFCC_E'
          feaDim: 13
      mixtureNum: 3
        stateNum: 3
     streamWidth: 13
Pruning-Off
Pruning-Off
Pruning-Off
Pruning-Off
Pruning-Off
Inside test = 79.24%, outside test = 75.89%

The monophone list generated by this run is as follows:

Source file (htk/chineseDigitRecog/training/output/digitMonophone.mnl):
sil
l
i
ng
er
s
a
n
w
u
o
q
b
j
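
As a quick consistency check, the list above should contain exactly the distinct phones that appear on the right-hand side of digitMonophone.pam. The sketch below verifies this with both lists typed in by hand from the two files; the variable names are illustrative, and the comparison deliberately ignores ordering, since the order in the .mnl file differs from the order of the .pam entries.

```matlab
% All phones on the right-hand side of digitMonophone.pam, in file order
pamPhones = {'b','a', 'er', 'j','i','o','u', 'l','i','ng', 'l','i','o','u', ...
             'q','i', 's','a','n', 's','i', 'sil', 'w','u', 'i'};

% Entries of digitMonophone.mnl, in file order
mnl = {'sil','l','i','ng','er','s','a','n','w','u','o','q','b','j'};

distinctPhones = unique(pamPhones);        % sorted, duplicates removed
missing = setdiff(distinctPhones, mnl);    % in .pam but not in .mnl
extra = setdiff(mnl, distinctPhones);      % in .mnl but not in .pam

fprintf('%d distinct phones in .pam, %d entries in .mnl\n', ...
        numel(distinctPhones), numel(mnl));
fprintf('missing: %d, extra: %d\n', numel(missing), numel(extra));
```

Both counts come out to 14 with nothing missing or extra, so the 11 entries of the .pam file are covered by 14 monophone models.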

The mlf file corresponding to the monophones is as follows:

Example (htk/chineseDigitRecog/training/output/digitMonophone.mlf):

If we switch to 26-dimensional MFCC features, see the following example:

Example 2: htk/chineseDigitRecog/training/goMonoPhone26.m
htkPrm=htkParamSet;
htkPrm.pamFile='digitMonophone.pam';
htkPrm.phoneMlfFile='digitMonophone.mlf';
htkPrm.mnlFile='digitMonophone.mnl';
htkPrm.feaCfgFile='mfcc26.cfg';
htkPrm.feaType='MFCC_E_D_Z';
htkPrm.feaDim=26;
htkPrm.streamWidth=[26];
disp(htkPrm)
[trainRR, testRR]=htkTrainTest(htkPrm);
fprintf('Inside test = %g%%, outside test = %g%%\n', trainRR, testRR);

Output:
         pamFile: 'digitMonophone.pam'
      feaCfgFile: 'mfcc26.cfg'
         waveDir: '..\waveFile'
      sylMlfFile: 'digitSyl.mlf'
    phoneMlfFile: 'digitMonophone.mlf'
         mnlFile: 'digitMonophone.mnl'
     grammarFile: 'digit.grammar'
         feaType: 'MFCC_E_D_Z'
          feaDim: 26
      mixtureNum: 3
        stateNum: 3
     streamWidth: 26
Pruning-Off
Pruning-Off
Pruning-Off
Pruning-Off
Pruning-Off
Inside test = 83.71%, outside test = 87.5%

If we switch to 39-dimensional MFCC features, see the following example:

Example 3: htk/chineseDigitRecog/training/goMonoPhone39.m
htkPrm=htkParamSet;
htkPrm.pamFile='digitMonophone.pam';
htkPrm.phoneMlfFile='digitMonophone.mlf';
htkPrm.mnlFile='digitMonophone.mnl';
htkPrm.feaCfgFile='mfcc39.cfg';
htkPrm.feaType='MFCC_E_D_A_Z';
htkPrm.feaDim=39;
htkPrm.streamWidth=[39];
disp(htkPrm)
[trainRR, testRR]=htkTrainTest(htkPrm);
fprintf('Inside test = %g%%, outside test = %g%%\n', trainRR, testRR);

Output:
         pamFile: 'digitMonophone.pam'
      feaCfgFile: 'mfcc39.cfg'
         waveDir: '..\waveFile'
      sylMlfFile: 'digitSyl.mlf'
    phoneMlfFile: 'digitMonophone.mlf'
         mnlFile: 'digitMonophone.mnl'
     grammarFile: 'digit.grammar'
         feaType: 'MFCC_E_D_A_Z'
          feaDim: 39
      mixtureNum: 3
        stateNum: 3
     streamWidth: 39
Pruning-Off
Pruning-Off
Pruning-Off
Pruning-Off
Pruning-Off
Inside test = 84.6%, outside test = 89.29%
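
The feature dimensions 13, 26 and 39 used in the three examples follow from the HTK parameter-kind qualifiers in feaType. The sketch below reproduces that arithmetic under one assumption not stated in this section: that the .cfg files request 12 cepstral coefficients (HTK's default for NUMCEPS). In HTK, _E appends log energy, _D appends delta coefficients, _A appends acceleration coefficients, and _Z (cepstral mean subtraction) leaves the dimension unchanged. The variable names are illustrative.

```matlab
numCeps = 12;   % assumed number of cepstral coefficients (not given in this section)
feaTypes = {'MFCC_E', 'MFCC_E_D_Z', 'MFCC_E_D_A_Z'};
feaDims = zeros(1, numel(feaTypes));

for k = 1:numel(feaTypes)
    quals = strsplit(feaTypes{k}, '_');
    quals = quals(2:end);                            % drop the base kind 'MFCC'
    staticDim = numCeps + any(strcmp(quals, 'E'));   % _E adds one energy term
    % One block of static features, plus one block each for _D and _A
    numBlocks = 1 + any(strcmp(quals, 'D')) + any(strcmp(quals, 'A'));
    feaDims(k) = staticDim * numBlocks;              % _Z does not change the size
    fprintf('%-14s -> %d dimensions\n', feaTypes{k}, feaDims(k));
end
```

Under that assumption the computed values are 13, 26 and 39, matching the feaDim and streamWidth settings in Examples 1 to 3.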


Audio Signal Processing and Recognition