17-2 HTK Example: Digit Recognition

For implementing CHMMs, we mostly use HTK (Hidden Markov Model Toolkit) for corpus preparation and training. Below we use a simple example to illustrate how HTK is used. In this example we perform Mandarin digit recognition (0 to 9), which includes training acoustic models with HTK and computing the resulting recognition rate.

This page uses several specific file extensions (mlf, scp, template, cfg, pam, init, hmm, net, grammar, etc.). To display these files correctly in a browser, remove the application associations for these extensions so that the file contents are shown directly in the browser. (If the associations are not removed, the browser pops up a dialog asking whether to download the file.) You can use the following batch file to remove the associations between these extensions and their applications:

Source file (htk/delAssociation.bat):
assoc .mlf=
assoc .scp=
assoc .template=
assoc .cfg=
assoc .pam=
assoc .init=
assoc .hmm=
assoc .net=
assoc .grammar=

Hint
The assoc command sets the application associated with a file extension. For example, to remove the association for the scp extension, enter "assoc .scp=" in a DOS command window.

Next, you must download the following files:

Open a DOS window, change to the chineseDigitRecog/training directory, and run goSyl13.bat to perform the training and compute the recognition rate. Alternatively, change to this directory in MATLAB and run goSyl13.m, which gives the same result.

Before doing anything else, we need to prepare two files by hand. The first is digitSyl.pam, which specifies how the phonetic spelling (phonetic alphabets) of each digit is decomposed into acoustic models. Our current approach treats each syllable as one acoustic model, as follows:

Source file (htk/chineseDigitRecog/training/digitSyl.pam):
ba	ba
er	er
jiou	jiou
ling	ling
liou	liou
qi	qi
san	san
si	si
sil	sil
wu	wu
i	i

The decomposition above is rather coarse, so the resulting recognition performance is also poorer. Finer-grained approaches, such as using monophones or biphones as acoustic models, will be explained later.

Hint
pam stands for "phonetic alphabets to model".
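To make the structure of digitSyl.pam concrete, here is a minimal Python sketch that reads the listing above into a mapping from each unit to its list of acoustic models. The function name `parse_pam` and the inline `PAM_TEXT` string are illustrative assumptions, not part of HTK or of goSyl13; the sketch only assumes that each line holds a unit followed by whitespace-separated model names.

```python
def parse_pam(text):
    """Parse dictionary text: each non-empty line is 'unit model1 model2 ...'."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        unit, models = fields[0], fields[1:]
        mapping[unit] = models
    return mapping


# Contents of digitSyl.pam as listed above (tab-separated columns).
PAM_TEXT = (
    "ba\tba\ner\ter\njiou\tjiou\nling\tling\nliou\tliou\nqi\tqi\n"
    "san\tsan\nsi\tsi\nsil\tsil\nwu\twu\ni\ti\n"
)

pam = parse_pam(PAM_TEXT)
# In this syllable-level setup every unit maps to exactly one model: itself.
is_identity = all(models == [unit] for unit, models in pam.items())
print(len(pam), is_identity)
```

With a finer decomposition (for example monophones), the right-hand side of each line would list several models instead of one, and `is_identity` would no longer hold.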

The second file is digitSyl.mlf, which records the transcription (in units of syllables) of each audio file, as follows:

Example (htk/chineseDigitRecog/training/digitSyl.mlf):

In the file above, sil denotes silence. Placing sil both before and after ling therefore indicates that the pronunciation of "0" is preceded and followed by silence.

Hint
  • mlf stands for "master label file". It records the transcription of each speech file, in units of either syllables or phonemes.
  • Because HTK uses file names to look up the transcription, the file name of every audio feature file must be unique. (The names of the audio files themselves need not be unique.)
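As a rough illustration of the master label file layout, the following Python sketch builds MLF text in the standard HTK form: a `#!MLF!#` header, then for each utterance a quoted file-name pattern, one label per line, and a terminating period. The utterance names `spk01_0` and `spk01_1` are hypothetical, and the `"*/name.lab"` pattern style is an assumption about how digitSyl.mlf names its entries.

```python
def build_mlf(transcriptions):
    """Return MLF text for a dict {base_name: [label, ...]}."""
    lines = ["#!MLF!#"]  # mandatory first line of a master label file
    for base_name, labels in transcriptions.items():
        lines.append('"*/%s.lab"' % base_name)  # pattern matched against the file name
        lines.extend(labels)                    # one label per line
        lines.append(".")                       # a lone period ends each entry
    return "\n".join(lines) + "\n"


# Hypothetical utterances of "0" and "1", each surrounded by silence.
example = {
    "spk01_0": ["sil", "ling", "sil"],
    "spk01_1": ["sil", "i", "sil"],
}
mlf_text = build_mlf(example)
print(mlf_text)
```

Because entries are keyed by base name, two feature files with the same base name would collide in the dictionary, which mirrors the uniqueness requirement stated in the hint.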

The following explains the MATLAB script goSyl13.m and the Batch script goSyl13.bat, which cover three major tasks:

  1. Speech feature extraction: computing MFCC
  2. Acoustic model training: using EM to find the optimal parameters
  3. Acoustic model evaluation: computing the recognition rate
Each task is explained below.

  1. Speech feature extraction: computing MFCC

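    1. Create the output directories
      We first create three directories for the output files:
      • output: holds the various output files.
      • output\feature: holds the speech feature files.
      • output\hmm: holds the HMM parameter files produced during training.
      The MATLAB commands are as follows: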
      mkdir('output');
      mkdir('output/feature');
      mkdir('output/hmm');
      
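      The Batch command is as follows:
      for %%i in (output output\feature output\hmm) do mkdir %%i > nul 2>&1
      
      Because its output is redirected to nul, the Batch command prints no warning message if a directory already exists.

      Hint
      The Batch command above is taken from goSyl13.bat. If you copy it into a DOS window to run it directly, be sure to change %%i to %i. The same applies to the commands below.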

    2. Generate digitSyl.mnl and digitSylPhone.mlf
      First create the file syl2phone.scp. The MATLAB commands are as follows:
      fid=fopen('output\syl2phone.scp', 'w');
      fprintf(fid, 'EX');
      fclose(fid);
      
      The Batch command is as follows:
      @echo EX > output\syl2phone.scp
      
      The resulting syl2phone.scp contains the following:

      Source file (htk/chineseDigitRecog/training/output/syl2phone.scp):
      EX 
      

      Here "EX" stands for Expand, meaning "expand a syllable string into a phone string". It is used by the HTK command HLEd, as described below.

      Next, we can use the following HTK HLEd command to generate digitSyl.mnl and digitSylPhone.mlf:

      HLEd -n output\digitSyl.mnl -d digitSyl.pam -l * -i output\digitSylPhone.mlf output\syl2phone.scp digitSyl.mlf
      
      In the command above, digitSyl.pam, output\syl2phone.scp, and digitSyl.mlf are input files, while output\digitSyl.mnl and output\digitSylPhone.mlf are output files. The output file digitSyl.mnl lists all the acoustic models that are used, as follows:

      Source file (htk/chineseDigitRecog/training/output/digitSyl.mnl):
      sil
      ling
      i
      er
      san
      si
      wu
      liou
      qi
      ba
      jiou
      

      Hint
      mnl stands for "model name list".

      The file digitSylPhone.mlf is the result of converting the syllable-level information in digitSyl.mlf into phone-level information for use in acoustic training. The file is as follows:

      Example (htk/chineseDigitRecog/training/output/digitSylPhone.mlf):

      Hint
      Because we use one syllable as one acoustic model, the content of digitSylPhone.mlf happens to be identical to that of the original digitSyl.mlf.
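The effect of the expansion step can be mimicked in a few lines of Python. This is a simplified illustration of the idea (replace each label by its dictionary entry, then collect the distinct model names), not a reimplementation of HLEd. The transcriptions are hypothetical, and listing the model names in first-seen order is an assumption about how the model name list is ordered.

```python
def expand_labels(labels, dictionary):
    """Replace each label by its dictionary expansion."""
    expanded = []
    for label in labels:
        expanded.extend(dictionary[label])
    return expanded


def model_name_list(label_sequences):
    """Collect distinct model names in first-seen order (ordering is an assumption)."""
    seen = []
    for labels in label_sequences:
        for name in labels:
            if name not in seen:
                seen.append(name)
    return seen


# Syllable-level dictionary from digitSyl.pam: every unit maps to itself.
units = ["ba", "er", "jiou", "ling", "liou", "qi", "san", "si", "sil", "wu", "i"]
dictionary = {u: [u] for u in units}

# Hypothetical transcriptions for the digits 0, 1, and 2.
syllable_mlf = [["sil", "ling", "sil"], ["sil", "i", "sil"], ["sil", "er", "sil"]]
phone_mlf = [expand_labels(seq, dictionary) for seq in syllable_mlf]

print(phone_mlf == syllable_mlf)   # an identity dictionary leaves the labels unchanged
print(model_name_list(phone_mlf))
```

With a dictionary that maps a syllable to several phones, `expand_labels` would lengthen each sequence, and the two label files would no longer coincide.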

    3. Generate wav2fea.scp
      Next we perform speech feature extraction, that is, we convert each wav file into MFCC. First we generate wav2fea.scp. The MATLAB commands are as follows:
      wavDir='..\waveFile';
      waveFiles=recursiveFileList(wavDir, 'wav');
      outFile='output\wav2fea.scp';
      fid=fopen(outFile, 'w');
      for i=1:length(waveFiles)
      	wavePath=strrep(waveFiles(i).path, '/', '\');
      	[a,b,c]=fileparts(wavePath);	% b is the file name without extension; recent MATLAB versions return at most three outputs
      	fprintf(fid, '%s\t%s\r\n', wavePath, ['output\feature\', b, '.fea']);
      end
      fclose(fid);
      
      The Batch command is as follows:
      (for /f "delims=" %%i in ('dir/s/b wave\*.wav') do @echo %%i output\feature\%%~ni.fea)> output\wav2fea.scp
      
      The generated wav2fea.scp specifies the correspondence between each wav file and the audio feature file to be produced from it. Its content is as follows:

      Example (htk/chineseDigitRecog/training/output/wav2fea.scp):

      As the file above shows, all generated speech feature files use the extension fea and are placed in the output\feature directory.
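The same mapping can be sketched in Python. Because every feature file lands in a single directory, two wav files with the same base name would overwrite each other, so this sketch also checks for duplicates. The function name `make_scp_lines` and the example wav paths are hypothetical; a real list would come from scanning the wave directory.

```python
import ntpath  # Windows path rules, usable on any platform
from collections import Counter


def make_scp_lines(wav_paths, feature_dir="output\\feature"):
    """Pair each wav path with its feature file path, separated by a tab."""
    stems = [ntpath.splitext(ntpath.basename(p))[0] for p in wav_paths]
    duplicates = sorted(s for s, n in Counter(stems).items() if n > 1)
    if duplicates:
        # All features share one directory, so base names must be unique.
        raise ValueError("duplicate base names: " + ", ".join(duplicates))
    return [
        "%s\t%s" % (p, ntpath.join(feature_dir, s + ".fea"))
        for p, s in zip(wav_paths, stems)
    ]


# Hypothetical wav paths for two speakers.
wavs = ["..\\waveFile\\spk01\\0a.wav", "..\\waveFile\\spk02\\0b.wav"]
for line in make_scp_lines(wavs):
    print(line)
```

Failing early on duplicate base names is a design choice of this sketch. The MATLAB and Batch versions above perform no such check.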

    4. Extract speech features with HCopy.exe
      Next, we can use the HTK command HCopy to generate the MFCC files:
      HCopy -C mfcc.cfg -S output\wav2fea.scp
      
      In the command above, mfcc.cfg is a configuration file that specifies the parameters used to compute the MFCCs.