22-3 Speech Recognition

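Getting computers to understand human conversation has long been a dream of humankind. In recent years, as computers have become faster, applications of speech recognition have become increasingly common. Voice applications on smartphones (such as the Siri voice assistant on Apple phones or speech-to-text on Android phones) and smart speakers (such as Amazon Alexa and Google Home) are practical uses of speech recognition that are now part of people's daily lives.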

Applications of speech recognition can be classified in different ways. The first way is to classify by the users of the speech recognition system:

The second way is to classify by the function of the speech recognition system. Ordered by difficulty, the categories are as follows:

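Before building a speech recognition system, we must first cut the speech signal into frames and then extract timbre-related features from each frame. The most commonly used feature in speech recognition is MFCC (mel-frequency cepstral coefficients); each frame usually yields an MFCC vector of 13, 26, or 39 dimensions. Explanations and code for computing MFCC can readily be found online.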
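The first step above, cutting a signal into frames, can be sketched as follows. This is only an illustration: the function name `split_into_frames`, the frame size, and the hop size are assumptions made for this example, and the MFCC computation itself is not shown.

```python
def split_into_frames(signal, frame_size, hop_size):
    """Cut a 1-D signal into overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    frames = []
    start = 0
    while start + frame_size <= len(signal):
        frames.append(signal[start:start + frame_size])
        start += hop_size
    return frames


# Toy signal of 10 samples; frames of 4 samples that advance by 2 samples.
signal = list(range(10))
frames = split_into_frames(signal, frame_size=4, hop_size=2)
print(len(frames))
print(frames[0], frames[-1])
```

In a real system each frame would then be passed to a feature extractor that returns one MFCC vector per frame.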

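According to the classification above, the simplest speech recognition system is a "speaker-dependent voice command recognition system", which usually amounts to "matching your own voice against your own voice". On early mobile phones (such as the Sony Ericsson T18), for example, you could pre-record several voice tags, each mapped to a phone number; "ramen" might map to the number of a ramen shop. When you say "ramen" to the phone, the system compares your input against the pre-recorded voice tags, and if the match succeeds, the phone dials the ramen shop automatically. (To a computer, however, human speech varies enormously. If Takeshi Kaneshiro's younger sister says "ramen" to Takeshi Kaneshiro's phone, it will not necessarily work, because the recordings used for comparison are of his voice, not hers.)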

Hint
T18 ramen commercial (using speaker-dependent voice command recognition), 2000: https://www.youtube.com/watch?v=M4hFuUYBGf0

The most basic way to build a "speaker-dependent voice command recognition" system is to compare utterances with dynamic time warping (DTW), a method based on dynamic programming (DP). It compares utterances by their timbre while locally stretching or compressing them to account for differences in speaking rate, so as to achieve the best alignment, as illustrated below.


Figure: The optimal path obtained by DTW alignment.

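In this example, the utterances on the X axis and the Y axis both say "清華大學" (Tsing Hua University), but the one on the Y axis is spoken at a steady pace, while the one on the X axis starts fast and ends slow. DTW finds the best alignment between the two and hence the shortest distance between the two utterances. Therefore, to build a "speaker-dependent voice command recognition system", we only need to ask the user to pre-record a set of voice commands (each command may be recorded several times, for example three times). When the user speaks a test utterance, we perform endpoint detection and compute its MFCC, and then compare this MFCC sequence against the MFCC of each pre-recorded command using DTW. The command with the shortest distance is the answer we are looking for.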
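The matching procedure just described can be sketched roughly as below. This is a minimal version under stated assumptions: the local path constraint (one step horizontally, vertically, or diagonally), the Euclidean frame distance, and the toy one-dimensional "features" are choices made for this example; a real system would compare sequences of MFCC vectors after endpoint detection. The names `dtw_distance` and `recognize` are hypothetical.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Accumulated distance along the best monotonic alignment of two sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best accumulated distance aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(test_seq, templates):
    """Return the label of the pre-recorded template closest to test_seq."""
    best_label, _ = min(templates, key=lambda item: dtw_distance(test_seq, item[1]))
    return best_label


# Made-up 1-D feature sequences standing in for MFCC sequences.
templates = [
    ("ramen", [[1.0], [2.0], [3.0], [2.0]]),
    ("pizza", [[5.0], [5.0], [1.0], [1.0]]),
]
# The same contour as "ramen", spoken more slowly.
test_utterance = [[1.0], [1.0], [2.0], [3.0], [3.0], [2.0]]
print(recognize(test_utterance, templates))
```

Because DTW may repeat frames of either sequence, the slower test utterance still aligns with the "ramen" template at zero cost.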

The most complex speech recognition system is a "speaker-independent dialogue system", such as the Siri voice assistant on Apple phones, Amazon's Alexa smart speaker, and the Google Home voice assistant. These systems act like virtual assistants: they can hold simple conversations with people and, by understanding the user's intent, help with simple tasks such as booking tickets or looking up the weather or movies.

Building such a system requires more sophisticated acoustic models. The speech features are still MFCC, but we use different acoustic models to represent different timbres (consonants, vowels, and so on), and use each acoustic model to compute the probability density of a given MFCC vector. For example, we can collect the vowel "ㄚ" spoken by 100 people, cut the recordings into frames, extract a 39-dimensional MFCC vector from each frame, and then use a high-dimensional probability density function (PDF) to build an acoustic model of these MFCC vectors. The most common way to build this model is maximum likelihood estimation (MLE). The most commonly used PDF is the GMM (Gaussian mixture model), which is a weighted average of a set of Gaussian PDFs. Using MLE, we can compute the optimal GMM parameters from a given set of MFCC vectors, including the mean vector and covariance matrix of each Gaussian PDF and the weighting factors of these functions.
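To make the notion of a GMM concrete, here is a small sketch in one dimension. It is an illustration only: the text works with 39-dimensional MFCC vectors, mean vectors, and covariance matrices, whereas this example uses scalars and variances, and all parameter values are made up. The closed-form MLE shown is for a single Gaussian; fitting a full GMM by MLE usually needs an iterative procedure such as the EM algorithm, which is not shown.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def gmm_pdf(x, weights, means, variances):
    """Weighted sum of Gaussian densities; the weights are assumed to sum to 1."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


def fit_gaussian_mle(data):
    """Closed-form MLE of a single Gaussian: sample mean and (biased) sample variance."""
    mean = sum(data) / len(data)
    var = sum((x - mean) ** 2 for x in data) / len(data)
    return mean, var


# MLE of one Gaussian from three made-up data points.
mean, var = fit_gaussian_mle([1.0, 2.0, 3.0])
print(mean, var)

# Density of a two-component GMM with made-up parameters, evaluated at x = 0.5.
density = gmm_pdf(0.5, weights=[0.3, 0.7], means=[0.0, 2.0], variances=[1.0, 0.5])
print(density)
```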

The following are typical examples of modeling 1-D data with a Gaussian PDF and a GMM PDF:


Figure: An example of a 1-D Gaussian PDF.


Figure: An example of a 1-D GMM PDF, formed as a weighted average of three Gaussian PDFs.

The following are typical examples of modeling 2-D data with a Gaussian PDF and a GMM PDF:


Figure: An example of a 2-D Gaussian PDF.


Figure: An example of a 2-D GMM PDF, formed as a weighted average of four Gaussian PDFs.

Using MLE, we can model 39-D MFCC vectors in the same way; the only difference is that the result can no longer be inspected easily with a simple surface plot or contour plot.

Hint
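The value computed by a PDF is a probability density. In practice we often need to multiply many probability densities together, so the product becomes smaller and smaller and the computer's floating-point arithmetic becomes prone to error. To avoid this problem, we usually take the logarithm of the probability density and replace the product with a sum, which reduces the numerical error. For this reason, we usually refer to the log probability density simply as the likelihood.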
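The numerical problem described in this hint is easy to reproduce. The sketch below multiplies 1,000 identical density values and compares the result with the sum of their logarithms; the value 1e-5 and the count 1,000 are arbitrary choices for illustration.

```python
import math

# 1,000 made-up probability density values, one per frame.
densities = [1e-5] * 1000

# Direct product: underflows to zero in double-precision floating point.
product = 1.0
for p in densities:
    product *= p

# Sum of logarithms: stays well within the representable range.
log_sum = sum(math.log(p) for p in densities)

print(product)
print(log_sum)
```

The product can no longer distinguish between competing models once it reaches zero, while the sum of logs still can.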

Building acoustic models with GMM is still a fairly basic approach. If we take into account that pronunciation changes over time, using a single PDF as the acoustic model is not reasonable. For example, while pronouncing the vowel "ㄞ", the shape of the mouth changes continuously, essentially moving from "ㄚ" to "ㄧ". To build a more accurate acoustic model we can therefore switch to the HMM (hidden Markov model), a probability density function for describing sequences. Each HMM consists of several states; each state is a static PDF, and the transitions between states are described by transition probabilities. The following is a schematic of an HMM with three states:


Figure: Schematic of an HMM with three states.

For example, if we use an HMM as the acoustic model of "ㄞ", we can use 3 states, each represented by a GMM, with the transitions between states represented by a 3x3 transition probability matrix. The parameters of this acoustic model (the parameters of the three GMMs plus the transition probability matrix) are also computed by MLE. However, since we do not know in advance which state the MFCC vector of each frame belongs to, in practice the assignment must be carried out iteratively until the likelihood is maximized. This method is called segmental k-means, and its steps are as follows:
  1. For each utterance, use DP to assign the utterance's MFCC vectors to the states.
  2. For each state, compute the optimal parameters of its GMM from all the MFCC vectors assigned to it.
  3. Compute the transition probability matrix from the state to which each frame has been assigned.
  4. Go back to step 1 and repeat until all parameters converge.
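The four steps above can be sketched in a heavily simplified form. The assumptions here are loud ones: features are scalars rather than MFCC vectors, each state is a single Gaussian rather than a GMM, the HMM is left-to-right (a frame either stays in its state or moves to the next one), the first pass uses an equal-length segmentation to get started, and variances and transition probabilities are floored to avoid division by zero and log of zero. All function names and data are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def align(frames, means, variances, trans):
    """Step 1: DP assignment of frames to left-to-right states."""
    n, k = len(frames), len(means)
    log_t = [[math.log(max(p, 1e-10)) for p in row] for row in trans]
    score = [[float("-inf")] * k for _ in range(n)]
    back = [[0] * k for _ in range(n)]
    score[0][0] = log_gauss(frames[0], means[0], variances[0])  # start in state 0
    for t in range(1, n):
        for s in range(k):
            best_prev, best = s, score[t - 1][s] + log_t[s][s]
            if s > 0 and score[t - 1][s - 1] + log_t[s - 1][s] > best:
                best_prev, best = s - 1, score[t - 1][s - 1] + log_t[s - 1][s]
            score[t][s] = best + log_gauss(frames[t], means[s], variances[s])
            back[t][s] = best_prev
    path = [k - 1]  # the path must end in the last state
    for t in range(n - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]


def estimate_states(utterances, paths, k):
    """Step 2: MLE mean and variance of the frames assigned to each state."""
    means, variances = [], []
    for s in range(k):
        xs = [x for u, p in zip(utterances, paths) for x, st in zip(u, p) if st == s]
        mean = sum(xs) / len(xs)
        means.append(mean)
        variances.append(max(sum((x - mean) ** 2 for x in xs) / len(xs), 1e-3))
    return means, variances


def estimate_transitions(paths, k):
    """Step 3: transition probabilities from counts of consecutive state pairs."""
    counts = [[0.0] * k for _ in range(k)]
    for path in paths:
        for a, b in zip(path, path[1:]):
            counts[a][b] += 1.0
    return [[c / sum(row) if sum(row) > 0 else 0.0 for c in row] for row in counts]


def segmental_kmeans(utterances, k, max_iterations=20):
    """Alternate parameter estimation and re-alignment until the paths stop changing."""
    # Bootstrap: split every utterance into k roughly equal segments.
    paths = [[t * k // len(u) for t in range(len(u))] for u in utterances]
    for _ in range(max_iterations):
        means, variances = estimate_states(utterances, paths, k)
        trans = estimate_transitions(paths, k)
        new_paths = [align(u, means, variances, trans) for u in utterances]
        if new_paths == paths:  # step 4: converged
            break
        paths = new_paths
    return means, variances, trans, paths


# Two made-up utterances whose values move through three levels (near 0, 5, 10).
utterances = [
    [0.1, 0.0, -0.1, 5.0, 5.1, 4.9, 10.0, 10.1, 9.9, 10.0],
    [0.0, 0.1, 5.0, 4.9, 5.1, 5.0, 10.0, 9.9],
]
means, variances, trans, paths = segmental_kmeans(utterances, k=3)
print(paths)
print([round(m, 2) for m in means])
```

Each utterance needs at least as many frames as there are states, since every state must receive at least one frame.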

Representing an acoustic model with an HMM usually gives better results, because it can capture how a pronunciation changes over time.

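In a practical recognition system, we usually divide all pronunciations more finely into more basic units of pronunciation called phonemes. A phoneme is the smallest unit of sound in human speech that can distinguish one pronunciation from another. We therefore build acoustic models based on phonemes, rather than simply on the consonants and vowels of the Zhuyin (Bopomofo) symbols.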

For example, ignoring tones, the Zhuyin for "你好" is "ㄋㄧ-ㄏㄠ" and the Hanyu Pinyin is "ni-hao"; both convert to the same phoneme sequence, "n_i-h_a_u".

In addition, to capture different pronunciations more precisely, we further subdivide phonemes into various cases when building acoustic models. Take "平安" (ㄆㄧㄥ-ㄢ or ping-an) as an example:

As one would expect, acoustic models built from monophones are coarser, but they take up little space and need less training data and fewer computing resources. Acoustic models built from triphones are more refined, but they take up more space and need correspondingly more training data and computing resources.

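Therefore, given a sentence, we can first convert it to its phonetic spelling, then convert the spelling to a phoneme sequence, and finally convert the phoneme sequence into a concatenation of HMM acoustic models. Using the sentence "你好" as an illustration, the steps for building the biphone sequence (ignoring leading silence) are as follows: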

  1. Convert to phonetic spelling: 你好 ==> ㄋㄧ-ㄏㄠ or ni-hao
  2. Convert to phonemes: ㄋㄧ-ㄏㄠ or ni-hao ==> n_i-h_a_u
  3. Convert to biphones: n+i, i+h, h+a, a+u, u+sil
  4. Concatenate into an HMM model, as follows:


    Figure: Schematic of the HMM model corresponding to "你好".
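Step 3 of the list above, going from a phoneme string to biphones, can be sketched as below. The function name `to_biphones` is hypothetical, and the input format (phonemes joined by `_`, syllables joined by `-`) is taken from the "n_i-h_a_u" example; the conversions from characters to spelling and from spelling to phonemes are not shown.

```python
def to_biphones(phoneme_string):
    """Turn a string such as 'n_i-h_a_u' into biphones with trailing silence.

    Each phoneme is paired with the phoneme that follows it; the last phoneme
    is paired with 'sil'. Leading silence is ignored, as in the text.
    """
    phonemes = phoneme_string.replace("-", "_").split("_")
    following = phonemes[1:] + ["sil"]
    return [f"{current}+{nxt}" for current, nxt in zip(phonemes, following)]


biphones = to_biphones("n_i-h_a_u")
print(biphones)
```

Each biphone in the resulting list would then be looked up in the set of trained HMMs, and the models concatenated in order.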

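Given an utterance, we first compute the corresponding sequence of MFCC vectors and then feed it into this concatenated acoustic model, using Viterbi search (which is also a DP method) to obtain the maximum likelihood of the utterance with respect to the HMM. The process can be pictured as filling in a table: once the table is complete, we know which state each frame should be assigned to in order to maximize the likelihood, as illustrated below: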


Figure: Viterbi search assigns each frame to an HMM state so as to obtain the maximum likelihood.
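The table-filling view of Viterbi search can be illustrated with a short sketch. Several simplifications are assumed: the HMM is left-to-right, the path must start in the first state and end in the last, transition probabilities are ignored, and the per-frame log densities are supplied directly as a made-up table instead of being computed from MFCC vectors and GMMs. The name `viterbi_fill` is hypothetical.

```python
def viterbi_fill(log_emit):
    """Fill the Viterbi table for a left-to-right HMM.

    log_emit[t][s] is the log density of frame t under state s.
    Returns the best total log likelihood and the state assigned to each frame.
    """
    n, k = len(log_emit), len(log_emit[0])
    neg_inf = float("-inf")
    table = [[neg_inf] * k for _ in range(n)]
    back = [[0] * k for _ in range(n)]
    table[0][0] = log_emit[0][0]  # the path must start in the first state
    for t in range(1, n):
        for s in range(k):
            stay = table[t - 1][s]
            move = table[t - 1][s - 1] if s > 0 else neg_inf
            back[t][s] = s if stay >= move else s - 1
            table[t][s] = max(stay, move) + log_emit[t][s]
    # Trace back from the last state at the last frame.
    path = [k - 1]
    for t in range(n - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return table[n - 1][k - 1], path


# Made-up log densities for 4 frames and 2 states.
log_emit = [
    [-1.0, -5.0],
    [-1.0, -4.0],
    [-6.0, -1.0],
    [-7.0, -1.0],
]
best_score, best_path = viterbi_fill(log_emit)
print(best_score, best_path)
```

The table has one cell per (frame, state) pair, which is why its size is the number of frames times the number of states.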

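The likelihood obtained from Viterbi search can be thought of as the degree of match between the utterance and the sentence: the higher the likelihood, the more likely it is that the speech signal corresponds to that sentence. If there are n recognizable command sentences, we can compute n likelihoods, and the sentence with the highest likelihood is the most probable recognition result. This is the basic principle of "speaker-independent voice command recognition". In an actual implementation, the table can be very large. For example, ten seconds of speech yields about 1,000 frames, and a five-character sentence yields about 30 HMM states (assuming each character is represented by 6 HMM states on average), so 30,000 cells must be filled. If there are 10,000 recognizable command sentences (averaging five characters each), the whole computation needs to fill 300 million cells! In practice, various optimizations and simplifications are applied so that recognition can run in real time.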
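The cell counts in this paragraph follow from a few multiplications, spelled out below. The figure of 100 frames per second is not stated in the text; it is inferred from "ten seconds of speech yields about 1,000 frames".

```python
seconds = 10
frames_per_second = 100      # inferred: 1,000 frames / 10 seconds
chars_per_sentence = 5
states_per_char = 6          # average assumed in the text
num_sentences = 10_000

frames = seconds * frames_per_second
states = chars_per_sentence * states_per_char
cells_per_sentence = frames * states
total_cells = cells_per_sentence * num_sentences

print(frames, states, cells_per_sentence, total_cells)
```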

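If we go one step further to the more complex task of "dictation", we must consider which words people actually use when they speak, and how likely those words are to occur in sequence. The mathematical model used to compute these likelihoods is called a "language model", and it complements the acoustic models described earlier. Language models are typically n-gram models, where an n-gram is a sequence of n words; put simply, such a model computes the probability that a group of words occurs together in sequence. Taking English as an example, suppose an utterance is recognized as two possible sentences: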

These two sentences sound very similar, but with a language model we know that people are far less likely to say "wreck a nice beach" than "recognize speech". The computer should therefore choose the first sentence as the recognition result, which is also the correct answer we want. In practice, we usually also use trees or graphs to build more complex data structures (such as a word lattice) and perform various kinds of search and optimization on them, in the hope of returning the recognition result within a tolerable time (such as one second). This involves many details of data structures and algorithms, which we will not go into here. A typical word lattice is shown below (source: http://berlin.csie.ntnu.edu.tw/SpeechProject/Research/Transcription/Acoustic_Lookahead.htm):


Figure: A typical word lattice, to which a language model can be added to compute the recognition result.
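The role of the language model in choosing between "recognize speech" and "wreck a nice beach" can be sketched with a toy bigram model (an n-gram with n = 2). Everything numeric here is invented for illustration: the probabilities in `BIGRAM_PROB` are not estimated from any corpus, and the floor for unseen bigrams is an arbitrary choice. A real recognizer would also combine this score with the acoustic likelihood, which is not shown.

```python
import math

# Hypothetical bigram probabilities P(word | previous word).
# "<s>" marks the start of a sentence. The numbers are made up.
BIGRAM_PROB = {
    ("<s>", "recognize"): 0.01,
    ("recognize", "speech"): 0.2,
    ("<s>", "wreck"): 0.001,
    ("wreck", "a"): 0.1,
    ("a", "nice"): 0.01,
    ("nice", "beach"): 0.005,
}


def sentence_log_prob(words, bigram_prob, floor=1e-8):
    """Sum of log bigram probabilities; unseen bigrams get a small floor value."""
    log_prob = 0.0
    previous = "<s>"
    for word in words:
        log_prob += math.log(bigram_prob.get((previous, word), floor))
        previous = word
    return log_prob


candidates = ["recognize speech", "wreck a nice beach"]
scores = {c: sentence_log_prob(c.split(), BIGRAM_PROB) for c in candidates}
best = max(scores, key=scores.get)
print(best)
```

Log probabilities are summed rather than multiplying the raw probabilities, for the same numerical reason that log densities are used for the acoustic models.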

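The HMM-based methods described above have been in use for decades. The DNN (deep neural network) methods that have become popular more recently achieve better recognition results but require more computation. The basic idea is to use a DNN in place of the GMM (so the original GMM-HMM architecture is replaced by DNN-HMM) and to use GPUs for the large amount of optimization computation involved, which is what makes the better recognition results attainable.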


Audio Signal Processing and Recognition (音訊處理與辨識)