(Note: the Chinese version is not updated in sync with the English version!)
When analyzing audio, we usually rely on short-term analysis, because an audio signal is relatively stable over a short time span. We typically first cut the signal into frames, each about 20 ms long, and then analyze the signal within each frame. Within a given frame, we can observe three main acoustic features, which can also be illustrated graphically. They are as follows:
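- Volume: Represents the loudness of the sound and corresponds to the amplitude of the audio signal. The larger the amplitude, the greater the volume of the waveform. Volume is also referred to as energy or intensity.
- Pitch: Represents how high or low the sound is and corresponds to the fundamental frequency, which is the reciprocal of the fundamental period. The higher the fundamental frequency, the higher the pitch; conversely, the lower the fundamental frequency, the lower the pitch.
- Timbre: Represents the content of the sound (for example, the vowels of English) and corresponds to the shape of the waveform within one fundamental period. Different timbres represent different audio content; for example, different letters are pronounced differently because their timbres differ.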
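Taking the human voice as an example, the physical meaning of these speech features is as follows:
- Volume: Represents the strength of the compression from the lungs; the greater the force, the louder the volume.
- Pitch: Represents how fast the vocal cords vibrate; the faster they vibrate, the higher the pitch.
- Timbre: Represents the positions and shapes of the lips and tongue; different positions and shapes produce different speech content.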
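As a rough illustration of the first two features, the sketch below measures the volume and pitch of a single synthetic frame. The function names `frame_volume` and `frame_pitch`, the absolute-sum definition of volume, and the autocorrelation-based pitch estimate are all assumptions chosen for illustration; they are not taken from the text, and other definitions are equally possible.

```python
import math

def frame_volume(frame):
    """Volume as the sum of absolute amplitudes after removing the mean
    (an assumed definition; log energy in dB is another common choice)."""
    mean = sum(frame) / len(frame)
    return sum(abs(x - mean) for x in frame)

def frame_pitch(frame, fs, f_min=50.0, f_max=1000.0):
    """Estimate the fundamental frequency in Hz from the autocorrelation peak.

    The lag with the largest autocorrelation inside [fs/f_max, fs/f_min]
    is taken as the fundamental period in samples; the pitch is its reciprocal
    scaled by the sampling rate. The search range is an assumption.
    """
    n = len(frame)
    min_lag = int(fs / f_max)
    max_lag = min(int(fs / f_min), n - 1)
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        value = sum(frame[i] * frame[i + lag] for i in range(n - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return fs / best_lag

# Two synthetic 25 ms frames of a 200 Hz sine sampled at 16 kHz.
fs = 16000
n = 400
quiet = [0.1 * math.sin(2 * math.pi * 200 * i / fs) for i in range(n)]
loud = [0.8 * math.sin(2 * math.pi * 200 * i / fs) for i in range(n)]

print("volume (quiet, loud):", frame_volume(quiet), frame_volume(loud))
print("pitch of loud frame (Hz):", frame_pitch(loud, fs))
```

The two frames differ only in amplitude, so they should yield different volumes but the same pitch. Timbre is omitted here because it depends on the waveform shape within a period and has no single-number summary.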
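The extraction and analysis of these speech features are explained in detail in later chapters. Note in particular that these features all describe what the human ear perceives, and there is no definitive mathematical formula for them. When we try to quantify them, we rely on data and experience to approximate human perception as closely as possible, but the resulting numbers or formulas do not fully represent the characteristics of the sound.
The basic procedure for audio feature extraction is as follows: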
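- Cut the audio signal into frames, each about 20 to 30 ms long. If the frame is too large, it cannot capture how the signal changes over time; conversely, if the frame is too small, it cannot capture the characteristics of the signal itself. In general, a frame must be long enough to contain several fundamental periods. (In addition, the frame length in samples is usually a power of 2. If it is not, the frame must be zero-padded to a power of 2 before the Fourier transform so that the fast Fourier transform can be used.)
- If the change between neighboring frames should not be too large, frames may overlap; the overlap can range from 1/2 to 2/3 of the frame length. (The larger the overlap, the greater the computational cost.)
- Assuming the signal within a frame is stationary, extract features from the frame, such as zero-crossing rate, volume, pitch, MFCC parameters, and LPC parameters.
- Use zero-crossing rate, volume, pitch, and so on to perform endpoint detection, and keep the feature information within the endpoints for analysis or recognition.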
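The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the method of the text: all function names are hypothetical, volume is taken as an absolute sum, the zero-crossing rate is a plain count of sign changes, and endpoint detection is reduced to a single volume threshold (here 10% of the maximum), which is far simpler than a practical detector.

```python
import math

def split_into_frames(signal, frame_size, overlap):
    """Cut a signal into frames of frame_size samples that share `overlap` samples."""
    hop = frame_size - overlap
    if hop <= 0:
        raise ValueError("overlap must be smaller than frame_size")
    return [signal[start:start + frame_size]
            for start in range(0, len(signal) - frame_size + 1, hop)]

def next_power_of_two(n):
    """Smallest power of 2 that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p

def zero_pad(frame):
    """Append zeros so that the frame length becomes a power of 2."""
    return list(frame) + [0.0] * (next_power_of_two(len(frame)) - len(frame))

def volume(frame):
    """Sum of absolute amplitudes after removing the mean (assumed definition)."""
    mean = sum(frame) / len(frame)
    return sum(abs(x - mean) for x in frame)

def zero_crossing_rate(frame):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)

def detect_endpoints(volumes, threshold):
    """Indices of the first and last frames whose volume exceeds the threshold."""
    active = [i for i, v in enumerate(volumes) if v > threshold]
    return (active[0], active[-1]) if active else None

# Synthetic test signal: 0.25 s silence, 0.5 s of a 200 Hz tone, 0.25 s silence.
fs = 16000
frame_size, overlap = 400, 240          # 25 ms frames with 15 ms overlap
hop = frame_size - overlap
tone = [0.5 * math.sin(2 * math.pi * 200 * i / fs) for i in range(8000)]
signal = [0.0] * 4000 + tone + [0.0] * 4000

frames = split_into_frames(signal, frame_size, overlap)
volumes = [volume(f) for f in frames]
zcrs = [zero_crossing_rate(f) for f in frames]
endpoints = detect_endpoints(volumes, threshold=0.1 * max(volumes))
padded = zero_pad(frames[0])            # ready for a power-of-2 FFT

print("number of frames:", len(frames))
print("endpoint frame indices:", endpoints)
print("padded frame length:", len(padded))
```

Multiplying an endpoint frame index by `hop` gives the corresponding starting sample, which should fall close to the boundaries of the tone in the synthetic signal.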
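Several terms are commonly used in the analysis above:
- Frame size: The number of sample points in each frame.
- Frame overlap: The number of sample points shared by consecutive frames.
- Frame step (or hop size): The distance, in sample points, between the start of one frame and the start of the next. It equals the frame size minus the frame overlap.
- Frame rate: The number of frames per second. It equals the sampling rate divided by the frame step.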
For example, if the sampling rate is fs = 16000 and each frame spans 25 ms with an overlap of 15 ms, then
- Frame size = fs*25/1000 = 400 points.
- Frame overlap = fs*15/1000 = 240 points.
- Frame step (or hop size) = 400-240 = 160 points.
- Frame rate = fs/160 = 100 frames/sec.
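The same arithmetic can be wrapped in a small helper so that other sampling rates and frame durations are easy to try. The function name `frame_parameters` and its dictionary keys are hypothetical, introduced only for this sketch.

```python
def frame_parameters(fs, frame_ms, overlap_ms):
    """Convert frame and overlap durations in ms into sample counts and a frame rate."""
    frame_size = fs * frame_ms // 1000      # points per frame
    frame_overlap = fs * overlap_ms // 1000 # points shared by neighboring frames
    frame_step = frame_size - frame_overlap # hop size in points
    frame_rate = fs / frame_step            # frames per second
    return {
        "frame_size": frame_size,
        "frame_overlap": frame_overlap,
        "frame_step": frame_step,
        "frame_rate": frame_rate,
    }

params = frame_parameters(fs=16000, frame_ms=25, overlap_ms=15)
print(params)
```

With fs = 16000, 25 ms frames, and 15 ms overlap, the helper reproduces the values listed above: 400, 240, and 160 points, and 100 frames per second.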
Audio Signal Processing and Recognition (音訊處理與辨識)