(Please note: the Chinese version has not been updated in sync with the English version.)
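As described in the previous chapter, working out pitch by visual inspection is not too difficult, but having a computer compute pitch automatically requires further automated analysis of the audio signal. The process of using a computer to extract pitch from an entire audio recording is usually called "pitch tracking". The extracted pitch information has the following applications:
- Melody recognition: also called query by singing/humming, that is, finding the matching song in a music database from a user's singing or humming.
- Tone recognition for Mandarin Chinese: identifying the tone (first, second, third, fourth tone, and so on) of each character in a spoken sentence.
- Pitch analysis within prosody analysis for speech synthesis: how to use the most natural pitch contour when synthesizing speech.
- Intonation assessment in speech scoring: how to evaluate whether the pitch contour of a user's speech is standard.
- Speech recognition: the pitch of an utterance can be used to improve speech recognition accuracy.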
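In short, pitch tracking is one of the most basic and most important steps in audio signal processing, and related research has been under way for decades. We therefore need to understand its principles thoroughly before moving on to other related analysis and processing.
The basic procedure of pitch tracking is as follows:
- Cut the entire audio signal into frames; neighboring frames may overlap.
- Compute the pitch of each frame.
- Eliminate unstable pitch values. (These can be screened by volume or filtered by the range of the pitch value.)
- Smooth the entire pitch contour, usually with a median filter.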
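The four steps above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the method developed in later sections: all function names are hypothetical, the per-frame estimator is a bare-bones autocorrelation peak picker standing in for whichever method is eventually chosen, volume is assumed to be the sum of absolute sample values, and unreliable frames are marked with a pitch of 0.

```python
from statistics import median


def split_into_frames(signal, frame_size, overlap):
    """Step 1: cut the signal into frames sharing `overlap` samples."""
    hop = frame_size - overlap
    return [signal[i:i + frame_size]
            for i in range(0, len(signal) - frame_size + 1, hop)]


def frame_volume(frame):
    """Assumed volume measure: sum of absolute sample values."""
    return sum(abs(x) for x in frame)


def acf_pitch(frame, fs, fmin, fmax):
    """Step 2 (placeholder): pitch from the largest autocorrelation value
    among the lags that correspond to pitches between fmin and fmax."""
    min_lag = int(fs / fmax)
    max_lag = min(int(fs / fmin), len(frame) - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag if best_lag else 0.0


def median_smooth(values, width=3):
    """Step 4: median filter; the window is shortened at both ends."""
    half = width // 2
    return [median(values[max(0, i - half):i + half + 1])
            for i in range(len(values))]


def track_pitch(signal, fs, frame_size, overlap, volume_threshold,
                fmin=50.0, fmax=1000.0):
    """Run all four steps and return one pitch value (Hz) per frame."""
    pitch = []
    for frame in split_into_frames(signal, frame_size, overlap):
        # Step 3a: frames that are too quiet get no pitch (0 by assumption).
        if frame_volume(frame) < volume_threshold:
            pitch.append(0.0)
            continue
        p = acf_pitch(frame, fs, fmin, fmax)
        # Step 3b: discard values outside the plausible pitch range.
        pitch.append(p if fmin <= p <= fmax else 0.0)
    return median_smooth(pitch)
```

For instance, `track_pitch(samples, 8000, 320, 160, 10.0)` would process a list of samples recorded at 8000 Hz with 320-point frames that overlap by 160 points; the threshold of 10.0 is an arbitrary illustrative value that would need tuning for real recordings.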
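When cutting the signal into frames, we allow neighboring frames to overlap, so we define the "frame rate" as the number of frames per second. If the sample rate is 11025 Hz, the frame size is 256 points, and the overlap is 84 points, then the frame rate is 11025/(256-84) ≈ 64. In other words, the computer must be able to process 64 frames per second to achieve real-time processing.
[Figure: schematic diagram of frames and their overlap]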
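The frame-rate arithmetic can be checked with a few lines of Python. The function name `frame_rate` is a hypothetical helper introduced only for this illustration.

```python
def frame_rate(fs, frame_size, overlap):
    """Frames per second: the sample rate divided by the hop size,
    where the hop size is the frame size minus the overlap."""
    hop = frame_size - overlap
    if hop <= 0:
        raise ValueError("overlap must be smaller than frame_size")
    return fs / hop


# The example from the text: 11025 / (256 - 84), roughly 64 frames per second.
print(frame_rate(11025, 256, 84))
```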
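The only purpose of overlapping frames is to keep the variation between neighboring frames from being too large, so that the extracted pitch contour is more continuous. In practice, however, the overlap cannot be too large either, or the amount of computation becomes excessive. The following factors should be considered when choosing the frame size:
- The frame must contain at least 2 fundamental periods in order to show the characteristics of the speech. The pitch range of the human voice is roughly 50 Hz to 1000 Hz, so for a given sample rate we can compute the minimum frame size. For example, if the sample rate is fs = 8000 Hz, then when the pitch is f = 50 Hz (e.g., a bass singing voice), each fundamental period has fs/f = 8000/50 = 160 points, so the frame must be at least 320 points; when the pitch is 1000 Hz (e.g., a soprano singing voice), each fundamental period has 8000/1000 = 8 points, so the frame must be at least 16 points.
- The frame size cannot be too large either. A frame that is too long cannot capture the subtle ways in which the characteristics of the audio signal change over time, and the amount of computation also increases.
- The overlap between frames depends entirely on the available computing power. More overlap means a higher frame rate and more computation. Less overlap (possibly no overlap at all, or even gaps between frames) means a lower frame rate and less computation.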
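The minimum frame size implied by the first consideration can be computed directly. The sketch below assumes the rule of at least 2 fundamental periods per frame stated above; `min_frame_size` is a hypothetical helper name.

```python
import math


def min_frame_size(fs, pitch_hz, periods=2):
    """Smallest frame size (in points) that holds `periods` fundamental
    periods of a tone at `pitch_hz`, sampled at `fs` Hz."""
    return math.ceil(periods * fs / pitch_hz)


# The two cases from the text, with fs = 8000 Hz.
print(min_frame_size(8000, 50))    # lowest pitch: 320 points
print(min_frame_size(8000, 1000))  # highest pitch: 16 points
```

Because the lowest expected pitch gives the largest minimum, a frame sized for the 50 Hz case also covers every higher pitch in the range.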
There are many methods for computing pitch from a single frame. They fall into two broad categories, time domain and frequency domain:
- Time domain
  - ACF: Autocorrelation function
  - AMDF: Average magnitude difference function
  - SIFT: Simple inverse filter tracking
- Frequency domain
  - Harmonic product spectrum method
  - Cepstrum method
These methods are introduced in the following sections.
Audio Signal Processing and Recognition (音訊處理與辨識)