If we assume that the features of a given dataset are independent along each dimension, then the PDF of each class can be simplified into the product of that class's one-dimensional PDFs along the individual dimensions. In other words, we can first compute the PDF of each class along each dimension, and then multiply these one-dimensional PDFs together to obtain the overall PDF of that class. This assumption can be expressed mathematically as follows:
$$p(\mathbf{x}|C) = p(x_1|C)\,p(x_2|C)\cdots p(x_d|C)$$
where $\mathbf{x} = [x_1, x_2, \ldots, x_d]$ is a feature vector and $C$ denotes a specific class. The assumption may look too strong, and real-world data can rarely satisfy it exactly, but the resulting naive Bayes classifier (NBC) is surprisingly practical: its recognition performance is often no worse than that of far more complicated classifiers.
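To make the assumption concrete, the following minimal MATLAB sketch (with synthetic data and illustrative variable names, not code from this book) fits a 1-D Gaussian to each dimension of one class and multiplies the per-dimension PDF values to obtain $p(\mathbf{x}|C)$:

```matlab
% Minimal sketch of the independence assumption: p(x|C) is the product of the
% per-dimension PDFs.  Each p(x_k|C) is modeled here as a 1-D Gaussian fitted
% by MLE to (synthetic) training data of a single class C.
X = randn(100, 4) + repmat([1 2 3 4], 100, 1);             % fake training data of class C (100x4)
mu = mean(X, 1);                                           % per-dimension MLE means
sg = std(X, 1, 1);                                         % per-dimension MLE standard deviations
x = [1.2 1.8 3.1 4.3];                                     % a test feature vector
pdf1d = exp(-(x-mu).^2 ./ (2*sg.^2)) ./ (sqrt(2*pi)*sg);   % p(x_k|C) for k = 1..4
pXgivenC = prod(pdf1d);                                    % p(x|C) = p(x_1|C)*...*p(x_d|C)
fprintf('p(x|C) = %g\n', pXgivenC);
```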
In practice, we usually assume that the PDF along each dimension is a Gaussian probability density function. Under this assumption, the steps of the corresponding NBC can be described as follows:
- Assume that the data of each class is generated by a $d$-dimensional Gaussian probability density function:
$$g_i(\mathbf{x}, \mathbf{m}, S) = (2\pi)^{-d/2}\,\det(S)^{-1/2}\exp\!\left[-\frac{(\mathbf{x}-\mathbf{m})^T S^{-1}(\mathbf{x}-\mathbf{m})}{2}\right]$$
  where $\mathbf{m}$ is the mean vector of the Gaussian density and $S$ is its covariance matrix; the optimal $\mathbf{m}$ and $S$ can be obtained via MLE (maximum likelihood estimation). Under the naive independence assumption, $S$ is diagonal, with the per-dimension variances on its diagonal.
- If necessary, each Gaussian probability density function can be multiplied by a weight $w_i$.
- When classifying a data point $\mathbf{x}$, the larger $w_i\,g_i(\mathbf{x}, \mathbf{m}_i, S_i)$ is, the more likely it is that $\mathbf{x}$ belongs to class $i$. (A small sketch of these steps is given right after this list.)
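The following MATLAB sketch is an assumed illustration of the steps above (not the book's toolbox code): it uses the Statistics Toolbox's fisheriris demo data, fits each class's mean vector and per-dimension variances by MLE (the independence assumption makes the covariance matrix diagonal), uses the class prior as the weight $w_i$, and picks the class with the largest $w_i\,g_i$:

```matlab
% Assumed sketch of the NBC steps above: MLE Gaussian per class (diagonal
% covariance because of the independence assumption) and scoring by w_i * g_i.
load fisheriris                              % meas (150x4) and species (150x1 cell array)
X = meas;
[classNames, ~, y] = unique(species);        % y = integer class labels 1..3
c = numel(classNames);  d = size(X, 2);
xTest = X(51, :)';                           % an arbitrary test vector (a versicolor sample)
score = zeros(c, 1);
for i = 1:c
    Xi = X(y == i, :);
    mi = mean(Xi, 1)';                       % MLE mean vector of class i
    Si = diag(var(Xi, 1, 1));                % MLE variances -> diagonal covariance matrix
    wi = size(Xi, 1) / size(X, 1);           % class weight = prior p(c_i)
    gi = exp(-0.5*(xTest-mi)'*(Si\(xTest-mi))) / sqrt((2*pi)^d * det(Si));
    score(i) = wi * gi;                      % the larger, the more likely x belongs to class i
end
[~, predicted] = max(score);
fprintf('Predicted class: %s\n', classNames{predicted});
```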
In practice, we usually do not evaluate $w_i\,g_i(\mathbf{x}, \mathbf{m}_i, S_i)$ directly. Instead we compute $\log\big(w_i\,g_i(\mathbf{x}, \mathbf{m}_i, S_i)\big) = \log w_i + \log g_i(\mathbf{x}, \mathbf{m}_i, S_i)$, which avoids the problems that can arise when evaluating the exponential (loss of precision, extra computation time). Taking the weight $w_i$ to be the class prior $p(c_i)$, the formula for the log-density is as follows:
$$\log\big[p(c_i)\,g(\mathbf{x}, \mathbf{m}_i, S_i)\big] = \log p(c_i) - \frac{d\log(2\pi) + \log|S_i|}{2} - \frac{(\mathbf{x}-\mathbf{m}_i)^T S_i^{-1}(\mathbf{x}-\mathbf{m}_i)}{2}$$
The decision boundary between classes $i$ and $j$ is the set of points where $p(c_i)\,g(\mathbf{x}, \mathbf{m}_i, S_i) = p(c_j)\,g(\mathbf{x}, \mathbf{m}_j, S_j)$. Taking the logarithm of both sides, we have
$$\log p(c_i) - \frac{d\log(2\pi)+\log|S_i|}{2} - \frac{(\mathbf{x}-\mathbf{m}_i)^T S_i^{-1}(\mathbf{x}-\mathbf{m}_i)}{2} = \log p(c_j) - \frac{d\log(2\pi)+\log|S_j|}{2} - \frac{(\mathbf{x}-\mathbf{m}_j)^T S_j^{-1}(\mathbf{x}-\mathbf{m}_j)}{2}$$
After simplification, the decision boundary becomes
$$(\mathbf{x}-\mathbf{m}_j)^T S_j^{-1}(\mathbf{x}-\mathbf{m}_j) - (\mathbf{x}-\mathbf{m}_i)^T S_i^{-1}(\mathbf{x}-\mathbf{m}_i) = \log\frac{|S_i|\,p^2(c_j)}{|S_j|\,p^2(c_i)}$$
where the right-hand side is a constant. Since both terms on the left-hand side are quadratic in $\mathbf{x}$, the equation describes a quadratic decision boundary in the $d$-dimensional feature space.
For example, to use an NBC to classify the IRIS data based on its third and fourth dimensions, we can use an example program along the following lines:
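The original example program relies on the book's toolbox and is not reproduced here; the following MATLAB sketch is an assumed stand-in that fits per-class 1-D Gaussians to dimensions 3 and 4 of the IRIS data, scores every point in the log domain, and marks the misclassified points with crosses:

```matlab
% Assumed sketch (not the book's toolbox example): Gaussian NBC on dimensions
% 3 and 4 of the IRIS data, with misclassified points marked by crosses.
load fisheriris
X = meas(:, 3:4);                            % petal length and petal width
[classNames, ~, y] = unique(species);
c = numel(classNames);  n = size(X, 1);
mu = zeros(c, 2);  sg = zeros(c, 2);  classWeight = zeros(c, 1);
for i = 1:c
    Xi = X(y == i, :);
    mu(i, :) = mean(Xi, 1);                  % per-dimension MLE means
    sg(i, :) = std(Xi, 1, 1);                % per-dimension MLE standard deviations
    classWeight(i) = size(Xi, 1);            % weight = class size (see the list below)
end
logProb = zeros(n, c);                       % log(w_i) + sum_k log p(x_k | c_i)
for i = 1:c
    z = (X - repmat(mu(i, :), n, 1)) ./ repmat(sg(i, :), n, 1);
    logProb(:, i) = log(classWeight(i)) + sum(-0.5*z.^2, 2) - sum(log(sqrt(2*pi)*sg(i, :)));
end
[~, predicted] = max(logProb, [], 2);
wrong = (predicted ~= y);
fprintf('Recognition rate = %.2f%%\n', 100*(1 - sum(wrong)/n));
figure;  hold on;  markers = 'osd';
for i = 1:c
    plot(X(y == i, 1), X(y == i, 2), markers(i), 'DisplayName', classNames{i});
end
plot(X(wrong, 1), X(wrong, 2), 'kx', 'MarkerSize', 12, 'DisplayName', 'misclassified');
xlabel('Petal length');  ylabel('Petal width');  legend show;  hold off;
```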
The figure above shows the data points together with the misclassified points (marked by crosses). Note that the code above uses classWeight, a vector specifying the weight of each class. There are two common ways to set it:
- To follow the Bayes classification principle exactly (see the later chapters), the weight of each class can be set to the number of data points in that class. (Counting the data points in each class can be done with dsClassSize.m.)
- If the classes occur with roughly equal probability, we can simply set the weight of every class to 1. (A short sketch of both options follows this list.)
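As a small illustration of these two conventions (assumed code; the per-class counts are computed here with MATLAB built-ins rather than the toolbox's dsClassSize.m):

```matlab
load fisheriris
[classNames, ~, y] = unique(species);         % y = integer class labels 1..3
classWeight = accumarray(y, 1);               % option 1: per-class data counts (the quantity dsClassSize.m computes)
% classWeight = ones(numel(classNames), 1);   % option 2: equal weights when the priors are about equal
```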
We can plot the one-dimensional PDF of each class along each dimension, together with the corresponding data, as in the following example:
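The book's plotting example is not reproduced here; the following assumed MATLAB sketch draws, for every dimension of the IRIS data, the fitted 1-D Gaussian PDF of each class together with that class's data points along the axis:

```matlab
% Assumed sketch: per-class, per-dimension 1-D Gaussian PDFs and the data.
load fisheriris
X = meas;
[classNames, ~, y] = unique(species);
c = numel(classNames);  d = size(X, 2);  colors = 'rgb';
figure;
for k = 1:d
    subplot(2, 2, k);  hold on;
    xs = linspace(min(X(:, k)), max(X(:, k)), 200);
    for i = 1:c
        mu = mean(X(y == i, k));  sg = std(X(y == i, k), 1);         % MLE mean and std
        plot(xs, exp(-(xs-mu).^2/(2*sg^2)) / (sqrt(2*pi)*sg), colors(i));
        plot(X(y == i, k), zeros(sum(y == i), 1), [colors(i) '.']);  % data points on the x-axis
    end
    title(sprintf('Dimension %d', k));  hold off;
end
```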
We can also display each class's PDF as a three-dimensional surface and draw its contours, as in the following example:
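Again, the original example is not reproduced; this assumed sketch evaluates each class's 2-D Gaussian PDF (dimensions 3 and 4, diagonal covariance) on a grid and shows it with surfc, which draws the surface together with its contours underneath:

```matlab
% Assumed sketch: each class's 2-D Gaussian PDF as a surface with contours.
load fisheriris
X = meas(:, 3:4);
[classNames, ~, y] = unique(species);
c = numel(classNames);
[xx, yy] = meshgrid(linspace(min(X(:,1)), max(X(:,1)), 60), ...
                    linspace(min(X(:,2)), max(X(:,2)), 60));
figure;
for i = 1:c
    Xi = X(y == i, :);
    mi = mean(Xi, 1);  v = var(Xi, 1, 1);                 % MLE mean and variances
    pdf = exp(-0.5*((xx-mi(1)).^2/v(1) + (yy-mi(2)).^2/v(2))) / (2*pi*sqrt(v(1)*v(2)));
    subplot(1, c, i);
    surfc(xx, yy, pdf);                                   % surface plus contours underneath
    title(classNames{i});  xlabel('Petal length');  ylabel('Petal width');
end
```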
Based on these Gaussian density functions, we can then plot the decision boundaries between the classes, as follows:
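The original figure-generating code is not included; the following assumed sketch classifies a dense grid of points with the fitted Gaussian densities (dimensions 3 and 4) and draws the borders between the regions assigned to different classes:

```matlab
% Assumed sketch: decision boundaries implied by the fitted Gaussian densities.
load fisheriris
X = meas(:, 3:4);
[classNames, ~, y] = unique(species);
c = numel(classNames);  n = size(X, 1);
[xx, yy] = meshgrid(linspace(min(X(:,1)), max(X(:,1)), 300), ...
                    linspace(min(X(:,2)), max(X(:,2)), 300));
G = [xx(:) yy(:)];  m = size(G, 1);
logProb = zeros(m, c);
for i = 1:c
    Xi = X(y == i, :);
    mi = mean(Xi, 1);  v = var(Xi, 1, 1);                 % MLE mean and variances
    w = size(Xi, 1) / n;                                  % class weight = prior p(c_i)
    z = (G - repmat(mi, m, 1)) ./ repmat(sqrt(v), m, 1);
    logProb(:, i) = log(w) + sum(-0.5*z.^2, 2) - sum(log(sqrt(2*pi*v)));
end
[~, region] = max(logProb, [], 2);                        % predicted class of every grid point
figure;  hold on;  colors = 'rgb';
for i = 1:c
    plot(X(y == i, 1), X(y == i, 2), [colors(i) 'o']);
end
contour(xx, yy, reshape(region, size(xx)), (1:c-1) + 0.5, 'k');   % borders between class regions
xlabel('Petal length');  ylabel('Petal width');  hold off;
```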