5-5 Naive Bayes Classifiers (���������������������

[english][all]

(½Ðª`·N¡G¤¤¤åª©¥»¨Ã¥¼ÀH­^¤åª©¥»¦P¨B§ó·s¡I)

Slides

¦pªG§Ú­Ì°²³]¦bµ¹©wªº¸ê®Æ¶°¤¤¡A¨C¤@ºûªº¸ê®Æ³£¬O¿W¥ßªº¡A¦b¦¹°²³]¤U¡A¨C¤@Ãþ¸ê®ÆªºPDF¥i¥H²¤Æ¦¨¦¹Ãþ¸ê®Æ¦b¨C¤@ºûªºPDFªº­¼¿n¡C´«¥y¸Ü»¡¡A§Ú­Ì¥i¥H¥ýºâ¥X¨C¤@Ãþ¸ê®Æ¦b¨C¤@­Óºû«×©Ò¹ïÀ³ªºPDF¡AµM«á±N¦P¤@Ãþ¸ê®Æªº¼Æ­ÓPDF¶i¦æ³s­¼¡A´N¥i¥H±o¨ì³o¤@Ãþ¸ê®ÆªºPDF¡C§Ú­Ìªº°²³]¥i¥H¨Ï¥Î¼Æ¾Ç¦¡¤l¨Óªí¥Ü¦p¤U¡G

p(X|C) = P(X1|C)P(X2|C) ... P(Xd|C)

¨ä¤¤ X = [X1, X2, ..., Xd] ¬O¤@­Ó¯S¼x¦V¶q¡A¦Ó C ¥Nªí¤@­Ó¯S©wÃþ§O¡C³o­Ó°²³]¬Ý¨Ó¦ü¥G¹L±j¡A¤@¯ë¹ê»Ú¥@¬Éªº¸ê®Æ¦ü¥GµLªkº¡¨¬¦¹°²³]¡A¦ý¥Ñ¦¹°²³]©Ò²£¥Íªº³æ¯Â¨©¤ó¤ÀÃþ¾¹¡]naive Bayes classifier¡A²ºÙ NBC¡^«o¬O¬Û·í¦³¹ê¥Î©Ê¡A¨ä¿ëÃѮįà±`±`¤£¿éµ¹¨ä¥¦§ó½ÆÂøªº¿ëÃѾ¹¡C

¦b¹ê°µ¤W¡A§Ú­Ì³q±`°²³]¤@ºû¸ê®Æ©Ò¹ïÀ³ªºPDF¬O°ª´µ¾÷²v±K«×¨ç¦¡¡A¦b¦¹±¡ªp¤U¡A¹ïÀ³ªºNBC¨BÆJ¥i¥H»¡©ú¦p¤U¡G

  1. °²³]¨C¤@­ÓÃþ§Oªº¸ê®Æ§¡¬O¥Ñ d ºûªº°ª´µ¾÷²v±K«×¨ç¼Æ¡]Gaussian probability density function¡^©Ò²£¥Í¡G¡G
    gi(x, m, S) = (2p)-d/2*det(S)-0.5*exp[-(x-m)TS-1(x-m)/2]
    ¨ä¤¤ m ¬O¦¹°ª´µ¾÷²v±K«×¨ç¼Æªº¥­§¡¦V¶q¡]Mean vector¡^¡AS «h¬O¨ä¦@Åܲ§¯x°}¡]Covariance matrix¡^¡A§Ú­Ì¥i¥H®Ú¾Ú MLE¡A²£¥Í³Ì¨Îªº¥­§¡¦V¶q m ©M¦@Åܲ§¯x°} S¡C
  2. ­Y¦³»Ý­n¡A¥i¥H¹ï¨C¤@­Ó°ª´µ¾÷²v±K«×¨ç¼Æ­¼¤W¤@­ÓÅv­« wi¡C
  3. ¦b¹ê»Ú¶i¦æ¤ÀÃþ®É¡Awi*gi(x, m, S) ¶V¤j¡A«h¸ê®Æ x ÁõÄÝ©óÃþ§O i ªº¥i¯à©Ê´N¶V°ª¡C

¦b¹ê»Ú¶i¦æ¹Bºâ®É¡A§Ú­Ì³q±`¤£¥h­pºâ wi*gi(x, m, S) ¡A¦Ó¬O­pºâ log(wi*gi(x, m, S)) = log(wi) + log(gi(x, m, S))¡A¥H«KÁ׶}­pºâ«ü¼Æ®É¥i¯àµo¥ÍªººØºØ°ÝÃD¡]¦pºë½T«×¤£¨¬¡B­pºâ¯Ó®É¡^¡Alog(gi(x, m, S)) ªº¤½¦¡¦p¤U¡G

log[p(ci)g(x, mi, Si)] = log(p(ci)) - (d*log(2p) + log|Si|)/2 - (x-mi)TSi-1(x-mi)/2
The decision boundary between class i and j is represented by the following trajectory:
p(ci)g(x, mi, Si) = p(cj)g(x, mj, Sj).
Taking the logrithm of both sides, we have
log(p(ci)) - (d*log(2p) + log|Si|)/2 - (x-mi)TSi-1(x-mi)/2 = log(p(cj)) - (d*log(2p) + log|S|j)/2 - (x-mj)TSj-1(x-mj)/2
After simplification, we have the decision boundary as the following equation:
(x-mj)TSj-1(x-mj) - (x-mi)TSi-1(x-mi) = log{[|S|i p2(ci)]/[|S|j p2(cj)]}
where the right-hand side is a constant. Since both (x-mj)TSj-1(x-mj) and (x-mi)TSi-1(x-mi) are quadratic, the above equation represents a decision boundary of the quadratic form in the d-dimensional feature space.

¨Ò¦p¡A¦pªG¨Ï¥Î NBC ¨Ó¹ï IRIS ¸ê®Æªº²Ä¤Tºû¤Î²Ä¥|ºû¶i¦æ¤ÀÃþ¡A¥i¨Ï¥Î¤U¦C½d¨Òµ{¦¡¡G

Example 1: nbc01dataPlot.mDS = prData('iris'); DS.input=DS.input(3:4, :); % Only take dimensions 3 and 4 for 2d visualization plotOpt=1; % Plotting [nbcPrm, logLike, recogRate, hitIndex]=nbcTrain(DS, [], plotOpt); fprintf('Recog. rate = %f%%\n', recogRate*100); Recog. rate = 96.000000%

¤W¹Ï¨q¥X¸ê®ÆÂI¡A¥H¤Î¤ÀÃþ¿ù»~ªºÂI¡]¤e¤e¡^¡C¯S§O»Ý­nª`·Nªº¬O¡A¦b¤W­zªºµ{¦¡½X¤¤¡A§Ú­Ì¥Î¨ì classWeight¡A³o¬O¤@­Ó¦V¶q¡A¥Î¨Ó«ü©w¨C¤@­ÓÃþ§OªºÅv­«¡A³q±`¦³¨âºØ°µªk¡G

§Ú­Ì¥i¥Hµe¥X¨C­ÓÃþ§O¤Î¨C­Óºû«×ªº¤@ºûPDF¨ç¼Æ¡A¥H¤Î¨ä¹ïÀ³ªº¸ê®Æ¡A½Ð¨£¤U¦C½d¨Ò¡G

Example 2: nbcPlot00.mDS=prData('iris'); DS.input=DS.input(3:4, :); [nbcPrm, logLike, recogRate, hitIndex]=nbcTrain(DS); nbcPlot(DS, nbcPrm, '1dPdf');

§Ú­Ì¤]¥i¥H±N¨C­ÓÃþ§OªºPDF¨ç¼Æ¥H¤Tºû¦±­±§e²{¡A¨Ãµe¥X¨äµ¥°ª½u¡A½Ð¨£¤U¦C½d¨Ò¡G

Example 3: nbcPlot01.mDS=prData('iris'); DS.input=DS.input(3:4, :); [nbcPrm, logLike, recogRate, hitIndex]=nbcTrain(DS); nbcPlot(DS, nbcPrm, '2dPdf');

®Ú¾Ú³o¨Ç°ª´µ±K«×¨ç¼Æ¡A§Ú­Ì´N¥i¥Hµe¥X¨C­ÓÃþ§OªºÃä¬É¡A¦p¤U¡G

Example 4: nbcPlot02.mDS=prData('iris'); DS.input=DS.input(3:4, :); [nbcPrm, logLike, recogRate, hitIndex]=nbcTrain(DS); DS.hitIndex=hitIndex; % Attach hitIndex to DS for plotting nbcPlot(DS, nbcPrm, 'decBoundary');


Data Clustering and Pattern Recognition (¸ê®Æ¤À¸s»P¼Ë¦¡¿ë»{)