5-5 Naive Bayes Classifiers

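If we assume that, within each class, every dimension of the data is independent of the others, then the PDF of each class simplifies to the product of the PDFs of the individual dimensions. In other words, we can find the one-dimensional PDF of each class along each dimension, and then multiply the per-dimension PDFs of the same class to obtain the PDF of that class. This assumption can be expressed mathematically as follows:

$$p(X|C) = P(X_1|C)\,P(X_2|C) \cdots P(X_d|C)$$

where $X = [X_1, X_2, \ldots, X_d]$ is a feature vector and $C$ is a specific class. This assumption appears too strong, and real-world data rarely seems to satisfy it. Nevertheless, the resulting naive Bayes classifier (NBC for short) is quite practical, and its recognition performance is often no worse than that of more complex classifiers.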
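
To make the factorization concrete, the following minimal Python sketch evaluates p(X|C) for one class as a product of one-dimensional densities. It assumes a Gaussian density along each dimension purely for illustration. The function names and parameter values are hypothetical and are not part of the toolbox used in the examples of this section.

```python
import math

def gaussian_pdf_1d(x, mean, var):
    """Density of a one-dimensional Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def naive_class_density(x, means, variances):
    """p(X|C) computed as the product of the per-dimension densities P(Xk|C)."""
    density = 1.0
    for xk, mk, vk in zip(x, means, variances):
        density *= gaussian_pdf_1d(xk, mk, vk)
    return density

# Hypothetical parameters of one class in a 2-dimensional feature space.
means = [1.5, 0.3]
variances = [0.04, 0.01]
x = [1.4, 0.2]

p = naive_class_density(x, means, variances)
print(p)
```

Because the dimensions are treated as independent, only d means and d variances are needed per class, rather than a full d-by-d covariance matrix.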

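In practice, we usually assume that the PDF of each class is a Gaussian probability density function. In this case, the steps of NBC can be described as follows:

  1. Assume that the data of each class is generated by a d-dimensional Gaussian probability density function:
    $$g_i(x, m, S) = (2\pi)^{-d/2} \det(S)^{-1/2} \exp\left[-\frac{(x-m)^T S^{-1} (x-m)}{2}\right]$$
    where m is the mean vector and S is the covariance matrix of the density. We can obtain the optimal mean vector m and covariance matrix S by maximum likelihood estimation (MLE).
  2. If necessary, multiply each Gaussian probability density function by a weight $w_i$.
  3. During classification, the larger $w_i g_i(x, m, S)$ is, the more likely it is that x belongs to class i.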
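
The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the toolbox implementation: the covariance matrix is taken to be diagonal (which is what the independence assumption implies), the weight w_i is set to the class frequency (one possible choice), and the scores are compared in the log domain, which leaves the ranking unchanged because the logarithm is monotonic. All function names and the toy data are hypothetical.

```python
import math

def fit_class(samples):
    """Step 1: MLE of the per-dimension mean and variance for one class."""
    n = len(samples)
    d = len(samples[0])
    means = [sum(s[k] for s in samples) / n for k in range(d)]
    # The MLE of the variance divides by n (not n - 1).
    variances = [sum((s[k] - means[k]) ** 2 for s in samples) / n for k in range(d)]
    return means, variances

def log_gaussian_diag(x, means, variances):
    """Log of the d-dimensional Gaussian density with a diagonal covariance."""
    d = len(x)
    log_det = sum(math.log(v) for v in variances)
    quad = sum((xk - mk) ** 2 / vk for xk, mk, vk in zip(x, means, variances))
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det) - 0.5 * quad

def train(data):
    """data maps a class label to a list of feature vectors."""
    total = sum(len(samples) for samples in data.values())
    model = {}
    for label, samples in data.items():
        means, variances = fit_class(samples)
        weight = len(samples) / total  # Step 2: assumed choice of w_i
        model[label] = (weight, means, variances)
    return model

def classify(model, x):
    """Step 3: pick the class with the largest log(w_i) + log(g_i(x))."""
    scores = {label: math.log(w) + log_gaussian_diag(x, m, v)
              for label, (w, m, v) in model.items()}
    return max(scores, key=scores.get)

# Hypothetical toy dataset with two classes and two features.
toy = {
    "a": [[1.0, 0.2], [1.4, 0.3], [1.2, 0.1], [1.6, 0.2]],
    "b": [[4.0, 1.3], [4.5, 1.5], [4.2, 1.2], [4.7, 1.6]],
}
model = train(toy)
label_near_a = classify(model, [1.3, 0.2])
label_near_b = classify(model, [4.4, 1.4])
print(label_near_a, label_near_b)
```

Note that a class whose samples are all identical along some dimension would yield a zero variance; a practical implementation would need to guard against that case.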

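In actual computation, we usually do not evaluate $w_i g_i(x, m, S)$ directly. Instead we evaluate $\log(w_i g_i(x, m, S)) = \log(w_i) + \log(g_i(x, m, S))$, which avoids the problems that can arise when computing exponentials (such as insufficient precision or long computation time). With the class prior $p(c_i)$ in the role of the weight $w_i$, the expression is as follows: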

$$\log[p(c_i)\,g(x, m_i, S_i)] = \log p(c_i) - \frac{d\log(2\pi) + \log|S_i|}{2} - \frac{(x-m_i)^T S_i^{-1}(x-m_i)}{2}$$
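
The precision problem mentioned above is easy to reproduce. The following Python sketch, with hypothetical values chosen only to trigger the effect, multiplies 50 one-dimensional Gaussian densities for a point far from the mean: the direct product underflows to zero in double precision, while the sum of log densities remains an ordinary finite number.

```python
import math

def gaussian_pdf_1d(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def log_gaussian_pdf_1d(x, mean, var):
    """Log density of a one-dimensional Gaussian, computed without exp()."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

d = 50                 # hypothetical number of dimensions
x = [12.0] * d         # a point far from the class mean
means = [0.0] * d
variances = [1.0] * d

product = 1.0
for xk, mk, vk in zip(x, means, variances):
    product *= gaussian_pdf_1d(xk, mk, vk)

log_sum = sum(log_gaussian_pdf_1d(xk, mk, vk)
              for xk, mk, vk in zip(x, means, variances))

print(product)   # the direct product underflows
print(log_sum)   # the log-domain score stays finite
```

If every class underflows to zero in this way, the products can no longer be compared, whereas the log scores still rank the classes correctly.
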
The decision boundary between classes i and j is the set of points x that satisfy
$$p(c_i)\,g(x, m_i, S_i) = p(c_j)\,g(x, m_j, S_j).$$
Taking the logarithm of both sides, we have
$$\log p(c_i) - \frac{d\log(2\pi) + \log|S_i|}{2} - \frac{(x-m_i)^T S_i^{-1}(x-m_i)}{2} = \log p(c_j) - \frac{d\log(2\pi) + \log|S_j|}{2} - \frac{(x-m_j)^T S_j^{-1}(x-m_j)}{2}$$
Multiplying both sides by 2 and rearranging, we obtain the decision boundary as the following equation:
$$(x-m_j)^T S_j^{-1}(x-m_j) - (x-m_i)^T S_i^{-1}(x-m_i) = \log\frac{|S_i|\,p^2(c_j)}{|S_j|\,p^2(c_i)}$$
where the right-hand side is a constant. Since both $(x-m_j)^T S_j^{-1}(x-m_j)$ and $(x-m_i)^T S_i^{-1}(x-m_i)$ are quadratic in x, the above equation describes a quadratic decision boundary in the d-dimensional feature space.
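
As a numerical check of the boundary equation, the following Python sketch evaluates the left-hand side minus the right-hand side, with the right-hand side written as log{[|Si| p^2(cj)] / [|Sj| p^2(ci)]}, which is what the logarithmic equality gives after multiplying both sides by 2. This gap should equal twice the difference of the two log scores, so it is positive where class i wins, negative where class j wins, and zero on the boundary. The sketch assumes diagonal covariance matrices, and the class parameters are hypothetical.

```python
import math

def quad_form(x, means, variances):
    """(x-m)^T S^{-1} (x-m) for a diagonal covariance S."""
    return sum((xk - mk) ** 2 / vk for xk, mk, vk in zip(x, means, variances))

def log_det(variances):
    """log|S| for a diagonal covariance S."""
    return sum(math.log(v) for v in variances)

def log_score(x, prior, means, variances):
    """log[p(c) g(x, m, S)] as given in the text."""
    d = len(x)
    return (math.log(prior)
            - 0.5 * (d * math.log(2.0 * math.pi) + log_det(variances))
            - 0.5 * quad_form(x, means, variances))

def boundary_gap(x, class_i, class_j):
    """Left-hand side minus right-hand side of the boundary equation."""
    prior_i, means_i, vars_i = class_i
    prior_j, means_j, vars_j = class_j
    lhs = quad_form(x, means_j, vars_j) - quad_form(x, means_i, vars_i)
    rhs = (log_det(vars_i) - log_det(vars_j)
           + 2.0 * math.log(prior_j) - 2.0 * math.log(prior_i))
    return lhs - rhs

# Hypothetical classes: (prior, means, variances).
class_i = (0.3, [1.5, 0.3], [0.04, 0.01])
class_j = (0.7, [4.5, 1.4], [0.25, 0.04])

points = [[1.5, 0.3], [3.0, 0.9], [4.5, 1.4]]
for x in points:
    gap = boundary_gap(x, class_i, class_j)
    diff = log_score(x, *class_i) - log_score(x, *class_j)
    print(x, round(gap, 6), round(2.0 * diff, 6))  # the two numbers agree
```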

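For example, to use NBC to classify the IRIS dataset based on its third and fourth dimensions, we can use the following example program: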

Example 1: nbc01dataPlot.m

    DS = prData('iris');            % Load the IRIS dataset
    DS.input = DS.input(3:4, :);    % Only take dimensions 3 and 4 for 2d visualization
    plotOpt = 1;                    % Plotting
    [nbcPrm, logLike, recogRate, hitIndex] = nbcTrain(DS, [], plotOpt);
    fprintf('Recog. rate = %f%%\n', recogRate*100);

Output:

    Recog. rate = 96.000000%

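The plot above shows the data points, together with the misclassified points (marked with crosses). Note in particular that the code above makes use of classWeight, a vector that specifies the weight of each class. There are usually two ways to set it: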

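We can also plot the one-dimensional PDF of each class along each dimension, together with the corresponding data distribution, as shown in the following example: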

Example 2: nbcPlot00.m

    DS = prData('iris');            % Load the IRIS dataset
    DS.input = DS.input(3:4, :);    % Only take dimensions 3 and 4
    [nbcPrm, logLike, recogRate, hitIndex] = nbcTrain(DS);
    nbcPlot(DS, nbcPrm, '1dPdf');   % Plot the 1D PDFs

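We can also display the PDF of each class as a three-dimensional surface and plot its contours, as shown in the following example: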

Example 3: nbcPlot01.m

    DS = prData('iris');            % Load the IRIS dataset
    DS.input = DS.input(3:4, :);    % Only take dimensions 3 and 4
    [nbcPrm, logLike, recogRate, hitIndex] = nbcTrain(DS);
    nbcPlot(DS, nbcPrm, '2dPdf');   % Plot the 2D PDFs

Based on these probability density functions, we can plot the boundaries between the classes, as follows:

Example 4: nbcPlot02.m

    DS = prData('iris');            % Load the IRIS dataset
    DS.input = DS.input(3:4, :);    % Only take dimensions 3 and 4
    [nbcPrm, logLike, recogRate, hitIndex] = nbcTrain(DS);
    DS.hitIndex = hitIndex;         % Attach hitIndex to DS for plotting
    nbcPlot(DS, nbcPrm, 'decBoundary');
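
For readers without the toolbox, the idea behind a decision-boundary plot can be sketched in plain Python: classify every point of a regular grid and look at where the winning class changes. The sketch below prints a coarse character map instead of a figure. The two-class model, its parameters, and the function names are all hypothetical, and this is not how nbcPlot is implemented.

```python
import math

def log_score(x, prior, means, variances):
    """log[p(c) g(x, m, S)] for a Gaussian with diagonal covariance."""
    d = len(x)
    log_det = sum(math.log(v) for v in variances)
    quad = sum((xk - mk) ** 2 / vk for xk, mk, vk in zip(x, means, variances))
    return math.log(prior) - 0.5 * (d * math.log(2.0 * math.pi) + log_det) - 0.5 * quad

# Hypothetical two-class model: label -> (prior, means, variances).
classes = {
    "A": (0.5, [1.5, 0.3], [0.04, 0.01]),
    "B": (0.5, [4.5, 1.4], [0.25, 0.04]),
}

def classify(x):
    """Return the label with the largest log score at x."""
    return max(classes, key=lambda label: log_score(x, *classes[label]))

def decision_map(x_values, y_values):
    """One string per y value (largest y first); each character is the winning class."""
    rows = []
    for y in reversed(y_values):
        rows.append("".join(classify([x, y]) for x in x_values))
    return rows

xs = [0.5 * k for k in range(13)]   # 0.0, 0.5, ..., 6.0
ys = [0.25 * k for k in range(9)]   # 0.0, 0.25, ..., 2.0
grid = decision_map(xs, ys)
for row in grid:
    print(row)
```

The boundary lies wherever adjacent characters in the map differ. A finer grid gives a smoother approximation of the quadratic curve.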


Data Clustering and Pattern Recognition