If we treat each data instance as a point in a high-dimensional space, and assume that the data points belonging to the same class are generated by a high-dimensional Gaussian probability density function, then we can use MLE to find the optimal parameters of that Gaussian density. The classifier obtained in this way is called a quadratic classifier, because the decision boundaries it produces are quadratic functions of the input features. The procedure can be summarized as follows:
- Assume that the data of each class is generated by a d-dimensional Gaussian probability density function:
$$ g_i(\mathbf{x}, \mathbf{\mu}, \Sigma) = (2\pi)^{-d/2}\,|\Sigma|^{-1/2}\,\exp\left[ -\frac{(\mathbf{x}-\mathbf{\mu})^T \Sigma^{-1} (\mathbf{x}-\mathbf{\mu})}{2} \right] $$ where $\mathbf{\mu}$ is the mean vector of this Gaussian probability density function and $\Sigma$ is its covariance matrix. Using MLE, we can obtain the optimal mean vector $\mathbf{\mu}$ and covariance matrix $\Sigma$ (the closed-form estimates are given right after this list).
- If needed, each Gaussian probability density function can be multiplied by a class weight $w_i$.
- When classifying a data point $\mathbf{x}$, the larger $w_i\,g_i(\mathbf{x}, \mathbf{\mu}_i, \Sigma_i)$ is, the more likely it is that $\mathbf{x}$ belongs to class $i$.
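For reference, the MLE solution for a single Gaussian has a closed form: given the $n$ data points $\mathbf{x}_1, \dots, \mathbf{x}_n$ of a class, the optimal parameters are simply the sample mean and the (biased) sample covariance:

$$ \hat{\mathbf{\mu}} = \frac{1}{n}\sum_{k=1}^{n} \mathbf{x}_k, \qquad \hat{\Sigma} = \frac{1}{n}\sum_{k=1}^{n} (\mathbf{x}_k-\hat{\mathbf{\mu}})(\mathbf{x}_k-\hat{\mathbf{\mu}})^T $$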
In practice, we usually do not compute $w_i\,g_i(\mathbf{x}, \mathbf{\mu}, \Sigma)$ directly; instead we compute $\log(w_i\,g_i(\mathbf{x}, \mathbf{\mu}, \Sigma)) = \log(w_i) + \log(g_i(\mathbf{x}, \mathbf{\mu}, \Sigma))$, which avoids the problems that can arise when evaluating exponentials, such as loss of precision and long computation time. In the derivation below, the class weight $w_i$ plays the role of the class prior $p(c_i)$. The log density is given by
$$ \ln \left( p(c_i)g(\mathbf{x}|\mathbf{\mu}, \Sigma) \right) = \ln p(c_i) - \frac{d \ln(2\pi)+ \ln |\Sigma|}{2} - \frac{(\mathbf{x}-\mathbf{\mu})^T\Sigma^{-1}(\mathbf{x}-\mathbf{\mu})}{2} $$
The decision boundary between class $i$ and class $j$ is the set of points $\mathbf{x}$ satisfying
$$ p(c_i)g(\mathbf{x}|\mathbf{\mu}_i, \Sigma_i) = p(c_j)g(\mathbf{x}|\mathbf{\mu}_j, \Sigma_j) $$
Taking the logarithm of both sides, we have
$$ \ln p(c_i) - \frac{d \ln(2\pi)+ \ln |\Sigma_i|}{2} - \frac{(\mathbf{x}-\mathbf{\mu}_i)^T\Sigma_i^{-1}(\mathbf{x}-\mathbf{\mu}_i)}{2} = \ln p(c_j) - \frac{d \ln(2\pi)+ \ln |\Sigma_j|}{2} - \frac{(\mathbf{x}-\mathbf{\mu}_j)^T\Sigma_j^{-1}(\mathbf{x}-\mathbf{\mu}_j)}{2} $$
After simplification, the decision boundary becomes
$$ 2\ln p(c_i) - \ln |\Sigma_i| - (\mathbf{x}-\mathbf{\mu}_i)^T\Sigma_i^{-1}(\mathbf{x}-\mathbf{\mu}_i) = 2\ln p(c_j) - \ln |\Sigma_j| - (\mathbf{x}-\mathbf{\mu}_j)^T\Sigma_j^{-1}(\mathbf{x}-\mathbf{\mu}_j) $$
$$ (\mathbf{x}-\mathbf{\mu}_i)^T\Sigma_i^{-1}(\mathbf{x}-\mathbf{\mu}_i) - (\mathbf{x}-\mathbf{\mu}_j)^T\Sigma_j^{-1}(\mathbf{x}-\mathbf{\mu}_j) = \ln \left( \frac{p^2(c_i)\,|\Sigma_j|}{p^2(c_j)\,|\Sigma_i|} \right) $$
where the right-hand side is a constant. Since both $(\mathbf{x}-\mathbf{\mu}_i)^T\Sigma_i^{-1}(\mathbf{x}-\mathbf{\mu}_i)$ and $(\mathbf{x}-\mathbf{\mu}_j)^T\Sigma_j^{-1}(\mathbf{x}-\mathbf{\mu}_j)$ are quadratic in $\mathbf{x}$, the above equation represents a decision boundary of quadratic form in the d-dimensional feature space.
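The original examples in this chapter are in MATLAB; as a concrete illustration of the log-domain computation described above, here is a minimal NumPy sketch (the class weights and Gaussian parameters are made up for the example, not taken from the toolbox):

```python
import numpy as np

def log_gaussian(x, mu, sigma):
    """Log of a d-dimensional Gaussian PDF evaluated at x."""
    d = len(mu)
    diff = x - mu
    # Solve sigma * v = diff instead of forming the explicit inverse.
    maha = diff @ np.linalg.solve(sigma, diff)
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def log_discriminant(x, w, mu, sigma):
    """log(w_i * g_i(x)) = log(w_i) + log(g_i(x)); larger means class i is more likely."""
    return np.log(w) + log_gaussian(x, mu, sigma)

# Tiny usage example with made-up parameters for two classes.
x = np.array([1.0, 2.0])
params = [
    (0.5, np.array([0.0, 0.0]), np.eye(2)),        # (w_1, mu_1, Sigma_1)
    (0.5, np.array([2.0, 2.0]), 0.5 * np.eye(2)),  # (w_2, mu_2, Sigma_2)
]
scores = [log_discriminant(x, w, mu, s) for (w, mu, s) in params]
print("predicted class:", int(np.argmax(scores)) + 1)
```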
In particular, if $\Sigma_i=\Sigma_j=\Sigma$, the quadratic terms cancel and the decision boundary reduces to a linear equation, as follows: $$ \underbrace{2(\mathbf{\mu}_i-\mathbf{\mu}_j)^T \Sigma^{-1}}_{\mathbf{c}^T} \mathbf{x} = \underbrace{\mathbf{\mu}_i^T \Sigma^{-1} \mathbf{\mu}_i - \mathbf{\mu}_j^T \Sigma^{-1} \mathbf{\mu}_j - \ln \left( \frac{p^2(c_i)}{p^2(c_j)} \right)}_{constant} \Longrightarrow \mathbf{c}^T\mathbf{x}=constant $$
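A quick numerical check of this simplification (a small NumPy sketch with randomly generated parameters): with a shared covariance matrix, the quadratic terms in $\mathbf{x}$ cancel and what remains is linear in $\mathbf{x}$.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
mu_i, mu_j = rng.normal(size=d), rng.normal(size=d)
A = rng.normal(size=(d, d))
sigma = A @ A.T + d * np.eye(d)          # shared covariance matrix (SPD)
inv = np.linalg.inv(sigma)

def Q(x, mu):                            # quadratic (Mahalanobis-style) term
    return (x - mu) @ inv @ (x - mu)

x = rng.normal(size=d)
lhs = Q(x, mu_i) - Q(x, mu_j)            # the x^T inv x terms cancel...
rhs = -2 * (mu_i - mu_j) @ inv @ x + mu_i @ inv @ mu_i - mu_j @ inv @ mu_j
print(np.isclose(lhs, rhs))              # ...leaving an expression linear in x
```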
In the following example, we use a quadratic classifier to classify the IRIS dataset using only its third and fourth features. First, we plot the data distribution:
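The original figure is produced with the MATLAB toolbox; a rough matplotlib equivalent (a sketch that assumes scikit-learn's copy of the IRIS data) is:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data[:, 2:4]        # third and fourth features (petal length/width)
y = iris.target

for k, name in enumerate(iris.target_names):
    plt.scatter(X[y == k, 0], X[y == k, 1], label=name)
plt.xlabel(iris.feature_names[2])
plt.ylabel(iris.feature_names[3])
plt.legend()
plt.title("IRIS data, features 3 and 4")
plt.show()
```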
Next, we can use qcTrain to build a QC.
As the above example shows, when only the third and fourth features are used, the recognition rate of the QC is 98%. This kind of test, in which the classifier is evaluated on its own training data, is called an inside test.
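The qcTrain call itself is not reproduced here. The following NumPy/SciPy sketch mimics the same procedure, fitting one Gaussian per class by MLE with classWeight proportional to the class sizes, and reports the inside-test (training-set) recognition rate; it approximates the idea rather than the toolbox's actual implementation:

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data[:, 2:4], iris.target
classes = np.unique(y)

# MLE per class: sample mean and (biased) sample covariance, plus a class weight.
models = []
for k in classes:
    Xk = X[y == k]
    w = len(Xk) / len(X)                         # weight proportional to class data count
    mu = Xk.mean(axis=0)
    sigma = np.cov(Xk, rowvar=False, bias=True)
    models.append((w, mu, sigma))

# Inside test: classify the training data itself by the largest log(w_k * g_k(x)).
scores = np.column_stack([
    np.log(w) + multivariate_normal(mu, sigma).logpdf(X)
    for (w, mu, sigma) in models
])
pred = scores.argmax(axis=1)
print("inside-test recognition rate:", (pred == y).mean())
```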
The resulting plot shows the data points together with the misclassified points (marked by crosses). Note in particular that the code above uses classWeight, a vector that specifies the weight of each class. There are two common ways to set it (see the sketch after this list):
- If we want to follow the principle of Bayes classification (see the later chapters), the weights can be set to the number of data instances in each class. (Counting the data instances of each class can be done with getClassDataCount.m.)
- If the classes are roughly equally likely to occur, we can simply set the weight of every class to 1.
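These two settings can be expressed as follows (a small NumPy sketch; classWeight and getClassDataCount.m belong to the original MATLAB toolbox, so the code below only mirrors the idea):

```python
import numpy as np
from sklearn.datasets import load_iris

y = load_iris().target
class_counts = np.bincount(y)            # data count per class (cf. getClassDataCount.m)

# Option 1 (Bayes-style): weight each class by its data count, i.e. proportional to its prior.
class_weight_bayes = class_counts.astype(float)

# Option 2: all classes weighted equally when their priors are assumed to be similar.
class_weight_equal = np.ones_like(class_counts, dtype=float)

print(class_weight_bayes, class_weight_equal)
```

Note that only the relative sizes of the weights matter, since scaling all weights by the same constant does not change which class attains the maximum.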
§Ṳ́]¥i¥H±N¨CÓÃþ§Oªº°ª´µ±K«×¨ç¼Æ¥H¤Tºû¦±±§e²{¡A¨Ãµe¥X¨äµ¥°ª½u¡A½Ð¨£¤U¦C½d¨Ò¡G
Based on these Gaussian density functions, we can then plot the boundary of each class, as follows:
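One way to produce such a boundary plot: classify every point of a dense grid with the fitted Gaussians and color the grid by the predicted class, so that the color changes exactly along the decision boundaries (again a NumPy/matplotlib sketch, not the original MATLAB code):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data[:, 2:4], iris.target

# Fit one weighted Gaussian per class (same procedure as before).
models = []
for k in np.unique(y):
    Xk = X[y == k]
    models.append((len(Xk) / len(X),
                   multivariate_normal(Xk.mean(axis=0),
                                       np.cov(Xk, rowvar=False, bias=True))))

x1 = np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 300)
x2 = np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 300)
G1, G2 = np.meshgrid(x1, x2)
grid = np.dstack([G1, G2])

# Predicted class at each grid point; the color changes exactly at the decision boundaries.
scores = np.stack([np.log(w) + pdf.logpdf(grid) for w, pdf in models], axis=-1)
label = scores.argmax(axis=-1)
plt.contourf(G1, G2, label, alpha=0.3)
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")
plt.xlabel(iris.feature_names[2])
plt.ylabel(iris.feature_names[3])
plt.show()
```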
In fact, as derived earlier, these boundaries are all quadratic functions, which can be verified from the plots above.
If the amounts of data in the classes differ greatly, the quadratic classifier may produce results that look wrong but are in fact correct. We leave this situation to be explored in the exercises.
§Ṳ́]¥i¥H®Ú¾Ú false positive ©M false negative ©Ò±a¨Óªº cost ¨Ó¶i¦æÅv«ªº³]©w¡A«áÄò¸Ôz¡C
On the other hand, if the training data is more complicated and a single Gaussian probability density function per class cannot classify it satisfactorily, we can turn to a similar but more flexible method such as Gaussian mixture models (GMM), covered later.