english version
(½Ðª`·N¡G¤¤¤åª©¥»¨Ã¥¼ÀH^¤åª©¥»¦P¨B§ó·s¡I)
¦pªG§Ú̱N¨C¤@µ§¸ê®Æªº¦UÓ¯S¼x¡]feature¡^µø¬°¥¦ªº®y¼Ð¡A¨º»ò§ÚÌ´N¥i¥H±N¥þ³¡ªº¸ê®Æµø¬°¤@¸s¤À§G¦b°ªºû«×ªÅ¶¡¤¤ªºÂI¡A¦Ó¨Cµ§¸ê®Æªº¯S¼x¼Æ¥Ø«K¥iµø¬°¸Óµ§¸ê®Æªººû«×¡CÁ|¨Ò¨Ó»¡¡AÕY§Ú̲{¦b¾Ö¦³¤T¦Êµ§§¡±a¦³¤TÓ¯S¼xªº¸ê®Æ¡A§ÚÌ«K¥i±N³o¨Ç¸ê®Æµø¬°300Ó¦b¤T«×ªÅ¶¡¤¤ªºÂI¡G
¦b¤W¹Ï¤¤¡A¥ª¹Ï©M¥k¹Ï¦³¬Û¦Pªº300µ§¸ê®Æ¡A¨Cµ§¸ê®Æ§¡§t¦³¤TÓ¯S¼x¡C¥ª¹Ïªº¸ê®Æ¬Ý¦üÂø¶ÃµL³¹¡A³o¬O¦]¬°§Ú̱N 3D ªº¹Ï§Î§ë¼v¨ì 2D ªº¥±¤Wªº½t¬G¡C¥un§Ú̵yµyÂà°Ê§Ú̪ºµø¨¤¡A«K¥i²M·¡¦a¬Ý¥X³o 300 µ§¸ê®Æ¤ÀÄÝ©ó¤TÓ¤£¦Pªº¸s²Õ¡A¦p¥k¹Ï¡C¦Ó¦p¦ó§ä¨ì¤@Ó¾A·íµø¨¤¡A¨Ó±N¸ê®Æ§ë¼v¨ì§Cºû«×ªºªÅ¶¡¡A¥H¨Ï±o¤£¦PÃþ§Oªº¸ê®Æ¯à°÷´²±o¶V¶}¶V¦n¡A´N¬O¯S¼xºé¨úªº¥Ø¼Ð¡C
§Ṳ́]¥i¥H¨Ï¥Î¼Æ¾Çªº¤è¦¡¨Ó»¡©ú¯S¼x¿ï¨ú¡C°²³]§Ú̦³¤@²Õ¯S¼x V = {v1 , v2 ,... , vd}¡A§Ú̧Ʊæ³z¹L¬YºØ¤ÀÃþ¾¹¨Ó±o¨ì¿ëÃѲv J(£»)¡A¦¹¿ëÃѲv¬O©Ò¿ï¨ú¯S¼xªº¨ç¼Æ¡F¦Ó¯S¼xºé¨úªº¥Ø¼Ð¡A´N¬On§ä¥X¤@²ÕŲ§O¯à¤O³Ì¨Îªºªº¶°¦X S¡A¨Ï±o J(S)¡ÙJ(T)¡A¨ä¤¤ S »P T ªº¤¸¯À§¡¥Ñ V ¸g¥Ñ½u©Ê²Õ¦X¦Ó¦¨¡C
º¥ý¡A§Ú̱N«e¤@Ó½d¨Òªº 3D ¸ê®ÆÂI¡A§ë¼v¨ì 2D ¥±¤W¡A´N¥i¥H¬Ý¥X LDA ªº®ÄªG¡A¦p¤U¡G
·í§Ú̧ë¼v¨ì LDA ªº«e¨âÓºû«×¡AÃþ§O¤§¶¡ªº°Ï¤À«Ü©úÅã¡A³o´N¬O LDA ªº®ÄªG¡C
¦b¤U±½d¨Ò¤¤¡A¬O§ÚÌ°w¹ï IRIS ¸ê®Æ¶i¦æ½u©ÊÃѧO¤ÀªR¡G
¦b¤Wz½d¨Ò¤¤¡A§Ú̱ĥΠ1-nearest-neighbor classifier ¥H¤Îleave-one-out ¨Ó¶i¦æ¿ëÃѲvªº´ú¸Õ¡C¦b²Ä¤@ӹϤ¤¡A§Ú̱N 150 µ§ IRIS ¸ê®Æ§ë¼v©ó²Ä¤@²Õ©M²Ä¤G²ÕÃѧO¦V¶q²Õ¦¨ªº¥±¤W¡A¥i±o¨ì 98.00% ªº¿ëÃѲv¡A¬Û·í©ó¥u¦³ 3 Ó¸ê®ÆÂIªº¤ÀÃþ¿ù»~¡CY§ë¼v©ó²Ä¤T²Õ©M²Ä¥|²ÕÃѧO¦V¶q²Õ¦¨ªº¥±¤W¡A¥i±o¨ì²Ä¤GӹϡA¥i¨£¤£¦PÃþ§Oªº¸ê®ÆÂI¦³¬Û·í¤jªº«Å|³¡¤À¡A¦]¦¹¿ëÃѲv¤]¸û§C¡A¥u¦³ 87.33%¡A¬Û·í©ó 19 Ó¤ÀÃþ¿ù»~ªº¸ê®ÆÂI¡C¡]¦b¤W¹Ï¤¤¡A©Ò¦³¤ÀÃþ¿ù»~ªº¸ê®ÆÂI³£¬O¥H¶Â¦â¤e¤e¨Óªí¥Ü¡C¡^
Y¨Ï¥Î WINE ¸ê®Æ¨Ó¶i¦æ LDA §ë¼v¡A±o¨ìªº®ÄªG¦p¤U¡G
¦P¼Ë§ÚÌ¥i¥H¬Ý¥X¡A§ë¼v¨ì«e¤Gºûªº¸ê®Æ³£·|¦³¤ñ¸û¦nªºÃþ§O¤À¶}µ{«×¡C
Y¥H KNNC ¤Î leave-one-out ¨Ó´ú¸Õ LDA §ë¼vªººû«×¹ï¿ëÃѲvªº¼vÅT¡A¥i¨Ï¥Î¤U¦C½d¨Òµ{¦¡¨Ó´ú¸Õ IRIS ¸ê®Æ¡G
¦³¦¹¤]¥i¥H¬Ý¥X¡A¨Ã«D¯S¼xӼƶV¦h¡A¿ëÃѲv¶V¦n¡C¥H¤W¨Ò¦Ó¨¥¡A³Ì¦nªº¿ëÃѲv¬O 98.00%¡Aµo¥Í¦b§ë¼v¨ì¤G«×ªÅ¶¡®É¡A¹ïÀ³ªº²V²c¯x°}¦p¤U¡G
¦b¤Wz½d¨Ò¤¤¡AY¥ª¹Ï¬° A ¯x°}¡A¥k¹Ï¬° B ¯x°}¡A«h A(i,j) ¥Nªí²Ä i Ãþ³Q¤À¨ì²Ä j ÃþªºÓ¼Æ¡A¦Ó B(i, j) «h¥Nªí²Ä i Ãþ³Q¤À¨ì²Ä j Ãþªº¾÷²v¡Aº¡¨¬ B(i, 1) + B(i, 2) + B(i, 3) = 100% for all i¡C
Y¥H¬Û¦Pªº¤è¦¡¨Ó´ú¸Õ wine ¸ê®Æ¡Aµ²ªG¦p¤U¡G
¥H¤W¨Ò¦Ó¨¥¡A³Ì¦nªº¿ëÃѲv¬O 98.31%¡Aµo¥Í¦b§ë¼v¨ì¤»«×ªÅ¶¡®É¡A¹ïÀ³ªº²V²c¯x°}¦p¤U¡G
Y§Ú̱N¤Wz½d¨Ò¼g¦¨¤@Ó¨ç¼Æ ldaPerfViaKnncLoo.m¡A«h¥i¥Î¦¹¨ç¼Æ¨Ó´ú¸Õ¡u¸ê®Æ¥¿³W¤Æ¡v¹ï©ó¿ëÃѲvªº¼vÅT¡A½Ð¨£¤U¦C½d¨Ò¡G
¦³¤Wz½d¨Ò¥i¥H¬Ý¥X¡A¹ï©ó³oÓÀ³¥Î¦Ó¨¥¡A¨Ï¥Î¤F¸ê®Æ¥¿³W¤Æ¡A¨Ï±o¿ëÃѲv´£¤É«Ü¦h¡A¤×¨ä¬O¦b¸ê®Æºû«×¬O 6 ®É¡A¿ëÃѲv¥i¹F 100%¡C
¦pªG¸ê®Æºû«×¤j©ó¸ê®Æµ§¼Æ¡A«h LDA ³q±`·|¥X²{¿ù»~ªºµ²ªG¡A¦]¬°¹Bºâ¹Lµ{¤¤¡A¼ÆÓ¯x°}¥i¯à±µªñ Singular¡A¾ÉPpºâ¥X¨Óªº eigenvalues ¦³½Æ¼Æ¡C¤@¯ë¸Ñ¨Mªº¤è®×¡A¨Æ¥ý¥Î PCA §ë¼v¨ì§CºûªÅ¶¡¡A¥H«KºÉ¶qºû«ù¸ê®ÆªºªÅ¶¡¶ZÂ÷¯S©Ê¡AµM«á¦A¨Ï¥Î LDA ¨D¨ú¹ï©ó¤ÀÃþ³Ì¨Îªº§ë¼v¤è¦¡¡C
References:
- J. Duchene and S. Leclercq, "An optimal transformation for discriminant and principal component analysis", IEEE Trans. PAMI, vol. 10, pp.978-983, 1988
More info:
Appendix:
- How to prove $T=W+B$ in the crisp case:
$$
\begin{array}{rcl}
T & = & \sum_{i=1}^n (a_i-\mu)(a_i-\mu)^T\\
& = & \sum_{i=1}^n (a_i a_i^T - \mu\mu^T)\\
& = & \sum_{i=1}^n a_i a_i^T - n \mu \mu^T\\
& = & AA^T-n\mu\mu^T\\
\end{array}
$$
$$
\begin{array}{rcl}
W & = & \sum_{k=1}^c w_k\\
& = & \sum_{k=1}^c (A_k A_k^T - n_k \mu_k \mu_k^T)\\
& = & \sum_{k=1}^c A_k A_k^T - \sum_{k=1}^c n_k \mu_k \mu_k^T\\
& = & AA^T- \sum_{k=1}^c n_k \mu_k \mu_k^T\\
\end{array}
$$
$$
\begin{array}{rcl}
B & = & \sum_{k=1}^c n_k (\mu_k-\mu)(\mu_k-\mu)^T\\
& = & \sum_{k=1}^c n_k \mu_k \mu_k^T - n\mu\mu^T\\
\end{array}
$$
Thus we have
$$
T=W+B.
$$
- How to prove $T=W+B$ in the fuzzy case:
Define
- $u_{k,i}$ = degree of point $i$ in class $k$.
- $\alpha_k = \sum_{i=1}^n u_{k,i}$ = size of class k.
Then
$$
\mu_k = \frac{\sum_{i=1}^n u_{k,i} a_i}{\sum_{i=1}^n u_{k,i}} = \frac{\sum_{i=1}^n u_{k,i} a_i}{\alpha_k} \rightarrow \alpha_k \mu_k = \sum_{i=1}^n u_{k,i}a_i.
$$
Moreover, we have the following identities:
$$
\left\{
\begin{array}{l}
\sum_{k=1}^c u_{k,i} = 1.\\
\sum_{k=1}^c \alpha_k = \sum_{i=1}^n \sum_{k=1}^c u_{k,i} = n.\\
\sum_{k=1}^c \alpha_k \mu_k = \sum_{i=1}^n (\sum_{k=1}^c u_{k,i}) a_i = \sum_{i=1}^n a_i = n \mu.\\
\end{array}
\right.
$$
Using the above identities, we have
$$
\begin{array}{rcl}
T & = & \sum_{i=1}^n (a_i-\mu)(a_i-\mu)^T\\
& = & \sum_{i=1}^n (a_i a_i^T - \mu\mu^T)\\
& = & \sum_{i=1}^n a_i a_i^T - n \mu \mu^T\\
& = & AA^T-n\mu\mu^T\\
\end{array}
$$
$$
\begin{array}{rcl}
W_k & = & \sum_{i=1}^n u_{k,i} (a_i-\mu_k)(a_i-\mu_k)^T\\
& = & \sum_{i=1}^n u_{k,i} a_i a_i^T - (\sum_{i=1}^n u_{k,i} a_i) \mu_k^T - \mu_k ({\sum_{i=1}^n u_{k,i}} a_i^T) + \sum_{i=1}^n u_{k,i}\mu_k \mu_k^T\\
& = & \sum_{i=1}^n u_{k,i} a_i a_i^T - (\alpha_k \mu_k)\mu_k^T - \mu_k (\alpha_k \mu_k^T) + \sum_{i=1}^n u_{k,i}\mu_k \mu_k^T\\
& = & \sum_{i=1}^n u_{k,i} a_i a_i^T - \alpha_k \mu_k \mu_k^T\\
\end{array}
$$
$$
\begin{array}{rcl}
W & = & \sum_{k=1}^c w_k\\
& = & \sum_{i=1}^n \sum_{k=1}^c u_{k,i} a_i a_i^T - \sum_{k=1}^c \alpha_k \mu_k \mu_k^T\\
& = & \sum_{i=1}^n a_i a_i^T - \sum_{k=1}^c \alpha_k \mu_k \mu_k^T\\
\end{array}
$$
$$
\begin{array}{rcl}
B & = & \sum_{k=1}^c \alpha_k (\mu_k-\mu)(\mu_k-\mu)^T\\
& = & \sum_{k=1}^c \alpha_k \mu \mu^T - (\sum_{k=1}^c \alpha_k \mu_k) \mu^T - \mu (\sum_{k=1}^c \alpha_k \mu_k)^T + (\sum_{k=1}^c \alpha_k) \mu \mu^T\\
& = & \sum_{k=1}^c \alpha_k \mu \mu^T - (n \mu) \mu^T - \mu (n \mu)^T + (n) \mu \mu^T\\
& = & \sum_{k=1}^c \alpha_k \mu_k \mu_k^T - n\mu\mu^T\\
\end{array}
$$
Thus we have $T=W+B$ in the fuzzy case where each data point belong to a class to a certain degree.
Data Clustering and Pattern Recognition (¸ê®Æ¤À¸s»P¼Ë¦¡¿ë»{)