11-3 LDA (Linear Discriminant Analysis)


If we treat each feature of a data entry as one of its coordinates, then we can view the whole dataset as a collection of points distributed in a high-dimensional space, where the number of features of each entry is the dimensionality of that space. For instance, if we have 300 entries, each with three features, we can view them as 300 points in a 3D space:

Example 1: inputExtraction01.m

DS=prData('random3');
subplot(1,2,1); dsScatterPlot3(DS); view(155, 46);
subplot(1,2,2); dsScatterPlot3(DS); view(-44, 22);

In the above figure, the left and right plots show the same 300 entries, each with three features. The left plot looks chaotic because the 3D data have been projected onto a 2D plane from an unfortunate viewing angle. If we rotate the viewpoint slightly, we can clearly see that the 300 entries fall into three distinct groups, as in the right plot. How to find a suitable viewing angle, that is, a projection of the data onto a lower-dimensional space in which different classes are separated as well as possible, is the goal of feature extraction.

Hint
In the above example, you can click and drag inside the MATLAB axes to change the viewing angle interactively. It is quite interesting; give it a try!

We can also describe feature extraction mathematically. Suppose we have a set of features V = {v1, v2, ..., vd} and a classifier that yields a recognition rate J(·), which is a function of the selected features. The goal of feature extraction is then to find the set S with the best discriminating power, such that J(S) ≥ J(T) for any T, where the elements of both S and T are linear combinations of the elements of V.
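For LDA in particular, the criterion is the classic Fisher ratio of between-class to within-class scatter, and the best linear combinations are the leading eigenvectors of Sw⁻¹Sb. The following Python/NumPy sketch shows this standard recipe; it is an illustration of the textbook formulation, not the toolbox's lda.m, whose implementation details may differ:

```python
import numpy as np

def lda_project(X, y):
    """Standard Fisher LDA: project the d x n data matrix X (features
    along rows, like DS.input) onto the eigenvectors of inv(Sw)*Sb,
    sorted by decreasing eigenvalue. y is the length-n label vector."""
    d, n = X.shape
    mu = X.mean(axis=1, keepdims=True)
    Sw = np.zeros((d, d))                    # within-class scatter
    Sb = np.zeros((d, d))                    # between-class scatter
    for c in np.unique(y):
        Xc = X[:, y == c]
        muc = Xc.mean(axis=1, keepdims=True)
        Sw += (Xc - muc) @ (Xc - muc).T
        Sb += Xc.shape[1] * (muc - mu) @ (muc - mu).T
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-evals.real)          # descending eigenvalues
    W = evecs[:, order].real                 # discriminant vectors as columns
    return W.T @ X                           # projected data, d x n

# Toy check: two well-separated 2D classes; the first projected
# dimension should separate them cleanly.
X = np.array([[0, 1, 0, 5, 6, 5],
              [0, 0, 1, 5, 5, 6]], float)
y = np.array([1, 1, 1, 2, 2, 2])
Z = lda_project(X, y)
```

After the projection, keeping only the first few rows of the result corresponds to the `DS2.input(1:2, :)` style of truncation used in the examples below.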

First, we project the 3D data points of the previous example onto a 2D plane to see the effect of LDA, as follows:

Example 2: ldaRandom301.m

DS=prData('random3');
DS2=lda(DS); DS2.input=DS2.input(1:2, :);
DS3=lda(DS); DS3.input=DS3.input(end-1:end, :);
subplot(2,1,1);
dsScatterPlot(DS2); axis image
xlabel('Input 1'); ylabel('Input 2');
title('Projected to the first two dim of LDA');
subplot(2,1,2);
dsScatterPlot(DS3); axis image
xlabel('Input 3'); ylabel('Input 4');
title('Projected to the last two dim of LDA');

When the data are projected onto the first two dimensions of LDA, the separation between classes is obvious; this is the intended effect of LDA.

In the following example, we apply linear discriminant analysis to the IRIS dataset:

Example 3: ldaIris2d01.m

DS = prData('iris');
dataNum = size(DS.input, 2);
DS2 = lda(DS);
% ====== Projection to the first two eigenvectors
DS3=DS2; DS3.input=DS2.input(1:2, :);
subplot(2,1,1);
[recogRate, computed] = knncLoo(DS3, [], 1);
title(sprintf('LDA projection of %s data onto the first 2 discriminant vectors', DS.dataName));
xlabel(sprintf('KNNC''s leave-one-out recog. rate = %d/%d = %g%%', sum(DS3.output==computed), dataNum, 100*recogRate));
% ====== Projection to the last two eigenvectors
DS3=DS2; DS3.input=DS2.input(end-1:end, :);
subplot(2,1,2);
[recogRate, computed] = knncLoo(DS3, [], 1);
title(sprintf('LDA projection of %s data onto the last 2 discriminant vectors', DS.dataName));
xlabel(sprintf('KNNC''s leave-one-out recog. rate = %d/%d = %g%%', sum(DS3.output==computed), dataNum, 100*recogRate));

In the above example, we use a 1-nearest-neighbor classifier together with leave-one-out to estimate the recognition rate. In the first plot, we project the 150 IRIS entries onto the plane spanned by the first and second discriminant vectors, obtaining a recognition rate of 98.00%, which corresponds to only 3 misclassified points. If we instead project onto the plane spanned by the third and fourth discriminant vectors, we obtain the second plot, where points of different classes overlap considerably; the recognition rate is therefore lower, only 87.33%, corresponding to 19 misclassified points. (In the plots above, all misclassified points are marked with black crosses.)
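The leave-one-out procedure used by knncLoo can be sketched in a few lines of Python/NumPy. This is a hypothetical re-implementation for illustration only; the toolbox's knncLoo supports more options:

```python
import numpy as np

def knnc_loo_1nn(X, y):
    """Leave-one-out test of a 1-nearest-neighbor classifier:
    each point is classified by the label of its nearest neighbor
    (Euclidean distance) among all the *other* points."""
    d2 = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)        # a point may not vote for itself
    predicted = y[d2.argmin(axis=1)]    # label of the nearest other point
    return (predicted == y).mean(), predicted

# Toy usage: two tight, well-separated clusters give a perfect LOO rate.
X = np.array([[0., 0.1, 5., 5.1],
              [0., 0.0, 5., 5.0]])
y = np.array([1, 1, 2, 2])
rate, pred = knnc_loo_1nn(X, y)   # rate is 1.0 for this toy data
```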

If we apply the same LDA projection to the WINE dataset, the result is as follows:

Example 4: ldaWine2d01.m

DS = prData('wine');
dataNum = size(DS.input, 2);
DS2 = lda(DS);
% ====== Projection to the first two eigenvectors
DS3=DS2; DS3.input=DS2.input(1:2, :);
subplot(2,1,1);
[recogRate, computed] = knncLoo(DS3, [], 1);
title(sprintf('LDA projection of %s data onto the first 2 discriminant vectors', DS.dataName));
xlabel(sprintf('KNNC''s leave-one-out recog. rate = %d/%d = %g%%', sum(DS3.output==computed), dataNum, 100*recogRate));
% ====== Projection to the last two eigenvectors
DS3=DS2; DS3.input=DS2.input(end-1:end, :);
subplot(2,1,2);
[recogRate, computed] = knncLoo(DS3, [], 1);
title(sprintf('LDA projection of %s data onto the last 2 discriminant vectors', DS.dataName));
xlabel(sprintf('KNNC''s leave-one-out recog. rate = %d/%d = %g%%', sum(DS3.output==computed), dataNum, 100*recogRate));

Again we can see that the data projected onto the first two dimensions exhibit better class separation.

To test how the dimensionality of the LDA projection affects the recognition rate, again using KNNC with leave-one-out, we can run the following example on the IRIS dataset:

Example 5: ldaIrisDim01.m

DS=prData('iris');
[featureNum, dataNum] = size(DS.input);
[recogRate, computed] = knncLoo(DS);
fprintf('All data ===> LOO recog. rate = %d/%d = %g%%\n', sum(DS.output==computed), dataNum, 100*recogRate);
DS2 = lda(DS);
recogRate=[];
for i = 1:featureNum
	DS3=DS2; DS3.input=DS3.input(1:i, :);
	[recogRate(i), computed] = knncLoo(DS3);
	fprintf('LDA dim = %d ===> LOO recog. rate = %d/%d = %g%%\n', i, sum(DS3.output==computed), dataNum, 100*recogRate(i));
end
plot(1:featureNum, 100*recogRate, 'o-'); grid on
xlabel('No. of projected features based on LDA');
ylabel('LOO recognition rates using KNNC (%)');

Output:
All data ===> LOO recog. rate = 144/150 = 96%
LDA dim = 1 ===> LOO recog. rate = 143/150 = 95.3333%
LDA dim = 2 ===> LOO recog. rate = 147/150 = 98%
LDA dim = 3 ===> LOO recog. rate = 142/150 = 94.6667%
LDA dim = 4 ===> LOO recog. rate = 144/150 = 96%

From this we can also see that more features do not necessarily give a better recognition rate. For the above example, the best recognition rate is 98.00%, achieved when projecting onto a 2D space; the corresponding confusion matrix is shown next:

Example 6: ldaIrisConf01.m

DS=prData('iris');
DS2 = lda(DS);
DS3=DS2; DS3.input=DS3.input(1:2, :);
[recogRate, computedOutput] = knncLoo(DS3);
confMat=confMatGet(DS3.output, computedOutput);
opt=confMatPlot('defaultOpt');
opt.className=DS.outputName;
confMatPlot(confMat, opt);

In the above example, if the left plot is matrix A and the right plot is matrix B, then A(i,j) denotes the number of class-i entries classified into class j, while B(i,j) denotes the probability of a class-i entry being classified into class j, satisfying B(i,1) + B(i,2) + B(i,3) = 100% for all i.
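The relation between the count matrix A and the percentage matrix B can be sketched as follows. This is a hypothetical Python counterpart of confMatGet, written only to make the definition concrete; the toolbox function's interface may differ:

```python
import numpy as np

def conf_mat(actual, predicted, class_num):
    """A(i,j): number of class-i entries classified as class j
    (classes labeled 1..class_num, as in the toolbox).
    B(i,j): the same counts as row-wise percentages, so that
    each row of B sums to 100."""
    A = np.zeros((class_num, class_num), dtype=int)
    for a, p in zip(actual, predicted):
        A[a - 1, p - 1] += 1
    B = 100.0 * A / A.sum(axis=1, keepdims=True)
    return A, B

# Toy usage with 3 classes and 8 entries
actual    = [1, 1, 1, 2, 2, 3, 3, 3]
predicted = [1, 1, 2, 2, 2, 3, 3, 1]
A, B = conf_mat(actual, predicted, 3)
```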

If we test the WINE dataset in the same manner, the results are as follows:

Example 7: ldaWineDim01.m

DS=prData('wine');
[featureNum, dataNum] = size(DS.input);
[recogRate, computed] = knncLoo(DS);
fprintf('All data ===> LOO recog. rate = %d/%d = %g%%\n', sum(DS.output==computed), dataNum, 100*recogRate);
DS2 = lda(DS);
recogRate=[];
for i = 1:featureNum
	DS3=DS2; DS3.input=DS3.input(1:i, :);
	[recogRate(i), computed] = knncLoo(DS3);
	fprintf('LDA dim = %d ===> LOO recog. rate = %d/%d = %g%%\n', i, sum(DS3.output==computed), dataNum, 100*recogRate(i));
end
plot(1:featureNum, 100*recogRate, 'o-'); grid on
xlabel('No. of projected features based on LDA');
ylabel('LOO recognition rates using KNNC (%)');

Output:
All data ===> LOO recog. rate = 137/178 = 76.9663%
LDA dim = 1 ===> LOO recog. rate = 168/178 = 94.382%
LDA dim = 2 ===> LOO recog. rate = 168/178 = 94.382%
LDA dim = 3 ===> LOO recog. rate = 168/178 = 94.382%
LDA dim = 4 ===> LOO recog. rate = 173/178 = 97.191%
LDA dim = 5 ===> LOO recog. rate = 174/178 = 97.7528%
LDA dim = 6 ===> LOO recog. rate = 175/178 = 98.3146%
LDA dim = 7 ===> LOO recog. rate = 172/178 = 96.6292%
LDA dim = 8 ===> LOO recog. rate = 173/178 = 97.191%
LDA dim = 9 ===> LOO recog. rate = 170/178 = 95.5056%
LDA dim = 10 ===> LOO recog. rate = 168/178 = 94.382%
LDA dim = 11 ===> LOO recog. rate = 159/178 = 89.3258%
LDA dim = 12 ===> LOO recog. rate = 143/178 = 80.3371%
LDA dim = 13 ===> LOO recog. rate = 137/178 = 76.9663%

For the above example, the best recognition rate is 98.31%, achieved when projecting onto a 6D space; the corresponding confusion matrix is as follows:

Example 8: ldaWineConf01.m

DS=prData('wine');
DS2 = lda(DS);
DS3=DS2; DS3.input=DS3.input(1:6, :);
[recogRate, computedOutput] = knncLoo(DS3);
confMat=confMatGet(DS3.output, computedOutput);
confMatPlot(confMat);

If we wrap the above example into a function ldaPerfViaKnncLoo.m, we can then use it to test the effect of input normalization on the recognition rate, as shown in the following example:

Example 9: ldaWineDim02.m

DS=prData('wine');
recogRate1=ldaPerfViaKnncLoo(DS);
DS2=DS; DS2.input=inputNormalize(DS2.input);	% input normalization
recogRate2=ldaPerfViaKnncLoo(DS2);
[featureNum, dataNum] = size(DS.input);
plot(1:featureNum, 100*recogRate1, 'o-', 1:featureNum, 100*recogRate2, '^-'); grid on
legend({'Raw data', 'Normalized data'}, 'location', 'northOutside', 'orientation', 'horizontal');
xlabel('No. of projected features based on LDA');
ylabel('LOO recognition rates using KNNC (%)');

From the above example we can see that, for this application, input normalization improves the recognition rate considerably; in particular, when the projected dimensionality is 6, the recognition rate reaches 100%.
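Input normalization of this kind typically means z-scoring each feature. The following is a minimal Python sketch, under the assumption that inputNormalize performs a per-feature z-score; the toolbox's exact behavior may differ:

```python
import numpy as np

def input_normalize(X):
    """Z-score each feature (each row of the d x n matrix X):
    subtract the feature's mean and divide by its standard
    deviation, giving every feature zero mean and unit variance."""
    mu = X.mean(axis=1, keepdims=True)
    sigma = X.std(axis=1, keepdims=True)
    return (X - mu) / sigma

# Toy usage: two features on very different scales
X = np.array([[1., 2., 3.],
              [10., 20., 60.]])
Z = input_normalize(X)    # each row of Z now has mean 0 and std 1
```

Without such a step, features with large numeric ranges (as in the WINE data) can dominate the Euclidean distances used by KNNC.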

Hint
The effect of input normalization on the recognition rate varies with the application and the classifier; it does not always improve the recognition rate.

Hint
In fact, the recognition rates reported in this chapter are slightly optimistic, because we use all of the data to compute the LDA projection before running the leave-one-out test of KNNC. In other words, LDA has already "peeked at" all of the data, so the recognition rates obtained from this test are a bit higher than they would be in practice.

If the data dimensionality is greater than the number of data entries, LDA usually produces erroneous results, because several of the matrices involved may be close to singular, causing some of the computed eigenvalues to be complex. A common remedy is to first use PCA to project the data onto a lower-dimensional space, preserving as much of the data's spread as possible, and then apply LDA to find the projection that is best for classification.
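A minimal sketch of that PCA pre-projection step in Python/NumPy (illustrative only; the toolbox presumably provides its own PCA routine):

```python
import numpy as np

def pca_reduce(X, k):
    """Project the d x n data matrix X onto its first k principal
    components, keeping as much of the overall spread (variance)
    as possible. Applying LDA after this step avoids the
    near-singular scatter matrices that arise when d exceeds the
    number of entries n."""
    Xc = X - X.mean(axis=1, keepdims=True)          # center the data
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k].T @ Xc                          # k x n reduced data

# 10-dimensional data but only 5 entries: the raw scatter matrices
# would be singular, so reduce to 3 dimensions before running LDA.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 5))
Z = pca_reduce(X, 3)     # now 3 x 5, suitable as LDA input
```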


Data Clustering and Pattern Recognition