The goal of feature selection is to pick, from the original set of features, the subset with the best discriminating power, so that the recognition rate is as high as possible. Selecting such discriminative features brings several benefits:
The recognition rate is improved.
Less computing power is needed for constructing the classifier.
The selected features can help us understand the causal relationship between features and classes.
§Ṳ́]¥i¥H¨Ï¥Î¼Æ¾Çªº¤è¦¡¨Ó»¡©ú¯S¼x¿ï¨ú¡C°²³]§Ú̦³¤@²Õ¯S¼x $V=\{v_1, v_2, \dots, v_d\}$¡A§Ú̧Ʊæ³z¹L¬YºØ¤ÀÃþ¾¹¨Ó±o¨ì¿ëÃѲv J¡]£»¡^¡A¦¹¿ëÃѲv¬O©Ò¿ï¨ú¯S¼xªº¨ç¼Æ¡F¦Ó¯S¼x¿ï¨úªº¥Ø¼Ð¡A´N¬On±q V ¤¤§ä¥X¤@²ÕŲ§O¯à¤O³Ì¨Îªºªº¤l¶°¦X S¡A¨Ï±o $J(S) \geq J(T)$¡A¨ä¤¤ T ¬O¥ô¦ó¥Ñ W ©Ò§Î¦¨ªº¤l¶°¦X¡C
In general, there is no shortcut for finding the subset with the best discriminating power: apart from simple heuristics, the only guaranteed approach is exhaustive search, whose computational load is prohibitive unless the dataset and the number of features are small. As a result, approximately optimal (heuristic) methods are used most often. Among these, sequential forward selection (SFS), proposed by Whitney in 1971, is the most widely used; its steps are as follows:
1. Use the k-nearest-neighbor classifier (KNNC) with leave-one-out (LOO) estimation of the recognition rate.
2. The first selected feature is the single feature with the highest recognition rate.
3. The next selected feature is the one that, when combined with the features already selected, gives the highest recognition rate.
4. Repeat step 3 until all features have been selected.
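The steps above can be sketched in Python as follows. This is a minimal illustration, not the toolbox's actual implementation: the 1-nearest-neighbor rule with squared Euclidean distance stands in for KNNC, and all names are made up for the example.

```python
# A minimal sketch of SFS with a 1-nearest-neighbor classifier and
# leave-one-out (LOO) accuracy estimation. Names are illustrative.

def loo_1nn_accuracy(X, y, feats):
    """LOO accuracy of a 1-NN rule restricted to the given feature indices."""
    correct = 0
    for i in range(len(X)):
        best_d, best_j = float("inf"), -1
        for j in range(len(X)):
            if j == i:            # leave sample i out
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in feats)
            if d < best_d:
                best_d, best_j = d, j
        correct += (y[best_j] == y[i])
    return correct / len(X)

def sfs(X, y, max_feats=None):
    """Greedy SFS (steps 2-4 above); returns a (selected, accuracy) history."""
    d = len(X[0])
    remaining, selected, history = set(range(d)), [], []
    while remaining and (max_feats is None or len(selected) < max_feats):
        # Step 3: add the candidate that maximizes LOO accuracy when
        # combined with the already-selected features.
        best_f = max(remaining,
                     key=lambda f: loo_1nn_accuracy(X, y, selected + [f]))
        selected.append(best_f)
        remaining.remove(best_f)
        history.append((list(selected), loo_1nn_accuracy(X, y, selected)))
    return history

# Toy usage: feature 1 separates the classes, feature 0 is noise,
# so SFS picks feature 1 first.
X = [[0, 0.0], [1, 0.1], [0, 0.2], [1, 1.0], [0, 1.1], [1, 1.2]]
y = [0, 0, 0, 1, 1, 1]
print(sfs(X, y, max_feats=1))  # -> [([1], 1.0)]
```

The greedy structure is what makes SFS cheap: each round scores only the remaining candidates instead of all subsets.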
We can apply this method to feature selection on the IRIS dataset, which contains 150 entries in three classes, each with four features. The function inputSelectSequential.m performs SFS, as shown in the following example:
Example 1: feaSelBySfs4iris01.m
DS=prData('iris');
inputNum=size(DS.input, 1);
DS.inputName=cellstr(int2str((1:inputNum)'));
inputSelectSequential(DS);
Construct 10 knnc models, each with up to 4 inputs selected from 4 candidates...
Selecting input 1:
Model 1/10: selected={1} => Recog. rate = 58.7%
Model 2/10: selected={2} => Recog. rate = 48.0%
Model 3/10: selected={3} => Recog. rate = 88.0%
Model 4/10: selected={4} => Recog. rate = 88.0%
Currently selected inputs: 3
Selecting input 2:
Model 5/10: selected={3, 1} => Recog. rate = 90.7%
Model 6/10: selected={3, 2} => Recog. rate = 90.7%
Model 7/10: selected={3, 4} => Recog. rate = 95.3%
Currently selected inputs: 3, 4
Selecting input 3:
Model 8/10: selected={3, 4, 1} => Recog. rate = 95.3%
Model 9/10: selected={3, 4, 2} => Recog. rate = 95.3%
Currently selected inputs: 3, 4, 1
Selecting input 4:
Model 10/10: selected={3, 4, 1, 2} => Recog. rate = 96.0%
Currently selected inputs: 3, 4, 1, 2
Overall maximal recognition rate = 96.0%.
Selected 4 inputs (out of 4): 3, 4, 1, 2
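The model counts reported by these routines follow directly from the two strategies: selecting up to k of d features with SFS trains d + (d−1) + … + (d−k+1) classifiers, while exhaustive search trains one classifier per non-empty subset, i.e. 2^d − 1. A quick Python check, independent of the MATLAB toolbox:

```python
# Model counts for the two feature-selection strategies.

def sfs_model_count(d, k):
    """Classifiers trained by SFS when selecting up to k of d features."""
    return sum(d - i for i in range(k))

def exhaustive_model_count(d):
    """One classifier per non-empty feature subset."""
    return 2 ** d - 1

print(sfs_model_count(4, 4), exhaustive_model_count(4))    # -> 10 15
print(sfs_model_count(13, 6), exhaustive_model_count(13))  # -> 63 8191
```

These match the "10 knnc models" above for IRIS, and the 63 and 8191 counts that appear in the WINE examples later in this section.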
From the above output, SFS evaluates the candidate feature sets {1}, {2}, {3}, {4}, {3,1}, {3,2}, {3,4}, {3,4,1}, {3,4,2}, {3,4,1,2} in turn, and the overall maximum of 96.0% is reached only when all four features are used. Note, however, that {3,4} alone already achieves 95.3%. SFS is by nature a heuristic method: it is fast, but it does not guarantee finding the best feature subset.
If we switch to exhaustive search instead, evaluating every possible feature combination, the computation is slower; but since the IRIS dataset is small and has only four features, we can directly use inputSelectExhaustive.m for exhaustive feature selection, as follows:
Example 2: feaSelByEs4iris01.m
DS=prData('iris');
inputNum=size(DS.input, 1);
DS.inputName=cellstr(int2str((1:inputNum)'));
inputSelectExhaustive(DS);
Construct 15 KNN models, each with up to 4 inputs selected from 4 candidates...
modelIndex 1/15: selected={1} => Recog. rate = 58.666667%
modelIndex 2/15: selected={2} => Recog. rate = 48.000000%
modelIndex 3/15: selected={3} => Recog. rate = 88.000000%
modelIndex 4/15: selected={4} => Recog. rate = 88.000000%
modelIndex 5/15: selected={1, 2} => Recog. rate = 70.666667%
modelIndex 6/15: selected={1, 3} => Recog. rate = 90.666667%
modelIndex 7/15: selected={1, 4} => Recog. rate = 92.666667%
modelIndex 8/15: selected={2, 3} => Recog. rate = 90.666667%
modelIndex 9/15: selected={2, 4} => Recog. rate = 92.666667%
modelIndex 10/15: selected={3, 4} => Recog. rate = 95.333333%
modelIndex 11/15: selected={1, 2, 3} => Recog. rate = 93.333333%
modelIndex 12/15: selected={1, 2, 4} => Recog. rate = 94.666667%
modelIndex 13/15: selected={1, 3, 4} => Recog. rate = 95.333333%
modelIndex 14/15: selected={2, 3, 4} => Recog. rate = 95.333333%
modelIndex 15/15: selected={1, 2, 3, 4} => Recog. rate = 96.000000%
Overall max recognition rate = 96.0%.
Selected 4 inputs (out of 4): 1, 2, 3, 4
Since both the size and the feature dimension of the IRIS dataset are small, exhaustive search only needs to evaluate 15 (= 2^4 - 1) classifiers. The resulting best recognition rate is 96%, achieved with the full feature set {1, 2, 3, 4}.
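The exhaustive strategy itself is easy to sketch: enumerate every non-empty subset of the d features and keep the best-scoring one. In the sketch below the scoring function is a parameter (any LOO recognition-rate estimate could be plugged in), and the names and toy scorer are illustrative only.

```python
# A sketch of exhaustive feature selection: score every non-empty subset
# (2^d - 1 of them) and keep the best. `score` stands in for any
# recognition-rate estimate, e.g. LOO with a nearest-neighbor classifier.
from itertools import combinations

def exhaustive_select(n_feats, score):
    """Return (best_subset, best_score) over all non-empty feature subsets."""
    best_subset, best_score = None, float("-inf")
    for k in range(1, n_feats + 1):
        for subset in combinations(range(n_feats), k):
            s = score(subset)
            if s > best_score:
                best_subset, best_score = subset, s
    return best_subset, best_score

# Toy usage: reward subsets containing features 2 and 3, penalize size;
# the optimum is then exactly (2, 3).
best, s = exhaustive_select(4, lambda sub: len(set(sub) & {2, 3}) - 0.1 * len(sub))
print(best)  # -> (2, 3)
```

Unlike SFS, this is guaranteed to find the best subset under the chosen score, at exponential cost in the number of features.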
In the above two examples, the optimal recognition rate occurs when all features are selected. This is not the general case, however: a subset of features can sometimes outperform the full set. The following example demonstrates that only 3 features are needed to achieve the maximum recognition rate when up to 6 features are allowed.
Example 3: feaSelBySfs4wine01.m
DS=prData('wine');
inputNum=size(DS.input, 1);
DS.inputName=cellstr(int2str((1:inputNum)'));
inputSelectSequential(DS, 6);
Construct 63 knnc models, each with up to 6 inputs selected from 13 candidates...
Selecting input 1:
Model 1/63: selected={ 1} => Recog. rate = 59.6%
Model 2/63: selected={ 2} => Recog. rate = 55.1%
Model 3/63: selected={ 3} => Recog. rate = 37.6%
Model 4/63: selected={ 4} => Recog. rate = 40.4%
Model 5/63: selected={ 5} => Recog. rate = 50.0%
Model 6/63: selected={ 6} => Recog. rate = 60.1%
Model 7/63: selected={ 7} => Recog. rate = 70.8%
Model 8/63: selected={ 8} => Recog. rate = 35.4%
Model 9/63: selected={ 9} => Recog. rate = 49.4%
Model 10/63: selected={10} => Recog. rate = 64.0%
Model 11/63: selected={11} => Recog. rate = 56.7%
Model 12/63: selected={12} => Recog. rate = 58.4%
Model 13/63: selected={13} => Recog. rate = 66.9%
Currently selected inputs: 7
Selecting input 2:
Model 14/63: selected={ 7, 1} => Recog. rate = 89.3%
Model 15/63: selected={ 7, 2} => Recog. rate = 74.7%
Model 16/63: selected={ 7, 3} => Recog. rate = 71.9%
Model 17/63: selected={ 7, 4} => Recog. rate = 76.4%
Model 18/63: selected={ 7, 5} => Recog. rate = 79.2%
Model 19/63: selected={ 7, 6} => Recog. rate = 70.8%
Model 20/63: selected={ 7, 8} => Recog. rate = 75.8%
Model 21/63: selected={ 7, 9} => Recog. rate = 75.3%
Model 22/63: selected={ 7, 10} => Recog. rate = 93.3%
Model 23/63: selected={ 7, 11} => Recog. rate = 88.8%
Model 24/63: selected={ 7, 12} => Recog. rate = 75.3%
Model 25/63: selected={ 7, 13} => Recog. rate = 71.3%
Currently selected inputs: 7, 10
Selecting input 3:
Model 26/63: selected={ 7, 10, 1} => Recog. rate = 93.8%
Model 27/63: selected={ 7, 10, 2} => Recog. rate = 92.7%
Model 28/63: selected={ 7, 10, 3} => Recog. rate = 91.6%
Model 29/63: selected={ 7, 10, 4} => Recog. rate = 84.8%
Model 30/63: selected={ 7, 10, 5} => Recog. rate = 88.2%
Model 31/63: selected={ 7, 10, 6} => Recog. rate = 93.8%
Model 32/63: selected={ 7, 10, 8} => Recog. rate = 93.8%
Model 33/63: selected={ 7, 10, 9} => Recog. rate = 93.8%
Model 34/63: selected={ 7, 10, 11} => Recog. rate = 93.8%
Model 35/63: selected={ 7, 10, 12} => Recog. rate = 93.3%
Model 36/63: selected={ 7, 10, 13} => Recog. rate = 76.4%
Currently selected inputs: 7, 10, 1
Selecting input 4:
Model 37/63: selected={ 7, 10, 1, 2} => Recog. rate = 92.7%
Model 38/63: selected={ 7, 10, 1, 3} => Recog. rate = 92.7%
Model 39/63: selected={ 7, 10, 1, 4} => Recog. rate = 90.4%
Model 40/63: selected={ 7, 10, 1, 5} => Recog. rate = 92.7%
Model 41/63: selected={ 7, 10, 1, 6} => Recog. rate = 92.7%
Model 42/63: selected={ 7, 10, 1, 8} => Recog. rate = 93.8%
Model 43/63: selected={ 7, 10, 1, 9} => Recog. rate = 93.8%
Model 44/63: selected={ 7, 10, 1, 11} => Recog. rate = 93.3%
Model 45/63: selected={ 7, 10, 1, 12} => Recog. rate = 92.7%
Model 46/63: selected={ 7, 10, 1, 13} => Recog. rate = 77.0%
Currently selected inputs: 7, 10, 1, 8
Selecting input 5:
Model 47/63: selected={ 7, 10, 1, 8, 2} => Recog. rate = 93.3%
Model 48/63: selected={ 7, 10, 1, 8, 3} => Recog. rate = 91.6%
Model 49/63: selected={ 7, 10, 1, 8, 4} => Recog. rate = 90.4%
Model 50/63: selected={ 7, 10, 1, 8, 5} => Recog. rate = 92.7%
Model 51/63: selected={ 7, 10, 1, 8, 6} => Recog. rate = 93.3%
Model 52/63: selected={ 7, 10, 1, 8, 9} => Recog. rate = 93.3%
Model 53/63: selected={ 7, 10, 1, 8, 11} => Recog. rate = 93.8%
Model 54/63: selected={ 7, 10, 1, 8, 12} => Recog. rate = 92.7%
Model 55/63: selected={ 7, 10, 1, 8, 13} => Recog. rate = 77.0%
Currently selected inputs: 7, 10, 1, 8, 11
Selecting input 6:
Model 56/63: selected={ 7, 10, 1, 8, 11, 2} => Recog. rate = 93.3%
Model 57/63: selected={ 7, 10, 1, 8, 11, 3} => Recog. rate = 93.8%
Model 58/63: selected={ 7, 10, 1, 8, 11, 4} => Recog. rate = 89.9%
Model 59/63: selected={ 7, 10, 1, 8, 11, 5} => Recog. rate = 92.7%
Model 60/63: selected={ 7, 10, 1, 8, 11, 6} => Recog. rate = 92.7%
Model 61/63: selected={ 7, 10, 1, 8, 11, 9} => Recog. rate = 93.3%
Model 62/63: selected={ 7, 10, 1, 8, 11, 12} => Recog. rate = 92.7%
Model 63/63: selected={ 7, 10, 1, 8, 11, 13} => Recog. rate = 77.0%
Currently selected inputs: 7, 10, 1, 8, 11, 3
Overall maximal recognition rate = 93.8%.
Selected 3 inputs (out of 13): 7, 10, 1
Moreover, we can observe that whenever feature 13 is selected, the recognition rate drops. This suggests that the range of feature 13 is very different from those of the other features; indeed, this was already confirmed by the example in the previous section. As a result, we should normalize the features before conducting feature selection, as shown in the next example.
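A common way to fix such range mismatches is the z-score transform: shift each feature to zero mean and scale it to unit variance, so that no single feature dominates the distance computation. Whether the toolbox's inputNormalize performs exactly this transform is an assumption here (min-max scaling is another common option); a minimal Python sketch:

```python
# Per-feature z-score normalization: each column is shifted to zero mean
# and scaled to unit (population) variance. Constant columns are mapped
# to all zeros to avoid division by zero.
from statistics import mean, pstdev

def normalize_features(X):
    """Z-score each column of X (a list of rows)."""
    cols = list(zip(*X))                      # transpose: one tuple per feature
    normed = []
    for col in cols:
        m, s = mean(col), pstdev(col)
        normed.append([(v - m) / s if s > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*normed)]

# Toy usage: the second feature's range is 100x the first's; after
# normalization both columns are on the same scale.
Xn = normalize_features([[1, 100], [2, 200], [3, 300]])
```

After this transform, a distance-based classifier such as KNNC weighs every feature equally.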
Example 4: feaSelBySfs4wine02.m
DS=prData('wine');
inputNum=size(DS.input, 1);
DS.inputName=cellstr(int2str((1:inputNum)'));
DS.input=inputNormalize(DS.input); % Feature normalization
inputSelectSequential(DS, 6);
Construct 63 knnc models, each with up to 6 inputs selected from 13 candidates...
Selecting input 1:
Model 1/63: selected={ 1} => Recog. rate = 57.9%
Model 2/63: selected={ 2} => Recog. rate = 56.2%
Model 3/63: selected={ 3} => Recog. rate = 37.6%
Model 4/63: selected={ 4} => Recog. rate = 39.3%
Model 5/63: selected={ 5} => Recog. rate = 50.0%
Model 6/63: selected={ 6} => Recog. rate = 60.1%
Model 7/63: selected={ 7} => Recog. rate = 69.7%
Model 8/63: selected={ 8} => Recog. rate = 35.4%
Model 9/63: selected={ 9} => Recog. rate = 48.9%
Model 10/63: selected={10} => Recog. rate = 64.0%
Model 11/63: selected={11} => Recog. rate = 56.2%
Model 12/63: selected={12} => Recog. rate = 58.4%
Model 13/63: selected={13} => Recog. rate = 67.4%
Currently selected inputs: 7
Selecting input 2:
Model 14/63: selected={ 7, 1} => Recog. rate = 89.3%
Model 15/63: selected={ 7, 2} => Recog. rate = 74.2%
Model 16/63: selected={ 7, 3} => Recog. rate = 73.6%
Model 17/63: selected={ 7, 4} => Recog. rate = 77.5%
Model 18/63: selected={ 7, 5} => Recog. rate = 87.6%
Model 19/63: selected={ 7, 6} => Recog. rate = 71.3%
Model 20/63: selected={ 7, 8} => Recog. rate = 73.0%
Model 21/63: selected={ 7, 9} => Recog. rate = 75.8%
Model 22/63: selected={ 7, 10} => Recog. rate = 92.7%
Model 23/63: selected={ 7, 11} => Recog. rate = 86.5%
Model 24/63: selected={ 7, 12} => Recog. rate = 77.5%
Model 25/63: selected={ 7, 13} => Recog. rate = 84.8%
Currently selected inputs: 7, 10
Selecting input 3:
Model 26/63: selected={ 7, 10, 1} => Recog. rate = 94.4%
Model 27/63: selected={ 7, 10, 2} => Recog. rate = 93.3%
Model 28/63: selected={ 7, 10, 3} => Recog. rate = 93.8%
Model 29/63: selected={ 7, 10, 4} => Recog. rate = 89.3%
Model 30/63: selected={ 7, 10, 5} => Recog. rate = 93.3%
Model 31/63: selected={ 7, 10, 6} => Recog. rate = 92.7%
Model 32/63: selected={ 7, 10, 8} => Recog. rate = 92.7%
Model 33/63: selected={ 7, 10, 9} => Recog. rate = 91.6%
Model 34/63: selected={ 7, 10, 11} => Recog. rate = 92.7%
Model 35/63: selected={ 7, 10, 12} => Recog. rate = 92.7%
Model 36/63: selected={ 7, 10, 13} => Recog. rate = 96.6%
Currently selected inputs: 7, 10, 13
Selecting input 4:
Model 37/63: selected={ 7, 10, 13, 1} => Recog. rate = 94.9%
Model 38/63: selected={ 7, 10, 13, 2} => Recog. rate = 94.9%
Model 39/63: selected={ 7, 10, 13, 3} => Recog. rate = 94.4%
Model 40/63: selected={ 7, 10, 13, 4} => Recog. rate = 93.3%
Model 41/63: selected={ 7, 10, 13, 5} => Recog. rate = 95.5%
Model 42/63: selected={ 7, 10, 13, 6} => Recog. rate = 94.9%
Model 43/63: selected={ 7, 10, 13, 8} => Recog. rate = 94.9%
Model 44/63: selected={ 7, 10, 13, 9} => Recog. rate = 93.8%
Model 45/63: selected={ 7, 10, 13, 11} => Recog. rate = 96.6%
Model 46/63: selected={ 7, 10, 13, 12} => Recog. rate = 95.5%
Currently selected inputs: 7, 10, 13, 11
Selecting input 5:
Model 47/63: selected={ 7, 10, 13, 11, 1} => Recog. rate = 97.2%
Model 48/63: selected={ 7, 10, 13, 11, 2} => Recog. rate = 96.6%
Model 49/63: selected={ 7, 10, 13, 11, 3} => Recog. rate = 94.4%
Model 50/63: selected={ 7, 10, 13, 11, 4} => Recog. rate = 95.5%
Model 51/63: selected={ 7, 10, 13, 11, 5} => Recog. rate = 94.9%
Model 52/63: selected={ 7, 10, 13, 11, 6} => Recog. rate = 93.8%
Model 53/63: selected={ 7, 10, 13, 11, 8} => Recog. rate = 93.3%
Model 54/63: selected={ 7, 10, 13, 11, 9} => Recog. rate = 94.9%
Model 55/63: selected={ 7, 10, 13, 11, 12} => Recog. rate = 94.9%
Currently selected inputs: 7, 10, 13, 11, 1
Selecting input 6:
Model 56/63: selected={ 7, 10, 13, 11, 1, 2} => Recog. rate = 97.2%
Model 57/63: selected={ 7, 10, 13, 11, 1, 3} => Recog. rate = 97.2%
Model 58/63: selected={ 7, 10, 13, 11, 1, 4} => Recog. rate = 96.6%
Model 59/63: selected={ 7, 10, 13, 11, 1, 5} => Recog. rate = 97.8%
Model 60/63: selected={ 7, 10, 13, 11, 1, 6} => Recog. rate = 96.6%
Model 61/63: selected={ 7, 10, 13, 11, 1, 8} => Recog. rate = 97.2%
Model 62/63: selected={ 7, 10, 13, 11, 1, 9} => Recog. rate = 96.6%
Model 63/63: selected={ 7, 10, 13, 11, 1, 12} => Recog. rate = 96.6%
Currently selected inputs: 7, 10, 13, 11, 1, 5
Overall maximal recognition rate = 97.8%.
Selected 6 inputs (out of 13): 7, 10, 13, 11, 1, 5
The only difference between this example and the previous one is that the features are normalized, which enhances the recognition rate from 93.8% to 97.8%. This demonstrates the importance of input (feature) normalization.
Hint: In fact, by using exhaustive search over all 13 features, we can achieve a recognition rate of 99.4% with the selected features {1, 3, 4, 9, 10, 11, 12, 13}, after conducting 8191 LOO tests.
However, it should be noted that input (feature) normalization does not guarantee improvement. As a result, it is always a good practice to try both the original and the normalized datasets for a given choice of classifier.
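One way to act on this advice programmatically is to score both variants with the same evaluation routine and keep whichever wins. The sketch below uses made-up names and stub functions purely for illustration:

```python
# Since normalization may help or hurt, evaluate both the raw and the
# normalized data with the same scoring routine and keep the better one.
def pick_preprocessing(X, y, evaluate, normalize):
    """Return ('raw', X) or ('normalized', normalize(X)), whichever scores higher."""
    candidates = [("raw", X), ("normalized", normalize(X))]
    return max(candidates, key=lambda pair: evaluate(pair[1], y))

# Toy usage with stubs: this evaluator prefers larger values, so the
# doubling "normalizer" wins here.
tag, _ = pick_preprocessing([[1.0]], [0],
                            evaluate=lambda X, y: X[0][0],
                            normalize=lambda X: [[2 * v for v in row] for row in X])
print(tag)  # -> normalized
```

In practice, `evaluate` would be the same LOO recognition-rate estimate used for feature selection, so the comparison is apples-to-apples.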
Some commonly used methods for feature selection, including sequential forward selection and exhaustive search, are listed here in roughly increasing order of computational complexity:
One-pass ranking
Sequential forward selection
Generalized sequential forward selection
Sequential backward selection
Generalized sequential backward selection
"Add m, remove n" selection
Generalized "add m, remove n" selection
Exhaustive search
Reference:
A. Whitney, "A direct method of nonparametric measurement selection," IEEE Transactions on Computers, vol. 20, pp. 1100-1103, 1971.
Data Clustering and Pattern Recognition (資料分群與樣式辨認)