6-2 Methods for Recognition Rate Estimation

In this section, we shall introduce methods for estimating the recognition rate (or error rate) of a classifier constructed by any pattern recognition method.

If we use the same dataset for both training and test, the obtained recognition rate is referred to as the inside-test recognition rate, the resubstitution recognition rate, or the apparent recognition rate. This inside-test result is usually overly optimistic, since the test is based on exactly the same data that was used for training; the classifier is, so to speak, a player acting as its own referee. In particular, if we use 1-NNR as our classifier, the inside-test recognition rate will always be 100%, since each data point is its own nearest neighbor.

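For instance, here is a minimal inside-test sketch using the quadratic classifier that appears later in Example 1 (assuming, as there, that the third output of qcTrain and qcEval is a recognition rate expressed as a fraction):

DS=prData('iris');	% Load the IRIS dataset
[qcPrm, logProb1, tRr]=qcTrain(DS);	% Design the classifier using all the data
[computedClass, logProb2, rr]=qcEval(DS, qcPrm);	% Test on exactly the same data
fprintf('Inside-test RR=%.2f%%\n', rr*100);	% Usually overly optimistic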

Though the inside-test recognition rate is not objective, it can serve as an upper bound of the true recognition rate. In general, we use the inside-test recognition rate as a first step in examining our classifier. If even the inside-test recognition rate is low, there are two possible reasons:

  1. The classifier is not suitable for the given dataset.
  2. The features of the dataset do not have enough discriminating power.

However, even if the inside-test recognition rate is high, it does not mean we have reached a reliable classifier. Usually we need to prepare a set of "unseen" data to test the classifier, as explained next.

After a classifier is constructed, it will usually face unseen data in further applications. Therefore it is better to prepare a set of "unseen" data for evaluating the recognition rate of the classifier. In practice, we usually divide the available dataset into two disjoint parts: a training set and a test set. The training set is used for designing the classifier, while the test set is used for evaluating its recognition rate. The obtained recognition rate is referred to as the outside-test recognition rate or the holdout recognition rate, with the following characteristics:

  1. The estimate is more objective, since the test data are not seen during training.
  2. The estimate tends to be pessimistic (a lower bound of the true recognition rate), since only part of the data is used for designing the classifier.
  3. The result depends on how the dataset happens to be split.

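As an illustration, here is a minimal holdout sketch (assuming, per the usual convention of this toolbox, that DS.input stores one data entry per column and DS.output holds the corresponding class labels):

DS=prData('iris');
n=length(DS.output);
TS=DS; TS.input=DS.input(:, 1:2:n); TS.output=DS.output(1:2:n);	% Training set (odd-indexed entries)
VS=DS; VS.input=DS.input(:, 2:2:n); VS.output=DS.output(2:2:n);	% Test set (even-indexed entries)
[qcPrm, logProb1, tRr]=qcTrain(TS);	% Design the classifier on the training set only
[computedClass, logProb2, vRr]=qcEval(VS, qcPrm);	% Outside test on unseen data
fprintf('Outside-test (holdout) RR=%.2f%%\n', vRr*100);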

We can extend the concept of the outside test to the so-called two-fold cross validation, or two-way outside-test recognition rate. Namely, we divide the dataset into parts A and B of equal size. In the first run, we use part A as the training set and part B as the test set; in the second run, we reverse the roles of parts A and B. The overall recognition rate is the average of the two outside-test recognition rates.

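Continuing the holdout sketch above, two-fold cross validation amounts to swapping the roles of the two halves and averaging the two outside-test rates:

[qcPrmA, logProb1, tRrA]=qcTrain(TS); [computedB, logProb2, rrB]=qcEval(VS, qcPrmA);	% Train on part A, test on part B
[qcPrmB, logProb3, tRrB]=qcTrain(VS); [computedA, logProb4, rrA]=qcEval(TS, qcPrmB);	% Train on part B, test on part A
twoFoldRr=(rrA+rrB)/2;	% The two parts have equal size, so a plain average suffices
fprintf('Two-fold CV RR=%.2f%%\n', twoFoldRr*100);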

In two-fold cross validation, the dataset is divided into two equal-size parts, which leads to slightly lower outside-test recognition rates since each classifier can only use 50% of the dataset for training. In order to estimate the recognition rate better, we can use m-fold cross validation, in which the dataset S is divided into m subsets S1, S2, ..., Sm of about equal size, satisfying the following conditions:

  1. S1 ∪ S2 ∪ ... ∪ Sm = S (the subsets cover the whole dataset).
  2. Si ∩ Sj = ∅ for i ≠ j (the subsets are mutually disjoint).
  3. |Si| ≈ |S|/m (the subsets are of about equal size).

Then we estimate the recognition rate according to the following steps:

  1. Use Si as the test set, and all the other data S-Si as the training set to design a classifier. Test the classifier on Si to obtain an outside-test recognition rate.
  2. Repeat the above step for each Si, i = 1 to m. Compute the overall average outside-test recognition rate.
The procedure above is known as m-fold cross validation, and the obtained error rate is also called the rotation error rate.

The following example demonstrates the use of 5-fold cross validation on the IRIS dataset.

Example 1: rreViaCv01.m

dataSet=prData('iris');
m=5;
cvOpt=cvDataGen('defaultOpt');
cvOpt.foldNum=m;
cvOpt.cvDataType='full';
cvData=cvDataGen(dataSet, cvOpt);
foldNum=length(cvData);	% Actual no. of folds
for i=1:foldNum
	[qcPrm, logProb1, tRr(i)]=qcTrain(cvData(i).TS);	% Train on fold i's training set
	tSize(i)=length(cvData(i).TS.output);
	[computedClass, logProb2, vRr(i)]=qcEval(cvData(i).VS, qcPrm);	% Validate on fold i's validation set
	vSize(i)=length(cvData(i).VS.output);
end
tRrAll=dot(tRr, tSize)/sum(tSize);	% Size-weighted average training RR
vRrAll=dot(vRr, vSize)/sum(vSize);	% Size-weighted average validating RR
plot(1:foldNum, tRr, '.-', 1:foldNum, vRr, '.-');
xlabel('Folds'); ylabel('Recog. rate (%)');
legend('Training RR', 'Validating RR', 'location', 'northOutside', 'orientation', 'horizontal');
fprintf('Training RR=%.2f%%, Validating RR=%.2f%%\n', tRrAll*100, vRrAll*100);

Output:
Training RR=97.83%, Validating RR=97.33%

Since this type of performance evaluation via cross validation is used so often, we have created a function for this purpose, as shown in the next example, where 10-fold cross validation is applied to the IRIS dataset:

Example 2: perfCv4qc01.m

DS=prData('iris');
showPlot=1;
foldNum=10;
classifier='qc';	% Use the quadratic classifier
[vRrAll, tRrAll]=perfCv(DS, classifier, [], foldNum, showPlot);	% 10-fold cross validation
fprintf('Training RR=%.2f%%, Validating RR=%.2f%%\n', tRrAll*100, vRrAll*100);

Output:
Training RR=98.07%, Validating RR=98.00%

A larger m requires more computation, since m classifiers must be constructed. In practice, we select the value of m based on the size of the dataset and the time needed to construct the specific classifier. In particular, when m is equal to the size n of the dataset, each test set contains exactly one data entry; this extreme case is the leave-one-out method described next.

The leave-one-out method, also known as the jackknife procedure, is among the most commonly used methods for recognition rate estimation. It is also the most objective one, since almost all the data (all entries except one) is used for constructing each classifier, while the single remaining entry is used for testing. It involves the following steps:

¡u"¤@¦¸¬D¤@­Ó"¿ù»~²v¡v¡]leave-one-out error rate¡^¬O¼Ë¦¡¿ë»{¤¤³Ì±`³Q¥Î¨ìªº¿ù»~²v¹w¦ô¤èªk¡A¦]¬°¨C­Ó´ú¸Õ¸ê®Æ³£¨S¦³°Ñ»P¤ÀÃþ¾¹ªº³]­p¡A¦]¦¹¤]¬O¤@ºØ¸û¬°¤½¥­¡B«ÈÆ[ªº¿ù»~²v¹w¦ô¤è¦¡¡C¾ã­Ó¿ù»~²v¹w¦ôºtºâ¹Lµ{¤SºÙÅI¤M¦¡¬yµ{¡]jackknife procedure¡^¡A¨ä¥D­n¨BÆJ¦p¤U©Ò­z¡G

  1. Use xi (the i-th entry in the dataset) as the test set, and all the other data as the training set to design a classifier. Test the classifier on xi to obtain an outside-test recognition rate (which is either 0% or 100%).
  2. Repeat the above step for each xi, i = 1 to n. Compute the overall average outside-test recognition rate.

The obtained recognition rate is known as the leave-one-out (LOO for short) recognition rate. The leave-one-out method has the following characteristics:

  1. It is the most objective estimate, since each classifier is designed with almost all of the data and tested on an entry it has never seen.
  2. It is computationally expensive, since n classifiers must be constructed for a dataset of size n. Hence simple classifiers, such as KNNC, are usually adopted for LOO estimation.

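Conceptually, the LOO procedure is just a loop over all data entries. Here is a minimal sketch using the quadratic classifier (again assuming that DS.input stores one entry per column and DS.output holds the labels); note that it is slow, since one classifier is trained per entry:

DS=prData('iris');
n=length(DS.output);
correct=0;
for i=1:n
	TS=DS; TS.input(:,i)=[]; TS.output(i)=[];	% All data except entry i
	VS=DS; VS.input=DS.input(:,i); VS.output=DS.output(i);	% The single left-out entry
	[qcPrm, logProb1, tRr]=qcTrain(TS);	% Design the classifier without entry i
	computedClass=qcEval(VS, qcPrm);	% Classify the left-out entry
	correct=correct+(computedClass==VS.output);	% Each entry scores either 0% or 100%
end
fprintf('LOO RR=%.2f%%\n', correct/n*100);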

In the following example, we use the function knncLoo.m to find the LOO recognition rate based on 1-NNR. Each misclassified data point is labeled with a cross for easy visual inspection, as follows:

Example 3: knncLoo01.m

DS=prData('random2');
dsScatterPlot(DS);	% Scatter plot of the dataset
knncPrm.k=1;	% Use 1-NNR
plotOpt=1;	% Label misclassified points in the plot
[recogRate, computed, nearestIndex]=knncLoo(DS, knncPrm, plotOpt);

You can change the value of knncPrm.k to obtain the LOO recognition rates of KNNC with various values of k.

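For instance, a short sketch that sweeps several values of k with plotting turned off (assuming, as in the examples above, that the returned recognition rate is a fraction between 0 and 1):

DS=prData('random2');
plotOpt=0;	% Suppress per-run plots
for k=1:2:9	% Try k = 1, 3, 5, 7, 9
	knncPrm.k=k;
	recogRate=knncLoo(DS, knncPrm, plotOpt);
	fprintf('k=%d: LOO RR=%.2f%%\n', k, recogRate*100);
end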

