11-5 LDA for Face Recognition

In the previous section, we explained the use of PCA for face recognition, which reduces the feature dimension from 10304 (=112*92) to 400. PCA effectively retains the data variance along the first few dimensions. However, it does not consider the classes (identities) of the dataset.
On the other hand, we can apply LDA after PCA to project the data along the dimensions with better discriminative power. It should be noted that:

There is no easy way to compute the eigenvectors of LDA using the original 10304 features: with only 400 images, the 10304-by-10304 within-class scatter matrix is singular. As a result, we need to apply PCA first to reduce the dimension to 400.
Since the data count is also 400, we cannot use all 400 features for computing the eigenvectors of LDA. (If we use all 400 features, all data points in the same class will be mapped to a single point, leading to an overly optimistic recognition rate of 100%. This is too good to be realistic.)
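The collapse described in the second note can be reproduced on a small scale. The following is a minimal sketch in Python with NumPy, not the code behind the examples in this section. It assumes synthetic Gaussian data (4 classes of 10 points with 40 features, standing in for 40 persons of 10 images with 400 features), and all variable names are hypothetical. When the feature count is not smaller than the data count minus the class count, there are directions with zero within-class scatter, and projecting onto them maps every class to a single point.

```python
import numpy as np

# Synthetic stand-in (an assumption, not the face dataset):
# 4 classes x 10 samples, with as many features as samples.
rng = np.random.default_rng(0)
n_classes, per_class, dim = 4, 10, 40
n = n_classes * per_class
X = rng.normal(size=(n, dim))
y = np.repeat(np.arange(n_classes), per_class)

# Subtract each class mean; the within-class scatter is Sw = Xc.T @ Xc.
Xc = X.copy()
for k in range(n_classes):
    Xc[y == k] -= X[y == k].mean(axis=0)

# rank(Sw) = rank(Xc), which is at most n - n_classes, so Sw is singular here.
rank_sw = np.linalg.matrix_rank(Xc)

# Directions with zero within-class scatter: the null space of Xc.
_, _, Vt = np.linalg.svd(Xc)
null_basis = Vt[rank_sw:].T
Z = X @ null_basis

# Largest deviation of any projected point from its own class mean.
spread = max(np.abs(Z[y == k] - Z[y == k].mean(axis=0)).max()
             for k in range(n_classes))

# Smallest distance between two projected class means.
means = np.array([Z[y == k].mean(axis=0) for k in range(n_classes)])
min_mean_gap = min(np.linalg.norm(means[i] - means[j])
                   for i in range(n_classes) for j in range(i + 1, n_classes))

print("rank of Sw:", rank_sw, "out of", dim)
print("within-class spread after projection:", spread)
print("smallest gap between class means:", min_mean_gap)
```

In this sketch the within-class spread is zero up to rounding while the class means stay apart, so any classifier would score perfectly on the data used to compute the projection, even though the data are pure noise.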

In the following example, we use the first 60 features after PCA for LDA computation. To visualize the data, we keep only the first 2 dimensions after LDA for a scatter plot:
Example 1: faceRecog/face2dLdaProj01.m

The classes appear more tightly clustered and better separated than with PCA alone.
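The PCA-then-LDA projection can be sketched as follows. This is a hedged illustration in Python with NumPy rather than the MATLAB script referenced above. The data are synthetic (6 classes of 10 points in 200 dimensions), and the function names fit_pca_lda, scatter_matrices, and fisher_criterion are hypothetical. LDA is solved here by whitening the within-class scatter with a Cholesky factor, which is one of several equivalent formulations. The Fisher criterion trace(inv(Sw) Sb) gives a numeric counterpart to what the scatter plot shows.

```python
import numpy as np

def scatter_matrices(Z, y):
    """Within-class (Sw) and between-class (Sb) scatter of the rows of Z."""
    mu = Z.mean(axis=0)
    dim = Z.shape[1]
    Sw = np.zeros((dim, dim))
    Sb = np.zeros((dim, dim))
    for k in np.unique(y):
        Zk = Z[y == k]
        mk = Zk.mean(axis=0)
        Sw += (Zk - mk).T @ (Zk - mk)
        Sb += len(Zk) * np.outer(mk - mu, mk - mu)
    return Sw, Sb

def fit_pca_lda(X, y, n_pca, n_lda):
    """Return (mean, W_pca, W_lda); project with (X - mean) @ W_pca @ W_lda."""
    mean = X.mean(axis=0)
    # PCA: the right singular vectors of the centered data are the principal axes.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    W_pca = Vt[:n_pca].T
    Sw, Sb = scatter_matrices((X - mean) @ W_pca, y)
    # LDA: solve Sb w = lambda Sw w by whitening Sw (needs Sw positive definite,
    # which is why n_pca must stay below data count minus class count).
    Linv = np.linalg.inv(np.linalg.cholesky(Sw))
    _, V = np.linalg.eigh(Linv @ Sb @ Linv.T)     # eigenvalues in ascending order
    W_lda = Linv.T @ V[:, ::-1][:, :n_lda]        # keep the n_lda largest
    return mean, W_pca, W_lda

def fisher_criterion(Z, y):
    """trace(inv(Sw) @ Sb): larger means better class separation."""
    Sw, Sb = scatter_matrices(Z, y)
    return np.trace(np.linalg.solve(Sw, Sb))

# Synthetic stand-in for the face data (an assumption): class centers plus
# noise whose variance differs across features.
rng = np.random.default_rng(1)
n_classes, per_class, dim = 6, 10, 200
y = np.repeat(np.arange(n_classes), per_class)
centers = rng.normal(size=(n_classes, dim))
X = centers[y] + rng.normal(size=(len(y), dim)) * np.linspace(3.0, 0.5, dim)

# The text uses 60 PCA features for 400 images; 20 is used for this small set.
mean, W_pca, W_lda = fit_pca_lda(X, y, n_pca=20, n_lda=2)
Z_pca2 = (X - mean) @ W_pca[:, :2]     # first 2 dimensions of PCA alone
Z_lda2 = (X - mean) @ W_pca @ W_lda    # first 2 dimensions after PCA + LDA

print("Fisher criterion, PCA alone:", fisher_criterion(Z_pca2, y))
print("Fisher criterion, PCA + LDA:", fisher_criterion(Z_lda2, y))
```

Plotting the two columns of Z_lda2 with one color per class gives the kind of scatter plot described above. Within the PCA subspace, the two leading LDA directions maximize this criterion, so the value for PCA + LDA cannot be smaller than the value for the first two PCA dimensions.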
We can vary the number of dimensions after LDA (while keeping the number of dimensions after PCA at 60) to see the effect on the overall recognition rate:
Example 2: faceRecog/optLdaEigNum01.m

The recognition rate rises to 100% when the dimension is 9. This shows how effective LDA is at projecting the dataset along the most discriminative directions. However, there is a caveat. Since we have used the whole dataset for PCA and LDA, the recognition rate is, again, overly optimistic.
To evaluate the performance objectively, we need to resort to a LOO (leave-one-out) scheme for face recognition. In other words, when we take a face for testing, it cannot be used for computing PCA, LDA, etc. The following example uses such a LOO scheme for performance evaluation. (Be warned that it takes hours to run the example.)
Example 3: faceRecog/optLdaEigNum02.m

The example indicates that the objectively estimated performance of PCA + LDA for face recognition is around 99.00% when the dimension is 14.
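The difference between the two evaluation schemes can be sketched in Python with NumPy as follows. This is a simplified illustration under stated assumptions, not the script used above: the data are synthetic (5 well-separated classes of 8 points in 50 dimensions), the classifier is assumed to be 1-nearest-neighbor in the projected space, and all function names are hypothetical. In the shared-projection scheme, PCA and LDA are computed once from all data, so the test point influences the projection. In the refit scheme, they are recomputed for every held-out point, which is why the full-size experiment is so slow.

```python
import numpy as np

def fit_pca_lda(X, y, n_pca, n_lda):
    """Return (mean, W) such that (X - mean) @ W is the PCA + LDA projection."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    W_pca = Vt[:n_pca].T
    Z = (X - mean) @ W_pca                      # overall mean of Z is zero
    Sw = np.zeros((n_pca, n_pca))
    Sb = np.zeros((n_pca, n_pca))
    for k in np.unique(y):
        Zk = Z[y == k]
        mk = Zk.mean(axis=0)
        Sw += (Zk - mk).T @ (Zk - mk)
        Sb += len(Zk) * np.outer(mk, mk)
    # Solve Sb w = lambda Sw w by whitening Sw with its Cholesky factor.
    Linv = np.linalg.inv(np.linalg.cholesky(Sw))
    _, V = np.linalg.eigh(Linv @ Sb @ Linv.T)   # eigenvalues in ascending order
    return mean, W_pca @ Linv.T @ V[:, ::-1][:, :n_lda]

def nearest_neighbor_label(train, train_y, query):
    """Label of the training point closest to query (1-NN, an assumption)."""
    return train_y[np.argmin(np.linalg.norm(train - query, axis=1))]

def loo_rate_shared_projection(X, y, n_pca, n_lda):
    """Optimistic: the projection is fitted once on ALL data, test point included."""
    mean, W = fit_pca_lda(X, y, n_pca, n_lda)
    Z = (X - mean) @ W
    hits = 0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        hits += int(nearest_neighbor_label(Z[keep], y[keep], Z[i]) == y[i])
    return hits / len(X)

def loo_rate_refit(X, y, n_pca, n_lda):
    """Objective: PCA and LDA are refitted without the held-out point."""
    hits = 0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        mean, W = fit_pca_lda(X[keep], y[keep], n_pca, n_lda)
        pred = nearest_neighbor_label((X[keep] - mean) @ W, y[keep],
                                      (X[i] - mean) @ W)
        hits += int(pred == y[i])
    return hits / len(X)

# Synthetic, well-separated classes (an assumption, not the face dataset).
rng = np.random.default_rng(2)
n_classes, per_class, dim = 5, 8, 50
y = np.repeat(np.arange(n_classes), per_class)
centers = rng.normal(scale=5.0, size=(n_classes, dim))
X = centers[y] + rng.normal(size=(len(y), dim))

rate_shared = loo_rate_shared_projection(X, y, n_pca=10, n_lda=4)
rate_refit = loo_rate_refit(X, y, n_pca=10, n_lda=4)
print("shared projection:", rate_shared, " refit per test point:", rate_refit)
```

The refit scheme costs one PCA and one LDA computation per data point. To search for the best LDA dimension, the same loop would be repeated for each candidate value of n_lda.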
(Error analysis to be added here.)
Data Clustering and Pattern Recognition (資料分群與樣式辨認)