We can also use GMMs to construct a classifier for pattern recognition. Such a GMM-based classifier is referred to as a GMMC for short. The construction and evaluation of a GMMC are explained next.
At the training stage, we need to obtain a GMM for each class. In other words, we use the data of each class to train a corresponding GMM. This is done separately for each class; there are no interactions between the GMMs of different classes.
At the application stage, we send the data of unknown class to the GMM of each class. The predicted class is the one whose GMM gives the maximum probability.
The following example demonstrates the use of GMM for classification of the iris dataset:
Example 1: gmmcIris01.m
% Use GMM for the classification of iris dataset
% ====== Collect data
[DS, TS]=prData('iris');
classLabel=unique(DS.output);
classNum=length(classLabel);
gmmOpt=gmmTrain('defaultOpt');
% ====== Train a GMM for each class
clear data logLike
for i=1:classNum
fprintf('Training GMM for class %d...\n', i);
index=find(DS.output==classLabel(i));
data{i}=DS.input(:, index);
[class(i).gmmPrm, logLike{i}] = gmmTrain(data{i}, gmmOpt);
end
% ====== Compute inside-test recognition rate
clear outputLogLike;
for i=1:classNum
outputLogLike(i,:)=gmmEval(DS.input, class(i).gmmPrm);
end
[maxValue, computedOutput]=max(outputLogLike);
recogRate1=sum(DS.output==computedOutput)/length(DS.output);
fprintf('Inside-test recog. rate = %g%%\n', recogRate1*100);
% ====== Compute outside-test recognition rate
clear outputLogLike
for i=1:classNum
outputLogLike(i,:)=gmmEval(TS.input, class(i).gmmPrm);
end
[maxValue, computedOutput]=max(outputLogLike);
recogRate2=sum(TS.output==computedOutput)/length(TS.output);
fprintf('Outside-test recog. rate = %g%%\n', recogRate2*100);
Training GMM for class 1...
Training GMM for class 2...
Training GMM for class 3...
Inside-test recog. rate = 97.3333%
Outside-test recog. rate = 97.3333%
In the above example, the dataset is partitioned into a design set (DS) and a test set (TS); DS is used for training and TS for testing. Each class is modeled as a two-Gaussian GMM. The obtained recognition rates are 97.33% for both the inside test and the outside test.
Since the above procedure of training and evaluating a GMM classifier is used quite often, we can simplify it by using the following two functions:
gmmcTrain.m: For training a GMM classifier.
gmmcEval.m: For evaluating a GMM classifier.
In particular, gmmcTrain.m generates a set of parameters for the GMM classifier (including the prior information of each class), and gmmcEval.m takes these parameters and evaluates the classifier on a new set of data. The use of these two functions is shown in the next example.
Example 2: gmmcIris02.m
% Use GMM for the classification of iris dataset
[DS, TS]=prData('iris');
gmmOpt=gmmTrain('defaultOpt');
gmmcPrm=gmmcTrain(DS, gmmOpt);
computedOutput=gmmcEval(DS, gmmcPrm);
recogRate1=sum(DS.output==computedOutput)/length(DS.output);
fprintf('Inside-test recog. rate = %g%%\n', recogRate1*100);
computedOutput=gmmcEval(TS, gmmcPrm);
recogRate2=sum(TS.output==computedOutput)/length(TS.output);
fprintf('Outside-test recog. rate = %g%%\n', recogRate2*100);
Inside-test recog. rate = 97.3333%
Outside-test recog. rate = 97.3333%
If we want to visualize the decision boundary imposed by a GMM classifier, we can try the following example, which applies GMMC to a nonlinearly separable dataset:
Example 3: gmmcNonlinearSeparable01.m
DS=prData('nonlinearSeparable');
gmmcOpt=gmmcTrain('defaultOpt');
gmmcOpt.arch.gaussianNum=3;
gmmcPrm=gmmcTrain(DS, gmmcOpt);
computed=gmmcEval(DS, gmmcPrm);
DS.hitIndex=find(computed==DS.output); % This is used in gmmcPlot.
gmmcPlot(DS, gmmcPrm, 'decBoundary');
In the above example, we set the number of Gaussians to 3 because we already know the data distribution: class 1 is segmented into 3 disjoint regions, so 3 is a good choice. In practice, we usually deal with high-dimensional data, and such information does not come by easily. As a result, we usually have to resort to trial and error, as explained later.
The next example plots the PDFs of the GMMs associated with these two classes. By combining several Gaussian PDFs, we are able to model the more complicated PDFs of this problem.
Example 4: gmmcNonlinearSeparable02.m
DS=prData('nonlinearSeparable');
gmmcOpt=gmmcTrain('defaultOpt');
gmmcOpt.arch.gaussianNum=3;
gmmcPrm=gmmcTrain(DS, gmmcOpt);
gmmcPlot(DS, gmmcPrm, '2dPdf');
Hint: If we set the number of Gaussians to 2 in the above two examples, what would you expect to see? Make a guess before you try it. Then try it; you might be amazed at how versatile and flexible a GMMC is!
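To try it, we only need to change the number of Gaussians in the options before training. Here is a minimal sketch based on Examples 3 and 4 (the extra figure calls are only there to keep the two plots separate):
% Re-run Examples 3 and 4 with two Gaussians per class
DS=prData('nonlinearSeparable');
gmmcOpt=gmmcTrain('defaultOpt');
gmmcOpt.arch.gaussianNum=2;                           % Two Gaussians per class this time
gmmcPrm=gmmcTrain(DS, gmmcOpt);
DS.hitIndex=find(gmmcEval(DS, gmmcPrm)==DS.output);   % This is used in gmmcPlot
figure; gmmcPlot(DS, gmmcPrm, 'decBoundary');         % Decision boundary
figure; gmmcPlot(DS, gmmcPrm, '2dPdf');               % Class PDFs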
As mentioned earlier, the performance of a GMM is highly related to its number of Gaussian components. The following example plots the relationship between the recognition rate and the number of Gaussian components.
Example 5: gmmcIris03.m
% Use GMM for the classification of iris dataset.
% We vary the number of mixtures to get the relationship between recognition rate and number of mixtures of GMM.
% ====== Get the dataset
[DS, TS]=prData('iris');
classNum=length(unique(DS.output));
gaussianNum=1:16;
gmmOpt=gmmTrain('defaultOpt');
trialNum=length(gaussianNum);
% ====== Perform training and compute recognition rates
recogRate1=[];
recogRate2=[];
for j=1:trialNum
fprintf('%d/%d: ', j, trialNum);
% ====== Training GMM model for each class
for i=1:classNum
% fprintf('Training class %d: ', i);
index=find(DS.output==i);
data{i}=DS.input(:, index);
gmmOpt.arch.gaussianNum=gaussianNum(j);
[class(i).gmmPrm, logLike{i}]=gmmTrain(data{i}, gmmOpt);
end
gmmcPrm.class=class;
gmmcPrm.prior=dsClassSize(DS);
gmmcPrm.task='Iris Classification';
% ====== Compute inside-test recognition rate
computedOutput=gmmcEval(DS, gmmcPrm);
recogRate1(j)=sum(DS.output==computedOutput)/length(DS.output);
% ====== Compute outside-test recognition rate
computedOutput=gmmcEval(TS, gmmcPrm);
recogRate2(j)=sum(TS.output==computedOutput)/length(TS.output);
fprintf('Recog. rate: inside test = %g%%, outside test = %g%%\n', recogRate1(j)*100, recogRate2(j)*100);
end
% ====== Plot the result
plot(gaussianNum, recogRate1*100, 'o-', gaussianNum, recogRate2*100, 'square-'); grid on
legend('Inside test', 'Outside test', 'Location', 'southeast');
xlabel('No. of Gaussian mixtures'); ylabel('Recognition Rates (%)');
1/16: Recog. rate: inside test = 96%, outside test = 94.6667%
2/16: Recog. rate: inside test = 97.3333%, outside test = 97.3333%
3/16: Recog. rate: inside test = 97.3333%, outside test = 94.6667%
4/16: Recog. rate: inside test = 97.3333%, outside test = 96%
5/16: Recog. rate: inside test = 98.6667%, outside test = 94.6667%
6/16: Recog. rate: inside test = 100%, outside test = 94.6667%
7/16: Recog. rate: inside test = 100%, outside test = 93.3333%
8/16: Recog. rate: inside test = 97.3333%, outside test = 89.3333%
9/16: Recog. rate: inside test = 100%, outside test = 93.3333%
10/16: Recog. rate: inside test = 100%, outside test = 90.6667%
11/16: Recog. rate: inside test = 100%, outside test = 93.3333%
12/16: Recog. rate: inside test = 100%, outside test = 82.6667%
13/16: Recog. rate: inside test = 100%, outside test = 90.6667%
14/16: Recog. rate: inside test = 100%, outside test = 90.6667%
15/16: Recog. rate: inside test = 100%, outside test = 81.3333%
16/16: Recog. rate: inside test = 100%, outside test = 68%
The number of Gaussians in a GMM represents its flexibility. The above example illustrates a common characteristic of classifiers:
As the classifier has more and more tunable parameters, the inside-test recognition rate keeps going up, while the outside-test recognition rate goes up initially and then eventually falls.
The optimum configuration of a classifier is usually chosen as the one with the maximum outside-test recognition rate.
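For instance, continuing from Example 5 (where gaussianNum and recogRate2 have been computed), the best configuration could be picked as follows. This is only a minimal sketch; the variables bestRR, bestIndex, and bestGaussianNum are introduced here for illustration:
% Pick the number of Gaussians that maximizes the outside-test recognition rate
[bestRR, bestIndex]=max(recogRate2);
bestGaussianNum=gaussianNum(bestIndex);
fprintf('Best no. of Gaussians = %d, outside-test recog. rate = %g%%\n', bestGaussianNum, bestRR*100);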
Since it is a common practice to observe the training and test recognition rates with respect to the number of Gaussians, we have created the function gmmcGaussianNumEstimate.m to accomplish this task. In the following example, we apply it to the wine dataset:
Example 6: gmmcWine02.m
[DS, TS]=prData('wine');
count1=dsClassSize(DS); count2=dsClassSize(TS);
gmmcOpt=gmmTrain('defaultOpt');
gmmcOpt.arch.gaussianNum=1:min([count1, count2]);
plotOpt=1;
[gmmData, recogRate1, recogRate2]=gmmcGaussianNumEstimate(DS, TS, gmmcOpt, plotOpt);
DS data count = 89, TS data count = 89
DS class data count = [29 36 24]
TS class data count = [30 35 24]
1/24: No. of Gaussian = [1;1;1] ===> inside RR = 97.7528%, outside RR = 98.8764%
2/24: No. of Gaussian = [2;2;2] ===> inside RR = 98.8764%, outside RR = 98.8764%
3/24: No. of Gaussian = [3;3;3] ===> inside RR = 100%, outside RR = 100%
4/24: No. of Gaussian = [4;4;4] ===> inside RR = 100%, outside RR = 98.8764%
5/24: No. of Gaussian = [5;5;5] ===> inside RR = 100%, outside RR = 96.6292%
6/24: No. of Gaussian = [6;6;6] ===> inside RR = 100%, outside RR = 97.7528%
7/24: No. of Gaussian = [7;7;7] ===> inside RR = 100%, outside RR = 97.7528%
8/24: No. of Gaussian = [8;8;8] ===> inside RR = 100%, outside RR = 93.2584%
9/24: No. of Gaussian = [9;9;9] ===> inside RR = 100%, outside RR = 94.382%
10/24: No. of Gaussian = [10;10;10] ===> inside RR = 100%, outside RR = 94.382%
11/24: No. of Gaussian = [11;11;11] ===> inside RR = 100%, outside RR = 95.5056%
12/24: No. of Gaussian = [12;12;12] ===> inside RR = 100%, outside RR = 89.8876%
13/24: No. of Gaussian = [13;13;13] ===> inside RR = 100%, outside RR = 83.1461%
14/24: No. of Gaussian = [14;14;14] ===> inside RR = 100%, outside RR = 91.0112%
15/24: No. of Gaussian = [15;15;15] ===> inside RR = 100%, outside RR = 78.6517%
16/24: No. of Gaussian = [16;16;16] ===> inside RR = 100%, outside RR = 84.2697%
17/24: No. of Gaussian = [17;17;17] ===> inside RR = 100%, outside RR = 79.7753%
18/24: No. of Gaussian = [18;18;18] ===> inside RR = 100%, outside RR = 83.1461%
19/24: No. of Gaussian = [19;19;19] ===> inside RR = 100%, outside RR = 83.1461%
20/24: No. of Gaussian = [20;20;20] ===> inside RR = 100%, outside RR = 66.2921%
21/24: No. of Gaussian = [21;21;21] ===> inside RR = 100%, outside RR = 67.4157%
22/24: No. of Gaussian = [22;22;22] ===> inside RR = 100%, outside RR = 70.7865%
23/24: No. of Gaussian = [23;23;23] ===> inside RR = 100%, outside RR = 70.7865%
24/24: No. of Gaussian = [24;24;24] ===> Error out on errorTrialIndex=24 and errorClassIndex=3
To analyze the data further, we plot the range of each feature:
Example 7: rangePlotWine.m
[DS, TS]=prData('wine');
dsRangePlot(DS);
Obviously the last feature has a much wider range than the others. We can perform input normalization before GMM training; the sketch and the example below show how.
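Conceptually, the normalization used here is per-feature zero-mean, unit-variance (z-score) scaling. Assuming that this is what inputNormalize computes, the idea can be sketched as follows (x is a feature-by-sample matrix, like DS.input):
% Minimal sketch of z-score input normalization (assumed behavior of inputNormalize)
mu=mean(x, 2);                                        % Per-feature mean
sigma=std(x, 0, 2);                                   % Per-feature standard deviation
xNormalized=(x-repmat(mu, 1, size(x,2)))./repmat(sigma, 1, size(x,2));
% The same mu and sigma must be reused to normalize the test set, as in Example 8 below.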
Example 8: gmmcWine03.m
[DS, TS]=prData('wine');
[DS.input, mu, sigma]=inputNormalize(DS.input); % Input normalization for DS
TS.input=inputNormalize(TS.input, mu, sigma); % Input normalization for TS
count1=dsClassSize(DS); count2=dsClassSize(TS);
gmmcOpt=gmmcTrain('defaultOpt');
gmmcOpt.arch.gaussianNum=1:min([count1, count2]);
plotOpt=1;
[gmmData, recogRate1, recogRate2]=gmmcGaussianNumEstimate(DS, TS, gmmcOpt, plotOpt);
DS data count = 89, TS data count = 89
DS class data count = [29 36 24]
TS class data count = [30 35 24]
1/24: No. of Gaussian = [1;1;1] ===> inside RR = 97.7528%, outside RR = 98.8764%
2/24: No. of Gaussian = [2;2;2] ===> inside RR = 98.8764%, outside RR = 98.8764%
3/24: No. of Gaussian = [3;3;3] ===> inside RR = 100%, outside RR = 98.8764%
4/24: No. of Gaussian = [4;4;4] ===> inside RR = 100%, outside RR = 100%
5/24: No. of Gaussian = [5;5;5] ===> inside RR = 100%, outside RR = 98.8764%
6/24: No. of Gaussian = [6;6;6] ===> inside RR = 100%, outside RR = 95.5056%
7/24: No. of Gaussian = [7;7;7] ===> inside RR = 100%, outside RR = 96.6292%
8/24: No. of Gaussian = [8;8;8] ===> inside RR = 100%, outside RR = 85.3933%
9/24: No. of Gaussian = [9;9;9] ===> inside RR = 100%, outside RR = 86.5169%
10/24: No. of Gaussian = [10;10;10] ===> inside RR = 100%, outside RR = 83.1461%
11/24: No. of Gaussian = [11;11;11] ===> inside RR = 100%, outside RR = 87.6404%
12/24: No. of Gaussian = [12;12;12] ===> inside RR = 100%, outside RR = 88.764%
13/24: No. of Gaussian = [13;13;13] ===> inside RR = 100%, outside RR = 88.764%
14/24: No. of Gaussian = [14;14;14] ===> inside RR = 100%, outside RR = 91.0112%
15/24: No. of Gaussian = [15;15;15] ===> inside RR = 100%, outside RR = 78.6517%
16/24: No. of Gaussian = [16;16;16] ===> inside RR = 100%, outside RR = 64.0449%
17/24: No. of Gaussian = [17;17;17] ===> inside RR = 100%, outside RR = 74.1573%
18/24: No. of Gaussian = [18;18;18] ===> inside RR = 100%, outside RR = 65.1685%
19/24: No. of Gaussian = [19;19;19] ===> inside RR = 100%, outside RR = 65.1685%
20/24: No. of Gaussian = [20;20;20] ===> inside RR = 100%, outside RR = 57.3034%
21/24: No. of Gaussian = [21;21;21] ===> inside RR = 100%, outside RR = 57.3034%
22/24: No. of Gaussian = [22;22;22] ===> inside RR = 100%, outside RR = 56.1798%
23/24: No. of Gaussian = [23;23;23] ===> inside RR = 100%, outside RR = 56.1798%
24/24: No. of Gaussian = [24;24;24] ===> Error out on errorTrialIndex=24 and errorClassIndex=3
The above plot demonstrates that input normalization can sometimes lead to better accuracy.
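As a quick sanity check, the two settings could also be compared programmatically. The following is a minimal sketch reusing the functions above, assuming that passing plotOpt=0 to gmmcGaussianNumEstimate suppresses plotting:
% Compare the best outside-test recognition rates with and without input normalization
[DS, TS]=prData('wine');
gmmcOpt=gmmcTrain('defaultOpt');
gmmcOpt.arch.gaussianNum=1:10;                        % A smaller search range for this quick check
[~, ~, rrRaw]=gmmcGaussianNumEstimate(DS, TS, gmmcOpt, 0);
[DS.input, mu, sigma]=inputNormalize(DS.input);       % Normalize DS...
TS.input=inputNormalize(TS.input, mu, sigma);         % ...and apply the same mu/sigma to TS
[~, ~, rrNormalized]=gmmcGaussianNumEstimate(DS, TS, gmmcOpt, 0);
fprintf('Best outside-test RR: raw = %g%%, normalized = %g%%\n', max(rrRaw)*100, max(rrNormalized)*100);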