## Chapter 5: Exercises

1. (*) Function for KNN search: Write a function knnSearch.m with the following usage (a sketch implementation appears after the test cases below):

    [index, distance] = knnSearch(x, X, k)

    where
    • x: an input column vector of dimension $d$
    • X: a $d \times n$ dataset matrix, with each column being an observation
    • k: the number of nearest neighbors to retrieve
    • index: a vector of k integers, representing the indices of the k nearest neighbors, ordered from the nearest to the farthest
    • distance: a vector of k elements, representing the distances to these k nearest neighbors
    (You can simply assume there are no ties when sorting the distances.)

    Test cases (you don't need to create the plots):

    • Input: x=[7;8], X=[10 10 8 6 3 0 -3 -6 -8 -10 -10 -10 -8 -6 -3 0 3 6 8 10;0 3 6 8 10 10 10 8 6 3 0 -3 -6 -8 -10 -10 -10 -8 -6 -3], k=5.
    • Output: index=[4 3 5 2 6], distance=[1 2.23606797749979 4.47213595499958 5.8309518948453 7.28010988928052].

    • Input: x=[40;50], X=[13 51 36 79 68 74 81 90 37 82 77 58 48 69 95 91 81 71 29 79 23 90 34 14 74 8 57 60 15 58;99 12 5 2 31 61 95 24 57 49 49 16 20 47 40 43 76 41 59 77 34 51 93 9 54 87 54 49 94 71], k=5.
    • Output: index=[9 19 27 28 21], distance=[7.61577310586391 14.2126704035519 17.464249196573 20.0249843945008 23.3452350598575].
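
    A minimal sketch of one possible implementation is given below. The interface follows the spec above; the body (plain Euclidean distances plus a full sort) is just one straightforward choice:

    ```matlab
    function [index, distance] = knnSearch(x, X, k)
    % knnSearch: find the k nearest columns of X to the query vector x.
    %   x: d-by-1 query vector
    %   X: d-by-n dataset matrix, one observation per column
    %   k: number of nearest neighbors to retrieve

    % Euclidean distance from x to every column of X
    delta = X - repmat(x, 1, size(X, 2));   % d-by-n matrix of differences
    dist = sqrt(sum(delta.^2, 1));          % 1-by-n vector of distances

    % Sort in ascending order and keep the first k (ties ignored, per the spec)
    [sortedDist, sortedIndex] = sort(dist);
    index = sortedIndex(1:k);
    distance = sortedDist(1:k);
    ```

    Running this on the first test case above should reproduce index=[4 3 5 2 6].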

2. (*) Basics of KNNC:
    1. What is the full name of KNNC?
    2. Describe the basic principle of KNNC (a code sketch of the decision rule follows this list).
    3. Give the major strength and the major drawback of KNNC.
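
    As a concrete reference for question 2, here is a minimal sketch of the KNNC decision rule; it assumes the knnSearch function from Exercise 1, and the label vector y is illustrative:

    ```matlab
    function predicted = knncPredict(x, X, y, k)
    % KNNC: classify x by a majority vote among its k nearest neighbors.
    %   X: d-by-n training data, one observation per column
    %   y: 1-by-n class labels of the columns of X
    index = knnSearch(x, X, k);    % indices of the k nearest neighbors
    predicted = mode(y(index));    % the most frequent label among them
    ```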
3. (*) Voronoi diagrams (a MATLAB sketch for checking your drawings follows this list):
    1. Given 3 points in a plane, draw their Voronoi diagram.
    2. Given 4 points in a plane, draw their Voronoi diagram.
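
    MATLAB's built-in voronoi function plots the same partition; the four sample points below are an arbitrary choice:

    ```matlab
    px = [0 4 4 0];     % x-coordinates of 4 sample points
    py = [0 0 4 4];     % y-coordinates of 4 sample points
    voronoi(px, py);    % plot the points and the Voronoi edges
    axis equal;
    title('Voronoi diagram of 4 points');
    ```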
4. (*) Single-layer perceptrons:
    1. What is the equation for computing the output of a 2-input single-layer perceptron?
    2. What are the learning rules for the 3 parameters in the above equation? (See the sketch after this list.)
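
    For reference, a minimal sketch of one epoch of perceptron learning with a step activation, y = step(w1*x1 + w2*x2 + b); the AND data, the learning rate, and the zero initialization are illustrative assumptions:

    ```matlab
    X = [0 0 1 1; 0 1 0 1];        % 2-by-4 inputs (the AND function)
    t = [0 0 0 1];                 % target outputs
    w = [0; 0]; b = 0; eta = 0.1;  % weights, bias, and learning rate
    for i = 1:size(X, 2)
        y = (w' * X(:, i) + b) > 0;          % output: step(w1*x1 + w2*x2 + b)
        w = w + eta * (t(i) - y) * X(:, i);  % learning rule for the weights
        b = b + eta * (t(i) - y);            % learning rule for the bias
    end
    ```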
5. (**) Surface and contour plots of 2D Gaussian distributions: Write a script to draw both surface and contour plots of a 2D Gaussian distribution with each of the following parameter sets:
    1. m = [0, 0]^T, S = [1 0; 0 1] (an identity matrix).
    2. m = [0, 0]^T, S = [1 0; 0 5] (a diagonal matrix).
    3. m = [0, 0]^T, S = [1 2; 2 5] (a positive-definite matrix).
    4. m = [0, 0]^T, S = [-1 2; 2 -5]*50 (an arbitrary matrix).

    Your plots should be similar to the ones shown next. You should choose a range that displays the important characteristics of these plots. Please explain why the plots of case 4 are very different from those of the other cases.
    (Hint: You can follow the self-demo part of gaussian.m; a plotting sketch is also given below.)
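
    The sketch below evaluates the 2D Gaussian formula directly rather than calling a toolbox routine such as mvnpdf (which rejects covariance matrices that are not positive semi-definite), so it still runs for case 4. The plotting range is an assumption you may need to adjust:

    ```matlab
    m = [0; 0]; S = [1 2; 2 5];                 % change m and S for each case
    [xx, yy] = meshgrid(linspace(-10, 10, 81)); % evaluation grid
    p = zeros(size(xx));
    invS = inv(S);
    c = 1 / (2 * pi * sqrt(abs(det(S))));       % abs(.) guards the invalid case 4
    for i = 1:numel(xx)
        d = [xx(i); yy(i)] - m;                 % deviation from the mean
        p(i) = c * exp(-0.5 * d' * invS * d);   % 2D Gaussian density
    end
    subplot(1, 2, 1); surf(xx, yy, p); title('Surface plot');
    subplot(1, 2, 2); contour(xx, yy, p); title('Contour plot');
    ```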
6. (*) Quadratic classifiers:
    1. Why is the classifier named "quadratic"?
    2. How do you train a quadratic classifier? (A training/testing sketch follows this list.)
    3. How do you evaluate (test) a quadratic classifier?
    4. What is the major strength of a quadratic classifier?
    5. What is the major weakness of a quadratic classifier?
    6. If a quadratic classifier has a diagonal covariance matrix, does it fall back to a naive Bayes classifier? Why?
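
    For reference when answering questions 2 and 3, here is a minimal sketch that fits one Gaussian per class and then picks the class with the largest log-likelihood plus log-prior; the toy data and all variable names are illustrative:

    ```matlab
    % Toy data: two 2D Gaussian classes, one observation per column
    X = [randn(2, 50), randn(2, 50) + 3];
    y = [ones(1, 50), 2 * ones(1, 50)];
    x = [2.5; 2.5];                           % test point to classify

    % Training: estimate the mean, covariance, and prior of each class
    for j = 1:2
        Xj = X(:, y == j);
        mu{j} = mean(Xj, 2);                  % class mean (d-by-1)
        sigma{j} = cov(Xj');                  % class covariance (d-by-d)
        prior(j) = size(Xj, 2) / size(X, 2);  % class prior
    end

    % Testing: evaluate the quadratic discriminant of each class
    for j = 1:2
        d = x - mu{j};
        g(j) = -0.5 * (d' * (sigma{j} \ d) + log(det(sigma{j}))) + log(prior(j));
    end
    [~, predicted] = max(g);                  % class with the largest discriminant
    ```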
7. (*) Naive Bayes classifiers:
    1. How do you train a naive Bayes classifier? (A sketch follows this list.)
    2. How do you evaluate (test) a naive Bayes classifier?
    3. What is the major strength of a naive Bayes classifier?
    4. What is the major weakness of a naive Bayes classifier?
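
    A minimal sketch of the Gaussian flavor of naive Bayes, in which each feature is modeled by an independent 1D Gaussian per class (the toy data and names are illustrative):

    ```matlab
    % Toy data: two 2D classes, one observation per column
    X = [randn(2, 50), randn(2, 50) + 3];
    y = [ones(1, 50), 2 * ones(1, 50)];
    x = [2.5; 2.5];                           % test point

    % Training: per-class, per-feature means and variances (independence assumption)
    for j = 1:2
        Xj = X(:, y == j);
        mu(:, j) = mean(Xj, 2);               % feature means of class j
        v(:, j) = var(Xj, 0, 2);              % feature variances of class j
        prior(j) = size(Xj, 2) / size(X, 2);  % class prior
    end

    % Testing: sum the per-feature log-likelihoods and add the log-prior
    for j = 1:2
        g(j) = sum(-0.5 * ((x - mu(:, j)).^2 ./ v(:, j) + log(2 * pi * v(:, j)))) ...
               + log(prior(j));
    end
    [~, predicted] = max(g);
    ```

    Compared with the quadratic-classifier sketch above, only per-feature variances are kept, which is the diagonal-covariance connection raised in question 6 of the previous exercise.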
8. (**) KNNC on IRIS: recognition rates w.r.t. the number of clusters: Please modify the example in order to test the recognition rates of KNNC with respect to the number of clusters (a rough code sketch appears at the end of this exercise).
    1. Write a script knncIrisRrVsClusterNum01.m to display the recognition rates of both inside and outside tests. Your plot should be similar to the one shown next.
       This is an example of exhaustive search, which can be used to find the best number of clusters for KNNC.
    2. Write a script knncIrisRrVsClusterNum02.m that repeats the previous subproblem but reverses the roles of the training and test sets. Your plots should be similar to the ones shown next.
    3. Write a script knncIrisRrVsClusterNum03.m that combines the previous two subproblems to plot the average recognition rates of both inside and outside tests. Your plot should be similar to the one shown next.
       This method of switching the roles of the training and test sets to obtain average recognition rates is referred to as two-fold cross-validation. We can extend the concept to K-fold cross-validation, in which, at each iteration, 1/K of the data is used for testing while the rest is used for training. This is a more robust method for estimating the performance of the constructed classifier. In general, the inside-test recognition rate should increase with the number of clusters. On the other hand, the outside-test recognition rate should increase initially and then decrease eventually. Usually we take the number of clusters that maximizes the outside-test recognition rate when constructing the KNNC classifier.
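
    The scripts above rely on the book's own toolbox, but the exhaustive-search loop itself can be sketched with Statistics Toolbox functions (kmeans, knnsearch, grp2idx, and the bundled fisheriris data). The 50/50 split and the range of cluster counts are illustrative assumptions, and only the outside test is shown:

    ```matlab
    load fisheriris                              % IRIS data (Statistics Toolbox)
    X = meas'; y = grp2idx(species)';            % 4-by-150 features, labels 1..3
    trainIdx = 1:2:150; testIdx = 2:2:150;       % simple interleaved 50/50 split
    for m = 1:10                                 % number of clusters per class
        centers = []; labels = [];
        for j = 1:3
            Xj = X(:, trainIdx(y(trainIdx) == j));
            [~, C] = kmeans(Xj', m);             % m cluster centers for class j
            centers = [centers; C];              % stack the centers row-wise
            labels = [labels; j * ones(m, 1)];   % remember their class labels
        end
        nn = knnsearch(centers, X(:, testIdx)'); % nearest center per test point
        rr(m) = mean(labels(nn)' == y(testIdx)); % outside-test recognition rate
    end
    plot(1:10, 100 * rr, 'o-');
    xlabel('Number of clusters per class'); ylabel('Recognition rate (%)');
    ```

    Swapping trainIdx and testIdx and averaging the two resulting curves gives the two-fold cross-validation estimate described above.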
9. (**) KNNC on WINE: recognition rates w.r.t. the number of clusters: Use prData.m to obtain the WINE dataset and repeat the previous exercise to obtain the corresponding 3 plots.
10. (*) Various classifiers on the WINE dataset: Use prData.m to obtain the WINE dataset. Use the following classifiers to obtain the recognition rates of both inside and outside tests.