Learning vector quantization (LVQ) combines the concepts of vector quantization and nearest-neighbor classification, adjusting the cluster centers to improve recognition rates. More specifically, LVQ updates the cluster centers incrementally so that the updated centers yield higher recognition accuracy. The learning rate α, with 0 < α < 1, is decreased at each iteration. The procedure is as follows:
- Set representative centers for each class. For example, with 3 clusters per class in a 4-class problem, there are 12 centers in total. These cluster centers can be obtained by running k-means clustering on each class separately. (In practice, the number of clusters for each class can be set in proportion to the size of the class.)
- For each data point x, find the nearest center yk. Based on the class labels of x and yk, update the center as follows:
  - If the class labels are the same, move yk toward x:
    yk = yk + α [x - yk]
  - If the class labels are different, move yk away from x:
    yk = yk - α [x - yk]
- Update the learning rate α.
- Go back to step 2 (the pass over the data points) and repeat until all the centers converge.
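The steps above can be sketched in plain Python as follows. This is a minimal illustration rather than a reference implementation: the function names `nearest_center` and `lvq_train`, the multiplicative decay of α, and the default values of `alpha`, `decay`, `max_epochs`, and `tol` are all assumptions made for the example. The initial centers are assumed to be given (for instance, from k-means run on each class).

```python
import math

def nearest_center(x, centers):
    """Return the index of the center closest to x (Euclidean distance)."""
    return min(range(len(centers)), key=lambda k: math.dist(x, centers[k]))

def lvq_train(data, labels, centers, center_labels,
              alpha=0.3, decay=0.9, max_epochs=100, tol=1e-6):
    """Refine the centers with the attract/repel rule.

    alpha, decay, max_epochs and tol are illustrative choices,
    not values prescribed by the text.
    """
    centers = [list(c) for c in centers]  # work on a copy
    for _ in range(max_epochs):
        largest_move = 0.0
        for x, label in zip(data, labels):
            k = nearest_center(x, centers)
            # +1 pulls the center toward x, -1 pushes it away from x
            sign = 1.0 if center_labels[k] == label else -1.0
            step = [sign * alpha * (xi - yi) for xi, yi in zip(x, centers[k])]
            centers[k] = [yi + si for yi, si in zip(centers[k], step)]
            largest_move = max(largest_move, math.hypot(*step))
        alpha *= decay  # decrease the learning rate after each pass
        if largest_move < tol:  # centers have effectively stopped moving
            break
    return centers

# Tiny two-class example with one center per class (made-up data).
data = [(0.0, 0.0), (0.0, 1.0), (4.0, 4.0), (4.0, 5.0)]
labels = [0, 0, 1, 1]
trained = lvq_train(data, labels, [(1.0, 1.0), (3.0, 3.0)], [0, 1])
```

Here the convergence test simply checks whether the largest center displacement in a pass falls below `tol`; other stopping rules, such as a fixed number of passes, would serve equally well.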
After the cluster centers converge, we can use them to represent the original classes. For any unseen data point, we can apply k-nearest-neighbor classification over these centers to determine its class.
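As a rough sketch of this classification step, the function below labels a new point by majority vote among its k nearest centers. The name `classify`, the default `k=1`, and the example centers are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def classify(x, centers, center_labels, k=1):
    """Label x by majority vote among its k nearest centers."""
    # Indices of the centers, sorted from nearest to farthest.
    order = sorted(range(len(centers)), key=lambda i: math.dist(x, centers[i]))
    votes = Counter(center_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Made-up centers for two classes, "a" and "b".
centers = [(0.2, 0.5), (0.4, 0.9), (3.8, 4.4)]
center_labels = ["a", "a", "b"]
near_a = classify((1.0, 1.0), centers, center_labels)
near_b = classify((3.5, 4.0), centers, center_labels)
vote_a = classify((1.0, 1.0), centers, center_labels, k=3)
```

With `k=1` this reduces to assigning the class of the single nearest center, which is the usual way LVQ centers are used.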
Here is a summary of the differences between LVQ and VQ:
- VQ is used for data compression and representation, so it is applicable to unlabeled data.
- The goal of LVQ is to find representative points for each class and then update these points to achieve the best recognition rate. It is therefore suited to labeled data (classification problems).
References:
- T. Kohonen, "Improved Versions of Learning Vector Quantization", International Joint Conference on Neural Networks (IJCNN), 1990.
Data Clustering and Pattern Recognition (資料分群與樣式辨認)