6-1 Intro. to Recognition Rate Estimate of Classifiers


(Please note: the Chinese version of this page has not been kept in sync with the English version!)


Once we have constructed a classifier using a certain pattern recognition method, we need to evaluate its performance objectively. The performance evaluation of a classifier usually involves two factors: the recognition rate and the computation load.

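The "performance evaluation" of a classifier refers to how, after designing the classifier, we estimate its capability in an effective way. The evaluation can usually be divided into two parts: recognition rate and computation load.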

The computation load depends heavily on the underlying classifier, and we shall not go into detail about it in this chapter. Instead, this chapter covers several methods for estimating the true recognition rate of a given classifier on a given dataset.

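Different classifiers have different computation loads. This chapter focuses on estimating the recognition rate and does not discuss computation load.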

Moreover, for a simple binary classification problem, the misclassified cases can be divided into two types: false positives and false negatives. We shall also address how to select a threshold for the classifier based on the costs of false positives and false negatives.
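
As a rough illustration of cost-based threshold selection, the sketch below assumes the classifier outputs a numeric score and predicts "positive" whenever the score is at or above a threshold. The function names (`confusion_counts`, `best_threshold`), the cost parameters, and the toy data are hypothetical and not taken from this chapter; the exhaustive scan over candidate thresholds is just one simple way to do it.

```python
def confusion_counts(scores, labels, threshold):
    """Return (false positives, false negatives) when predicting
    positive (1) iff score >= threshold. Labels are 0 or 1."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp, fn


def best_threshold(scores, labels, cost_fp=1.0, cost_fn=1.0):
    """Scan candidate thresholds and return (threshold, total cost)
    for the one with the lowest cost; ties keep the smallest threshold."""
    # Every observed score is a candidate, plus one value above the
    # maximum score, which corresponds to "never predict positive".
    candidates = sorted(set(scores)) + [max(scores) + 1.0]
    best = None
    for t in candidates:
        fp, fn = confusion_counts(scores, labels, t)
        cost = cost_fp * fp + cost_fn * fn
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


# Hypothetical toy data: classifier scores and true labels.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(best_threshold(scores, labels, cost_fp=1.0, cost_fn=1.0))
print(best_threshold(scores, labels, cost_fp=5.0, cost_fn=1.0))
```

Raising the cost of a false positive pushes the selected threshold upward, so the classifier becomes more conservative about predicting the positive class.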

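In the real world, all sample data are finite, and collecting data takes time and manpower, so sample data are precious. The more sample data we have, the more accurate the resulting classifier will be. However, to test the performance of the classifier, the design process of a pattern recognition system splits the sample data into two parts: a training set, used to design the classifier, and a test set, used to evaluate its performance.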

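Different ways of splitting the data correspond to different methods of estimating the error rate, as detailed in the following sections.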
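
The simplest way of splitting the data can be sketched as follows: hold out part of the samples for testing, design the classifier on the rest, and report the fraction of held-out samples that are classified correctly. This is only a minimal sketch under stated assumptions. The helper names, the 30% test fraction, the one-dimensional toy data, and the use of a 1-nearest-neighbor rule are all hypothetical choices for illustration, not the method prescribed by this chapter.

```python
import random


def holdout_split(samples, test_fraction=0.3, seed=0):
    """Shuffle the samples and return (training set, test set)."""
    rng = random.Random(seed)
    shuffled = samples[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def nearest_neighbor_predict(train, x):
    """Predict the label of x as the label of the closest training sample."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def recognition_rate(train, test):
    """Fraction of test samples classified correctly by the 1-NN rule."""
    correct = sum(1 for x, y in test if nearest_neighbor_predict(train, x) == y)
    return correct / len(test)


# Hypothetical toy data: (feature, label) pairs from two well-separated classes.
samples = [(float(v), 0) for v in range(5)] + [(float(v), 1) for v in range(10, 15)]
train, test = holdout_split(samples)
print(len(train), len(test), recognition_rate(train, test))
```

Because the test samples play no part in designing the classifier, the rate measured on them is a less optimistic estimate than the rate measured on the training samples themselves.
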
Data Clustering and Pattern Recognition (資料分群與樣式辨認)