3-2 Hierarchical Clustering (������������������)

[english][all]

(½Ðª`·N¡G¤¤¤åª©¥»¨Ã¥¼ÀH­^¤åª©¥»¦P¨B§ó·s¡I)

Slides

¶¥¼h¦¡¤À¸sªk¡]hierarchical clustering¡^³z¹L¤@ºØ¶¥¼h¬[ºcªº¤è¦¡¡A±N¸ê®Æ¼h¼h¤ÏÂЦa¶i¦æ¤Àµõ©Î»E¦X¡A¥H²£¥Í³Ì«áªº¾ðª¬µ²ºc¡A±`¨£ªº¤è¦¡¦³¨âºØ¡G

¥»¸`±N°w¹ï»E¦X¦¡¡]¥Ñ¤U¦Ó¤W¡^¶¥¼h¤À¸sªk¨Ó¶i¦æ»¡©ú¡C

»E¦X¦¡¶¥¼h¤À¸sªk¡]agglomerative hierarchical clustering¡^¥Ñ¾ðª¬µ²ºcªº©³³¡¶}©l¼h¼h»E¦X¡C¤@¶}©l§Ú­Ì±N¨C¤@µ§¸ê®Æµø¬°¤@­Ó¸s»E¡]cluster¡^¡A°²³]§Ú­Ì²{¦b¾Ö¦³nµ§¸ê®Æ¡A«h±N³onµ§¸ê®Æµø¬°n­Ó¸s»E¡A¥ç§Y¨C­Ó¸s»E¥]§t¤@µ§¸ê®Æ¡G

  1. ±N¨Cµ§¸ê®Æµø¬°¤@­Ó¸s»E Ci, i=1 1 to n.
  2. §ä¥X©Ò¦³¸s»E¶¡¡A¶ZÂ÷³Ì±µªñªº¨â­Ó¸s»E Ci¡BCj
  3. ¦X¨Ö Ci¡B Cj ¦¨¬°¤@­Ó·sªº¸s»E
  4. °²¦p¥Ø«eªº¸s»E¼Æ¥Ø¦h©ó§Ú­Ì¹w´Áªº¸s»E¼Æ¥Ø¡A«h¤ÏÂЭ«½Æ¨BÆJ¤G¦Ü¥|¡Aª½¨ì¸s»E¼Æ¥Ø¤w±N­°¨ì§Ú­Ì©Ò­n¨Dªº¼Æ¥Ø¡C
¤U¦C½d¨Ò®i¥Ü»E¦X¦¡¶¥¼h¤À¸sªk©Ò²£¥Íªº¾ðª¬¹Ï¡]dendrogram¡^¡C

Example 1: hierClusteringPlot01.mdata=rand(2, 50); % 50 data instances of dim 2 distMat=distPairwise(data); % Distance matrix of 50 by 50 hcOutput=hierClustering(distMat); hierClusteringPlot(hcOutput); % Plot the dendrogram

¦b¨BÆJ¤G¤¤¡A¦ó¿×¡u¶ZÂ÷³Ì±µªñªº¨â­Ó¸s»E Ci¡BCj¡v©O¡H¨Æ¹ê¤W¦b©w¸q¨â­Ó¸s»E¤§¶¡ªº¶ZÂ÷¡A¦³¦UºØ¤£¦Pªº¤è¦¡¡A¨C¤@ºØ¤è¦¡©Ò±o¨ìªºµ²ªG³£¤£¤Ó¬Û¦P¡C³o¨Ç±`¥Îªº¸s»E¶ZÂ÷ªº©w¸q¥i¥H»¡©ú¦p¤U¡C

§Ú­Ì¥i¥H¨Ï¥Î¤£¦Pªº¸s»E¶ZÂ÷¨Ó²£¥Í¶¥¼h¦¡¸s»E¾ð¡A©Ò±o¨ìªºµ²ªG¦p¤U¡G

Example 2: hierClusteringPlot02.mdata=rand(2, 50); % 50 data instances of dim 2 distMat=distPairwise(data); % Distance matrix of 50 by 50 method='single'; hcOutput=hierClustering(distMat, method); subplot(1,2,1); hierClusteringPlot(hcOutput); title(['method=', method]); method='complete'; hcOutput=hierClustering(distMat, method); subplot(1,2,2); hierClusteringPlot(hcOutput); title(['method=', method]);

¥Ñ¤W­z¸s»E¾ð¡A§Ú­Ì¥i¥Hµoı¤U¦C¯S©Ê¡G

­Y­nÆ[¬Ý¦b¸s»E¹Lµ{¤¤ªº°ÊºA®i¥Ü¡A½Ð¨£¤U¤@­Ó½d¨Ò¡]¦¹½d¨Ò¥H single linkage ¬°¸s»E¶ZÂ÷¡^¡G

Example 3: hierClusteringAnim01.mdata=dcData(6); data=data.input; dataNum=size(data,2); distMat=distPairwise(data, data); distMat(1:dataNum+1:dataNum^2)=inf; % Diagonal elements should always be inf. method='single'; % 'single' or 'complete' level=hierClustering(distMat, method); hierClusteringAnim(data, distMat, level);

¨Æ¹ê¤W§Ú­Ì¥i¥HÃÒ©ú¡A¦pªG¸ê®Æ¶°¬O¤Gºû¥­­±¤WªºÂI©Ò¦¨ªº¶°¦X¡A¦Ó¥B§Ú­Ì±Ä¥Î single linkage¡A«h©Ò±o¨ìªº³sµ²¹Ï§Y¬O³o¨ÇÂIªº minimum spanning tree¡C

¾ãÅé¨Ó»¡¡A¶¥¼h¦¡¤À¸sªkªºÀuÂI¦p¤U¡G

¦ý¬O¶¥¼h¦¡¤À¸sªk¤]¦³¯ÊÂI¡A¥¦³q±`¥u¾A¥Î©ó¤Ö¶q¸ê®Æ¡A«ÜÃø³B²z¤j¶q¸ê®Æ¡C
Data Clustering and Pattern Recognition (¸ê®Æ¤À¸s»P¼Ë¦¡¿ë»{)