10-2 Linear Regression: Surface Fitting

A general multiple-input, single-output linear regression model can be written as

$$ y = f(\mathbf{x}) = \theta_1f_1(\mathbf{x}) + \theta_2f_2(\mathbf{x}) + \cdots + \theta_nf_n(\mathbf{x}) $$

where $\mathbf{x}$ is the input (a vector), $y$ is the output (a scalar), $\theta_1$, $\theta_2$, $\cdots$, $\theta_n$ are the unknown adjustable parameters, and $f_i(\mathbf{x})$, $i = 1, \cdots, n$, are known functions called basis functions. Suppose the given data points are $(\mathbf{x}_i, y_i)$, $i = 1, \cdots, m$. These points are called sample data or training data. Substituting them into the model gives:

$$ \left\{ \begin{matrix} y_1 & = & f(\mathbf{x}_1) & = & \theta_1f_1(\mathbf{x}_1) + \theta_2f_2(\mathbf{x}_1) + \cdots + \theta_nf_n(\mathbf{x}_1) \\ \vdots & & \vdots & & \vdots \\ y_m & = & f(\mathbf{x}_m) & = & \theta_1f_1(\mathbf{x}_m) + \theta_2f_2(\mathbf{x}_m) + \cdots + \theta_nf_n(\mathbf{x}_m) \\ \end{matrix} \right. $$

or, in matrix form:

$$ \underbrace{ \left[ \begin{matrix} f_1(\mathbf{x}_1) & \cdots & f_n(\mathbf{x}_1) \\ f_1(\mathbf{x}_2) & \cdots & f_n(\mathbf{x}_2) \\ \vdots & \vdots & \vdots\\ f_1(\mathbf{x}_m) & \cdots & f_n(\mathbf{x}_m) \\ \end{matrix} \right] }_{\mathbf{A}} \underbrace{ \left[ \begin{matrix} \theta_1\\ \vdots\\ \theta_n\\ \end{matrix} \right] }_{\mathbf{\theta}} = \underbrace{ \left[ \begin{matrix} y_1\\ y_2\\ \vdots\\ y_m\\ \end{matrix} \right] }_{\mathbf{y}} $$
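As a minimal sketch of how the matrix $\mathbf{A}$ can be assembled in MATLAB, the snippet below evaluates each basis function at every sample and stores the result in one column. The four basis functions and the five sample inputs are hypothetical choices made only for illustration; they do not come from the text.

```matlab
% Hypothetical basis functions for a 2-D input (illustration only):
% f1 = 1, f2 = x1, f3 = x2, f4 = x1*x2
basis = {@(X) ones(size(X, 1), 1), ...
         @(X) X(:, 1), ...
         @(X) X(:, 2), ...
         @(X) X(:, 1) .* X(:, 2)};

% Hypothetical sample inputs, one data point per row (m = 5)
X = [0 0; 1 0; 0 1; 1 1; 2 3];

m = size(X, 1);          % number of data points
n = numel(basis);        % number of basis functions (and of parameters)
A = zeros(m, n);
for j = 1:n
    A(:, j) = basis{j}(X);   % column j holds f_j evaluated at every sample
end
disp(A)
```

Row $i$ of `A` then contains $f_1(\mathbf{x}_i), \ldots, f_n(\mathbf{x}_i)$, matching the layout of the matrix above.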

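In general $m > n$ (the number of data points is much larger than the number of adjustable parameters), so the equation above has no exact solution. To make it hold, we must add an error vector $\mathbf{e}$:

$$ \mathbf{A}\mathbf{\theta}=\mathbf{y}+\mathbf{e} $$

The squared error can then be written as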

$$ E(\mathbf{\theta})=\|\mathbf{e}\|^2=\mathbf{e}^T\mathbf{e}= (\mathbf{A}\mathbf{\theta}-\mathbf{y})^T (\mathbf{A}\mathbf{\theta}-\mathbf{y}) $$

Taking the partial derivatives of $E(\mathbf{\theta})$ with respect to $\mathbf{\theta}$ and setting them to zero yields a system of $n$ linear equations in $n$ unknowns. In matrix notation, the optimal value of $\mathbf{\theta}$ can be expressed as

$$ \hat{\mathbf{\theta}} = (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T\mathbf{y} $$

(For the derivation of this formula, see the last section of this chapter.)
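For quick reference, the main steps can be sketched as follows, under the assumption (not stated in the text) that $\mathbf{A}$ has full column rank, so that $\mathbf{A}^T\mathbf{A}$ is invertible. Expanding the squared error and setting its gradient to zero gives

$$ \begin{array}{rcl} E(\mathbf{\theta}) & = & \mathbf{\theta}^T\mathbf{A}^T\mathbf{A}\mathbf{\theta} - 2\mathbf{y}^T\mathbf{A}\mathbf{\theta} + \mathbf{y}^T\mathbf{y}\\ \nabla_{\mathbf{\theta}} E(\mathbf{\theta}) & = & 2\mathbf{A}^T\mathbf{A}\mathbf{\theta} - 2\mathbf{A}^T\mathbf{y} = \mathbf{0}\\ \mathbf{A}^T\mathbf{A}\mathbf{\theta} & = & \mathbf{A}^T\mathbf{y}\\ \end{array} $$

The last line is the system of $n$ linear equations in $\theta_1, \ldots, \theta_n$; left-multiplying both sides by $(\mathbf{A}^T\mathbf{A})^{-1}$ gives $\hat{\mathbf{\theta}}$.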

In practice, we can use MATLAB's left division (the backslash operator) directly to compute the optimal value of $\mathbf{\theta}$, that is, $\hat{\mathbf{\theta}} = \mathbf{A} \backslash \mathbf{y}$.

Hint
In theory, the optimal $\mathbf{\theta}$ is $(\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T\mathbf{y}$. However, computing $\mathbf{A}^T\mathbf{A}$ produces squared terms, so inverting it is prone to numerical error inside the computer. When MATLAB evaluates a left division, it chooses the most suitable method according to the properties of the matrix $\mathbf{A}$, and therefore returns a more stable and accurate numerical solution.
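The effect described in this hint can be observed with a small experiment. The sketch below is an illustration only: the polynomial basis $1, t, \ldots, t^7$ and the all-ones parameter vector are hypothetical choices, picked because they give a poorly conditioned $\mathbf{A}$. The output is noise-free, so the exact answer is known and any deviation is purely numerical.

```matlab
% Hypothetical ill-conditioned problem: polynomial basis 1, t, ..., t^7
t = linspace(0, 1, 50)';
A = [ones(size(t)), t, t.^2, t.^3, t.^4, t.^5, t.^6, t.^7];
thetaTrue = ones(8, 1);
y = A * thetaTrue;                  % noise-free output, exact answer is thetaTrue

thetaNormal  = (A'*A) \ (A'*y);     % solves the normal equations (forms A'*A explicitly)
thetaLeftDiv = A \ y;               % left division applied to A directly

fprintf('cond(A)     = %g\n', cond(A));
fprintf('cond(A''*A)  = %g\n', cond(A'*A));
fprintf('error, normal equations: %g\n', norm(thetaNormal - thetaTrue));
fprintf('error, left division:    %g\n', norm(thetaLeftDiv - thetaTrue));
```

In exact arithmetic the 2-norm condition number of $\mathbf{A}^T\mathbf{A}$ is the square of that of $\mathbf{A}$, which is one way to read the "squared terms" mentioned in the hint. On a typical run the error of the normal-equation solution is expected to be noticeably larger than that of `A\y`, although the exact figures depend on the platform.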

The following uses the "peaks" function as an example of general linear regression. Typing peaks at the MATLAB prompt plots a surface with several hills and valleys.

The equation of this function is:

$$ z = 3(1-x)^2 e^{-x^2-(y+1)^2}-10\left(\frac{x}{5}-x^3-y^5\right) e^{-x^2-y^2}- \frac{1}{3} e^{-(x+1)^2-y^2} $$

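In the discussion below, we assume that the basis functions of the model are known and that the training data contain normally distributed noise. The function above can therefore be written as: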

$$ \begin{array}{rcl} z & = & 3(1-x)^2 e^{-x^2-(y+1)^2}-10\left(\frac{x}{5}-x^3-y^5\right) e^{-x^2-y^2}- \frac{1}{3} e^{-(x+1)^2-y^2} + noise\\ & = & 3 f_1(x, y) - 10 f_2(x, y) - \frac{1}{3} f_3(x, y) + noise\\ & = & \theta_1 f_1(x, y) + \theta_2 f_2(x, y) + \theta_3 f_3(x, y) + noise\\ \end{array} $$

Here we treat $\theta_1$, $\theta_2$, and $\theta_3$ as unknown parameters, and $noise$ is normally distributed noise with zero mean and unit variance. For example, 100 training data points can be generated as follows:

Example 1: 10-曲線擬合與迴歸分析/peaks01.m

```matlab
pointNum = 10;
[xx, yy, zz] = peaks(pointNum);
zz = zz + randn(size(zz));    % add noise
surf(xx, yy, zz);
axis tight
```

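In the example above, randn is used to add normally distributed noise. The resulting plot shows the training data we collected. Because the noise is large, it differs considerably from the original noise-free surface. We now use the known basis functions to find the best $\theta_1$, $\theta_2$, and $\theta_3$, as in the following example: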

Example 2: 10-曲線擬合與迴歸分析/peaks02.m

```matlab
pointNum = 10;
[xx, yy, zz] = peaks(pointNum);
zz = zz + randn(size(zz))/10;    % add noise
x = xx(:);    % convert to a column vector
y = yy(:);    % convert to a column vector
z = zz(:);    % convert to a column vector
A = [(1-x).^2.*exp(-(x.^2)-(y+1).^2), (x/5-x.^3-y.^5).*exp(-x.^2-y.^2), exp(-(x+1).^2-y.^2)];
theta = A\z   % best theta values
```

Output:

```
theta =
    3.0392
  -10.0407
   -0.4093
```

The $\mathbf{\theta}$ values found this way are quite close to the true values $\left(3, -10, -\frac{1}{3} \right)$. With these parameters, we can evaluate the model on a denser grid of points to obtain the fitted surface, as in the following example:

Example 3: 10-曲線擬合與迴歸分析/peaks03.m

```matlab
pointNum = 10;
[xx, yy, zz] = peaks(pointNum);
zz = zz + randn(size(zz))/10;       % add noise
x = xx(:); y = yy(:); z = zz(:);    % convert to column vectors
A = [(1-x).^2.*exp(-(x.^2)-(y+1).^2), (x/5-x.^3-y.^5).*exp(-x.^2-y.^2), exp(-(x+1).^2-y.^2)];
theta = A\z;                        % best theta values

% Plot the predicted surface
pointNum = 31;
[xx, yy] = meshgrid(linspace(-3, 3, pointNum), linspace(-3, 3, pointNum));
x = xx(:); y = yy(:);               % convert to column vectors
A = [(1-x).^2.*exp(-(x.^2)-(y+1).^2), (x/5-x.^3-y.^5).*exp(-x.^2-y.^2), exp(-(x+1).^2-y.^2)];
zz = reshape(A*theta, pointNum, pointNum);
surf(xx, yy, zz);
axis tight
```

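The plot produced by this example shows that the fitted surface is quite close to the original one. The main reason is that we guessed the right basis functions (or, more accurately, we peeked at the correct ones), and therefore obtained a very good surface fit. In general, if the correct basis functions are unknown and are chosen arbitrarily, it is hard to fit 100 data points well with only 3 adjustable parameters.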
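To make the last point concrete, the sketch below fits the same 100 points twice: once with the three basis functions used in the examples above, and once with a hypothetical guessed basis $1, x, y$ (a plane) that has the same number of parameters. Noise is left out here, as an assumption of this sketch, so that the difference comes only from the choice of basis.

```matlab
pointNum = 10;
[xx, yy, zz] = peaks(pointNum);     % noise-free samples, to isolate the effect of the basis
x = xx(:); y = yy(:); z = zz(:);    % convert to column vectors

% The three basis functions used in the examples above
A1 = [(1-x).^2.*exp(-(x.^2)-(y+1).^2), (x/5-x.^3-y.^5).*exp(-x.^2-y.^2), exp(-(x+1).^2-y.^2)];
% Hypothetical guessed basis with the same number of parameters: 1, x, y
A2 = [ones(size(x)), x, y];

% Root-mean-square error of each least-squares fit on the training points
rmse1 = sqrt(mean((A1*(A1\z) - z).^2));
rmse2 = sqrt(mean((A2*(A2\z) - z).^2));
fprintf('RMSE with the peaks basis: %g\n', rmse1);
fprintf('RMSE with the basis 1,x,y: %g\n', rmse2);
```

With the correct basis the residual should be at the level of rounding error, whereas the plane leaves most of the variation in the surface unexplained.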

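In the examples above, we added normally distributed noise to the data points. In fact, as long as the basis functions are correct and the noise is normally distributed, the least-squares method described above approaches the true parameter values as the number of data points grows.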
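This behaviour can be checked with a small experiment. The sketch below repeats the fit from Examples 2 and 3 on progressively finer grids and prints the distance between the estimate and the true parameters $(3, -10, -\frac{1}{3})$. The grid sizes are arbitrary choices made for illustration.

```matlab
thetaTrue = [3; -10; -1/3];
gridSizes = [10, 20, 40, 80];       % hypothetical grid sizes (points per axis)
err = zeros(size(gridSizes));
for k = 1:numel(gridSizes)
    [xx, yy, zz] = peaks(gridSizes(k));
    zz = zz + randn(size(zz))/10;   % same noise level as in Examples 2 and 3
    x = xx(:); y = yy(:); z = zz(:);
    A = [(1-x).^2.*exp(-(x.^2)-(y+1).^2), (x/5-x.^3-y.^5).*exp(-x.^2-y.^2), exp(-(x+1).^2-y.^2)];
    theta = A\z;                    % least-squares estimate for this grid
    err(k) = norm(theta - thetaTrue);
    fprintf('%5d points: norm(theta - thetaTrue) = %g\n', numel(z), err(k));
end
```

Because the noise is random, the error need not decrease at every step of a single run, but it should shrink on average as the number of points increases.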


MATLAB Programming: Advanced Topics (MATLAB程式設計：進階篇)