## 正交 Procrustes 問題

Procrustes 和他的客人(點圖看動畫) From http://www.mythweb.com/today/media/procrustes07.gif

$\displaystyle \Vert A\Vert_{F}=\sqrt{\mathrm{trace}(A^TA)}=\sqrt{\sum_{i=1}^m\sum_{j=1}^n a_{ij}^2}$

\displaystyle\begin{aligned} \Vert B-AQ\Vert_F^2&=\hbox{trace}\left((B-AQ)^T(B-AQ)\right)\\ &=\hbox{trace}(B^TB-B^TAQ-Q^TA^TB+Q^TA^TAQ)\\ &=\hbox{trace}(B^TB)-\hbox{trace}(B^TAQ)-\hbox{trace}(Q^TA^TB)+\hbox{trace}(Q^TA^TAQ)\\ &=\hbox{trace}(B^TB)-\hbox{trace}(QB^TA)-\hbox{trace}(A^TBQ^T)+\hbox{trace}(A^TAQQ^T)\\ &=\hbox{trace}(B^TB)-\hbox{trace}(QB^TA)^T-\hbox{trace}(A^TBQ^T)+\hbox{trace}(A^TA)\\ &=\Vert B\Vert_F^2-2\hbox{trace}(A^TBQ^T)+\Vert A\Vert_F^2. \end{aligned}

$\displaystyle \max_{QQ^T=I}\hbox{trace}(A^TBQ^T)$

$\displaystyle L=\hbox{trace}(A^TBQ^T)-\hbox{trace}(\Lambda(QQ^T-I))$

\displaystyle\begin{aligned} \frac{\partial L}{\partial Q}&=\frac{\partial\hbox{trace}(A^TBQ^T)}{\partial Q}-\frac{\partial\hbox{trace}(Q^T\Lambda Q)}{\partial Q}+\frac{\partial\hbox{trace}(\Lambda)}{\partial Q}\\ &=A^TB-(\Lambda+\Lambda^T)Q. \end{aligned}

$\displaystyle U\Sigma V^TQ^T=QV\Sigma^TU^T=QV\Sigma U^T$

\displaystyle\begin{aligned} \hbox{trace}(A^TBQ^T)&=\hbox{trace}(U\Sigma V^TVU^T)=\hbox{trace}(U\Sigma U^T)\\ &=\hbox{trace}(\Sigma UU^T)=\hbox{trace}\Sigma=\sum_{i=1}^n\sigma_i(A^TB).\end{aligned}

$\displaystyle \Vert B-AQ\Vert_F=\Vert B-AUV^T\Vert_F=\sqrt{\Vert A\Vert_F^2+\Vert B\Vert_F^2-2\sum_{i=1}^n\sigma_i(A^TB)}$

$\hbox{trace}(A^TBQ^T)=\hbox{trace}(U\Sigma V^TQ^T)=\hbox{trace}(\Sigma V^TQ^TU)=\hbox{trace}(\Sigma P)$

$\displaystyle \hbox{trace}(\Sigma P)=\sum_{i=1}^n\sigma_i(A^TB)p_{ii}$

$\sum_{i=1}^np^2_{ij}=1$$\sum_{j=1}^np^2_{ij}=1$，上式的最大值必發生於 $p_{ii}=1$$i=1,\ldots,n$，即 $P=I$。因此，$V^TQ^TU=I$，解得 $Q=UV^T$

$\displaystyle A=\begin{bmatrix} 1.2&2.1\\ 2,9&4.3\\ 5.2&6.1\\ 6.8&8.1 \end{bmatrix},~~ B=\begin{bmatrix} 1&2\\ 3&4\\ 5&6\\ 7&8 \end{bmatrix}$

$\displaystyle Q=\left[\!\!\begin{array}{cr} 0.9999 &-0.0126\\ 0.0126&0.9999 \end{array}\!\!\right]$

\displaystyle\begin{aligned} \Vert B-PAQ\Vert_F^2&=\hbox{trace}\left((B-PAQ)^T(B-PAQ)\right)\\ &=\hbox{trace}(B^TB-B^TPAQ-Q^TA^TP^TB+Q^TA^TP^TPAQ)\\ &=\hbox{trace}(B^TB)-\hbox{trace}(B^TPAQ)-\hbox{trace}(Q^TA^TP^TB)+\hbox{trace}(Q^TA^TAQ)\\ &=\hbox{trace}(B^TB)-\hbox{trace}(QB^TPA)-\hbox{trace}(A^TP^TBQ^T)+\hbox{trace}(A^TAQQ^T)\\ &=\hbox{trace}(B^TB)-\hbox{trace}(QB^TPA)^T-\hbox{trace}(A^TP^TBQ^T)+\hbox{trace}(A^TA)\\ &=\Vert B\Vert_F^2-2\hbox{trace}(A^TP^TBQ^T)+\Vert A\Vert_F^2, \end{aligned}

$A$$B$ 的奇異值分解為 $A=U_1\Sigma_1V_1^T$$B=U_2\Sigma_2V_2^T$，其中 $U_1$$U_2$$m\times m$ 階正交矩陣，$V_1$$V_2$$n\times n$ 階正交矩陣，$\Sigma_1=\begin{bmatrix} S_1&0\\ 0&0 \end{bmatrix}$$\Sigma_2=\begin{bmatrix} S_2&0\\ 0&0 \end{bmatrix}$$m\times n$ 階奇異值矩陣 (零列和零行必須隨 $m$$n$ 的大小調整)，$S_1=\hbox{diag}(\sigma_1(A),\ldots,\sigma_s(A))$$S_2=\hbox{diag}(\sigma_1(B),\ldots,\sigma_s(B))$$s=\min\{m,n\}$。代入奇異值分解，使用跡數循環不變性，

\displaystyle\begin{aligned} \hbox{trace}(A^TP^TBQ^T)&=\hbox{trace}(V_1\Sigma_1^TU_1^TP^TU_2\Sigma_2V_2^TQ^T)\\ &=\hbox{trace}(\Sigma_1^TU_1^TP^TU_2\Sigma_2V_2^TQ^TV_1)\\ &=\hbox{trace}(\Sigma_1^TX^T\Sigma_2Y^T), \end{aligned}

$\displaystyle \max_{XX^T=I\atop{YY^T=I}}\hbox{trace}(\Sigma_1^TX^T\Sigma_2Y^T)$

$\displaystyle L=\hbox{trace}(\Sigma_1^TX^T\Sigma_2Y^T)-\hbox{trace}(\Lambda(XX^T-I))-\hbox{trace}(M(YY^T-I))$

\displaystyle\begin{aligned} \frac{\partial L}{\partial X}&=\frac{\partial\hbox{trace}(X^T\Sigma_2Y^T\Sigma_1^T)}{\partial X}-\frac{\partial\hbox{trace}(X^T\Lambda X-\Lambda)}{\partial X}-\frac{\partial\hbox{trace}(Y^TMY-M)}{\partial X}\\ &=\Sigma_2Y^T\Sigma_1^T-(\Lambda+\Lambda^T)X, \end{aligned}

\displaystyle\begin{aligned} \frac{\partial L}{\partial Y}&=\frac{\partial\hbox{trace}(\Sigma_1^TX^T\Sigma_2Y^T)}{\partial Y}-\frac{\partial\hbox{trace}(X^T\Lambda X-\Lambda)}{\partial Y}-\frac{\partial\hbox{trace}(Y^TMY-M)}{\partial Y}\\ &=\Sigma_1^TX^T\Sigma_2-(M+M^T)Y. \end{aligned}

\displaystyle\begin{aligned} \Sigma_2Y^T\Sigma_1^TX^T&=\Lambda+\Lambda^T\\ \Sigma_1^TX^T\Sigma_2Y^T&=M+M^T. \end{aligned}

\displaystyle\begin{aligned} \Sigma_2Y^T\Sigma_1^TX^T&=X\Sigma_1Y\Sigma_2^T\\ \Sigma_1^TX^T\Sigma_2Y^T&=Y\Sigma_2^TX\Sigma_1 .\end{aligned}

$\displaystyle \Sigma_1^TX^T\Sigma_2Y^T\Sigma_1^TX^T=Y\Sigma_2^TX\Sigma_1\Sigma_1^TX^T$

$\displaystyle \Sigma_1^TX^T\Sigma_2Y^T\Sigma_1^TX^T=\Sigma_1^TX^TX\Sigma_1Y\Sigma_2^T=\Sigma_1^T\Sigma_1Y\Sigma_2^T$

$\displaystyle (Y\Sigma_2^TX)(\Sigma_1\Sigma_1^T)=(\Sigma_1^T\Sigma_1)(Y\Sigma_2^TX)$

$\displaystyle (Y\Sigma_2^TX)_{ij}\sigma_j^2(A)=\sigma_i^2(A)(Y\Sigma_2^TX)_{ij}$

$\displaystyle \hbox{trace}(A^TP^TBQ^T)=\hbox{trace}(\Sigma_1^T\Sigma_2)=\sum_{i=1}^s\sigma_i(A)\sigma_i(B)$

$\displaystyle PAQ=U_2U_1^TU_1\Sigma_1V_1^TV_1V_2^T=U_2\Sigma_1V_2^T$

\displaystyle\begin{aligned} \Vert B-PAQ\Vert_F&=\Vert B-U_2\Sigma_1V_2^T\Vert_F=\sqrt{\Vert A\Vert_F^2+\Vert B\Vert_F^2-2\sum_{i=1}^s\sigma_i(A)\sigma_i(B)}\\ &=\sqrt{\sum_{i=1}^s\sigma_i^2(A)+\sum_{i=1}^s\sigma_i^2(B)-2\sum_{i=1}^s\sigma_i(A)\sigma_i(B)}\\ &=\sqrt{\sum_{i=1}^s\left(\sigma_i(A)-\sigma_i(B)\right)^2}, \end{aligned}

$\displaystyle \Vert A\Vert_F=\sqrt{\sum_{i=1}^s\sigma_i^2(A)}$

$\displaystyle \sum_{i=1}^n\sigma_i(A^TB)\le\sum_{i=1}^{\min\{m,n\}}\sigma_i(A)\sigma_i(B)$

[1] 維基百科：Procrustes
[2] 正交 Procrustes 問題也可以表示為求一個 $m\times m$ 階正交矩陣 $Q$ 使最小化 $\Vert B-QA\Vert_F$。因為 $\Vert A\Vert_F=\hbox{trace}(A^TA)=\hbox{trace}(AA^T)=\Vert A^T\Vert_F$，推得 $\Vert B-QA\Vert_F=\Vert B^T-A^TQ^T\Vert$，其中 $Q^T$ 是一個正交矩陣。
[3] 取自 G. H. Golub and C. F. Van Loan, Matrix Computations, 2nd edition, 1989, pp 582.
[4] 維基百科：排序不等式

### 8 Responses to 正交 Procrustes 問題

1. 張盛東 說道：

老師，這種矩陣近似一般在什麼場合下使用呢？

2. 徐君 說道：

老师，你在文中说：比較等號兩邊可得 QV=U，故 Q=UV^T 為一最佳解。那这个解是不是唯一的吗？或者说正交 Procrustes 問題是不是一个凸问题？

3. 徐君 說道：

老师，你在文中说: 設上式為零，可得 A^TB=(\Lambda+\Lambda^T)Q，或 A^TBQ^T=\Lambda+\Lambda^T。这里用到了QQ^T=I这个条件。可是，在Lagrangian 函數里还能用到之前有的约束条件（比如QQ^T=I）吗？

4. 徐君 說道：
5. 徐君 說道：

正交 Procrustes 問題里的Q可以有更直观的解决方法，请看https://jianfengwang.files.wordpress.com/2015/07/proof_of_the_orthogonal_procrustes_problem.pdf

• ccjou 說道：

謝謝，確實是有更簡潔的推導法，有空我再補上來。

(見“補充一個較快捷的推導方式”該段)

6. 徐君 說道：

谢谢老师！老师这篇帖子里提到的技巧帮助我证明了我论文中的一个定理。