696 words
3 minutes
工科高等代数3.24

SVD(SingularValueDecomposition)SVD(Singular-Value-Decomposition)#

定理:每个m×nm\times n实矩阵AA都可以写成#

A=PSQTA=PSQ^{T}
其中P,QP,Qmm阶与nn阶正交矩阵,

S=[σ1...σr0]m×n\begin{align} S=\begin{bmatrix} \sigma_1 & & & \\ &... & & \\ & &\sigma_r & \\ & & &0 \\ \end{bmatrix}_{m\times n} \end{align}

r=rank(A),σ1σ2......σr0r=rank(A),\sigma_1\ge\sigma_2\ge...\ge...\sigma_r\ge0
其中σi\sigma_iATAA^TA正特征值的算术平方根
SSAA的奇异矩阵,σi\sigma_iAA的奇异值,对应的向量为奇异向量
AAm×nm\times n实矩阵,由于AATAA^T实对称
存在正交矩阵PP使得

AAT=[β1,β2,...,βm][λ1...λr0]PT\begin{align} AA^T=[\beta_1,\beta_2,...,\beta_m] \begin{bmatrix} \lambda_1 & & & \\ &... & & \\ & &\lambda_r & \\ & & &0 \\ \end{bmatrix}P^T \end{align}

AATAA^T的特征值非负:
λ1λ2...λr>λr+1=...=0\lambda_1\ge\lambda_2\ge...\ge\lambda_r>\lambda_{r+1}=...=0

(ATP)TATP=PTAATP=[λ1...λr0]\begin{align} (A^TP)^TA^TP=P^TAA^TP= \begin{bmatrix} \lambda_1 & & & \\ &... & & \\ & &\lambda_r & \\ & & &0 \\ \end{bmatrix} \end{align}

ATPA^TP的列向量ATβ,...,ATβmA^T\beta,...,A^T\beta_m
两两正交且长度分别为
λ1,...,λr,0,...,0\sqrt{\lambda_1},...,\sqrt{\lambda_r},0,...,0
γi=1λiATβi\gamma_i=\frac{1}{\sqrt{\lambda_i}}A^T\beta_i
γ1,...,γr\gamma_1,...,\gamma_r可扩充成RnR^n的标准正交基
γ1,...,γr,...,γn\gamma_1,...,\gamma_r,...,\gamma_n

{γr+1γ1=0γr+1γ2=0...γr+1γr=0\begin{cases} \gamma_{r+1}\gamma_1=0\\ \gamma_{r+1}\gamma_2=0\\ ...\\ \gamma_{r+1}\gamma_r=0\\ \end{cases}

解方程即可
记正交矩阵Q=[γ1,...,γn]Q=[\gamma_1,...,\gamma_n]我们有

ATP=Q[λ1...λr0]n×m\begin{align} A^TP =Q \begin{bmatrix} \sqrt{\lambda_1} & & & \\ &... & & \\ & &\sqrt{\lambda_r} & \\ & & &0 \\ \end{bmatrix}_{n\times m} \end{align}

故每个m×nm\times n实对称AA都能写成

A=PSQT=[β1,β2,...,βm][λ1...λr0]m×n[γ1,γ2,...,γn]T=λ1β1γ1T+λ2β2γ2T+...+λrβrγrT\begin{align} A&=PSQ^{T}\\ &=[\beta_1,\beta_2,...,\beta_m] \begin{bmatrix} \sqrt{\lambda_1} & & & \\ &... & & \\ & &\sqrt{\lambda_r} & \\ & & &0 \\ \end{bmatrix}_{m\times n}[\gamma_1,\gamma_2,...,\gamma_n]^T\\ &=\sqrt{\lambda_1}\beta_1\gamma_1^T+\sqrt{\lambda_2}\beta_2\gamma_2^T+...+\sqrt{\lambda_r}\beta_r\gamma_r^T \end{align}

空间r+rn+rm>O((m+n)r)r+rn+rm->O((m+n)r)
后面的奇异值很小接近零,

在秩kk的限制下矩阵AA的最佳逼近:#

Ak=λ1β1γ1T+λ2β2γ2T+...+λkβkγkTAA_k=\sqrt{\lambda_1}\beta_1\gamma_1^T+\sqrt{\lambda_2}\beta_2\gamma_2^T+...+\sqrt{\lambda_k}\beta_k\gamma_k^T\rightarrow A

AATP=[β1,β2,...,βm][λ1...λr0]m×n\begin{align} AA^TP=[\beta_1,\beta_2,...,\beta_m] \begin{bmatrix} \sqrt{\lambda_1} & & & \\ &... & & \\ & &\sqrt{\lambda_r} & \\ & & &0 \\ \end{bmatrix}_{m\times n} \end{align}


A[ATβ1,ATβ2,...,ATβr]=[λ1β1,λ2β2,...,λrβr]A[A^T\beta_1,A^T\beta_2,...,A^T\beta_r]=[\lambda_1\beta_1,\lambda_2\beta_2,...,\lambda_r\beta_r]
A[γ1,γ2,...,γr]=[λ1β1,λ2β2,...,λrβr]A[\gamma_1,\gamma_2,...,\gamma_r]=[\sqrt{\lambda_1}\beta_1,\sqrt{\lambda_2}\beta_2,...,\sqrt{\lambda_r}\beta_r]

例:求下矩阵SVDSVD分解#

A=[111111]\begin{align} A=\begin{bmatrix} 1 &1\\ 1 &1\\ 1 &1\\ \end{bmatrix} \end{align}
1.ATAA^TA#
ATA=[3333]\begin{align} A^TA=\begin{bmatrix} 3 &3\\ 3 &3\\ \end{bmatrix} \end{align}

λ1=6,λ2=0\lambda_1=6,\lambda_2=0
σ1=6,σ2=0\sigma_1=\sqrt6,\sigma_2=0

2.右奇异向量#
λ1=6,v1=12[11]\begin{align} \lambda_1=6,v_{1}=\frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \\ \end{bmatrix} \end{align}λ1=0,v2=12[11]\begin{align} \lambda_1=0,v_{2}=\frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ -1 \\ \end{bmatrix} \end{align}
3.求左奇异向量#
u1=Av1σ1=16[222]=13[111]\begin{align} u_{1}=\frac{Av_1}{\sigma_1}=\frac{1}{\sqrt{6}}\begin{bmatrix} 2 \\ 2 \\ 2 \\ \end{bmatrix} =\frac{1}{\sqrt{3}}\begin{bmatrix} 1 \\ 1 \\ 1 \\ \end{bmatrix} \end{align}
补充正交基:#
u2=12[110],u3=16[112]\begin{align} u_{2}=\frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ -1 \\ 0 \\ \end{bmatrix}, u_3=\frac{1}{\sqrt{6}}\begin{bmatrix} 1 \\ 1 \\ -2 \\ \end{bmatrix} \end{align}A=[u1,u2,u3][600000][v1,v2]T\begin{align} A=[u_1,u_2,u_3]\begin{bmatrix} \sqrt{6} &0\\ 0 &0\\ 0 &0\\ \end{bmatrix}[v_1,v_2]^T \end{align}

MoorePenroseMoore--Penrose广义逆#

实数域下MPMP广义逆的定义
AAA=AAA^{\dagger}A=A
AAA=AA^\dagger AA^\dagger=A^\dagger
(AA)T=AA(AA^{\dagger})^T=AA^\dagger
(AA)T=AA(A^{\dagger}A)^T=A^\dagger A
第一个满足(AAI)A=0(AA^\dagger-I)A=0
存在且唯一,
可逆时A=A1A^\dagger=A^{-1}
(A)=A(A^\dagger)^\dagger=A
考虑如何解析的求出来

定理:#

A=PSQT=(PQT)QSQTA=PSQ^T=(PQ^T)QSQ^T
每个实方阵都可以写成一个正交矩阵与一个实对称矩阵的乘积
实矩阵的MPM-P广义逆:若A=PSQTA=PSQ^T,则

A=Q[σ11...σr10]n×mPT\begin{align} A^\dagger=Q\begin{bmatrix} \sigma_1^{-1} & & & \\ &... & & \\ & &\sigma_r^{-1} & \\ & & &0 \\ \end{bmatrix}_{n\times m}P^T \end{align}AA=P[Ir000]PT\begin{align} AA^\dagger=P\begin{bmatrix} I_{r} &0 \\ 0 &0 \\ \end{bmatrix}P^T \end{align}

证明:

1.存在性:奇异值分解#
2.唯一性:#

X,YCn×mX,Y\in C^{n\times m}均满足上述条件
A=AYAX=XAXX=X(AYA)X=XAYAXA=AYA\rightarrow X=XAX--X=X(AYA)X=XAYAX
Y=YAYXAYAX=XA(YAY)AX=(XAY)A(YAX)Y=YAY\rightarrow XAYAX=XA(YAY)AX=(XAY)A(YAX)
(AX)T=AX,(AY)T=AY(AX)^T=AX,(AY)^T=AY
XAY=(XAY)T=YAX()XAY=(XAY)^T=YAX(*)
于是X=(XAY)A(YAX)=(XAY)A(XAY)X=(XAY)A(YAX)=(XAY)A(XAY)
同理Y=(XAY)A(XAY)Y=(XAY)A(XAY)
X=YX=Y
注:从()(*)开始推导错误

X=XAX=X(AX)T=XXTAT=XXT(AYA)T=XXTATYTAT=X(XTAT)(YTAT)=X(AX)T(AY)T=XAXAY=XAY\begin{align} X&=XAX\\ &=X(AX)^T\\ &=XX^{T}A^T\\ &=XX^{T}(AYA)^T\\ &=XX^TA^TY^TA^T\\ &=X(X^TA^T)(Y^TA^T)\\ &=X(AX)^T(AY)^T\\ &=XAXAY\\ &=XAY \end{align}Y=YAY=(YA)TY=ATYTY=(AXA)TYTY=ATXTATYTY=XAYAY=XAY=Y\begin{align} Y&=YAY\\ &=(YA)^TY\\ &=A^TY^TY\\ &=(AXA)^TY^TY\\ &=A^TX^TA^TY^TY\\ &=XAYAY\\ &=XAY=Y \end{align}

ARm×nA\in R^{m\times n},定义P=AAP=AA^{\dagger}.证明PPcol(A)col(A)的正交投影算子:#

P2=P,PT=P,range(P)=col(A)P^2=P,P^T=P,range(P)=col(A)
P2=AAAA=(AAA)A=AA=PP^{2}=AA^{\dagger}AA^{\dagger}=(AA^{\dagger}A)A^{\dagger}=AA^{\dagger}=P
PT=(AA)T=AA=PP^{T}=(AA^{\dagger})^{T}=AA^\dagger=P
xRn\forall x\in R^n
Px=AAx=A(Ax)col(A)Px=AA^\dagger x=A(A^\dagger x)\in col(A)
range(P)col(A)range(P)\subseteq col(A)
ycol(A),z,st.y=Azy\in col(A),\exists z,st.y=Az
Py=AAAz=Az=yPy=AA^{\dagger}Az=Az=y
col(A)range(P)col(A)\subseteq range(P)
range(P)=col(A)range(P)=col(A)

证明:trace(ATA)=iσi2trace(A^TA)=\sum_{i}\sigma_i^2#

A=UΣVT=(aij)m×nA=U\Sigma V^{T}=(a_{ij})_{m\times n}
tr(ATA)=tr(VΣTUTUΣVT)=tr(VΣTΣVT)=tr(ΣVTVΣT)=tr(ΣΣT)tr(A^TA)=tr(V\Sigma^TU^TU\Sigma V^T)=tr(V\Sigma^T\Sigma V^T)=tr(\Sigma V^T V\Sigma^T)=tr(\Sigma\Sigma^T)
i,jai,j2=iσi2\sum_{i,j}a_{i,j}^{2}=\sum_{i}\sigma_i^2
ARm×n,A\in R^{m\times n},证明:
maxx2=1Ax2=σmax(A)max_{||x||_2=1}||Ax||_2=\sigma_{max}(A)
只要看Ax22||Ax||_2^2的最大值
Ax22=Ax,Ax=xTATAx||Ax||_2^2=\langle Ax,Ax\rangle=x^TA^TAx
maxx2=1Ax22=λmax(A)max_{||x||_2=1}||Ax||_2^2=\lambda_{max}(A)
maxx2=1Ax2=σmax(A)max_{||x||_2=1}||Ax||_2=\sigma_{max}(A)

线性算子 AA#

A(α1x1+α2x2)=α1Ax1+α2Ax2A(\alpha_{1}x_{1}+\alpha_{2}x_2)=\alpha_{1}Ax_1+\alpha_2Ax_2

BigOBig-\mathcal{O}#

f(n)=O(g(n))f(n)=\mathcal{O}(g(n))
f(n)Cg(n),C>0f(n)\le Cg(n),\exists C>0
1<logn<n<nlogn<n2<2n1<\log n<n<n\log n<n^2<2^n