1. Differentiation of a function with respect to a vector
\quad Definition
For a multivariate function $f(x)=f(x_1, x_2, \cdots, x_n)$ with $x\in R^n$, the column vector
$$
\left(\dfrac{\partial f(x)}{\partial x_1},
\dfrac{\partial f(x)}{\partial x_2},
\cdots,
\dfrac{\partial f(x)}{\partial x_n}\right)^T
$$
is called the derivative, or gradient, of $f(x)$ with respect to the vector $x$. It is written $\dfrac{df(x)}{dx}$ or $\nabla_x f(x)$, and also $\operatorname{grad}\,f(x)$ or $\nabla f(x)$.
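As a quick sanity check of the definition, the sketch below approximates each partial derivative with a central difference and compares the result with a hand-computed gradient. The test function $f(x)=x_1^2+3x_1x_2$, the helper name `numerical_gradient`, and the step size `h` are illustrative assumptions, not part of the original text.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of a scalar function f at x (a list of floats)."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h   # perturb only the i-th coordinate
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


def f(x):
    # Hypothetical example function: f(x) = x1^2 + 3*x1*x2
    return x[0] ** 2 + 3 * x[0] * x[1]


def grad_f(x):
    # Gradient from the definition: (df/dx1, df/dx2) = (2*x1 + 3*x2, 3*x1)
    return [2 * x[0] + 3 * x[1], 3 * x[0]]


point = [1.0, 2.0]
approx = numerical_gradient(f, point)
exact = grad_f(point)
print("finite difference:", approx)
print("analytic gradient:", exact)
```

The two printed vectors should agree up to the rounding error of the finite-difference approximation.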
\quad(1) If $f(x)=Ax$, then $\nabla f(x)=A^T$. In the derivation below, $\alpha_i$ denotes the $i$-th column of $A$ (a column vector). Here $f$ is vector-valued, and the $i$-th row of $\nabla f(x)$ is $\left(\dfrac{\partial f(x)}{\partial x_i}\right)^T=\alpha_i^T$.
$$
\begin{aligned}
&f(x)=(\alpha_1,\alpha_2,\cdots,\alpha_n)(x_1,x_2,\cdots,x_n)^T
=\alpha_1x_1+\alpha_2x_2+\cdots+\alpha_nx_n \\
&\nabla f(x)=(\alpha_1,\alpha_2,\cdots,\alpha_n)^T=A^T
\end{aligned}
$$
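A small worked instance may make rule (1) concrete. The $2\times 2$ matrix below is an arbitrary illustrative choice, not taken from the original text.
$$
\begin{aligned}
&A=\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix},\quad
f(x)=Ax=\begin{pmatrix} x_1+2x_2 \\ 3x_1+4x_2 \end{pmatrix} \\
&\dfrac{\partial f(x)}{\partial x_1}=\begin{pmatrix} 1 \\ 3 \end{pmatrix},\quad
\dfrac{\partial f(x)}{\partial x_2}=\begin{pmatrix} 2 \\ 4 \end{pmatrix} \\
&\nabla f(x)=\begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}=A^T
\end{aligned}
$$
Each partial derivative is a column of $A$, and placing it as the corresponding row of $\nabla f(x)$ gives $A^T$.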
\quad(2) If $f(x)=x^TA$, then $\nabla f(x)=A$. In the derivation below, $\alpha_i$ denotes the $i$-th row of $A$ (a row vector).
$$
\begin{aligned}
&f(x)=(x_1,x_2,\cdots,x_n)
\begin{pmatrix}
\alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n
\end{pmatrix}
=\alpha_1x_1+\alpha_2x_2+\cdots+\alpha_nx_n \\
&\nabla f(x)=
\begin{pmatrix}
\alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n
\end{pmatrix}
=A
\end{aligned}
$$
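For rule (2), the same kind of check can be done by hand. The matrix is again an arbitrary illustrative choice.
$$
\begin{aligned}
&A=\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix},\quad
f(x)=x^TA=\begin{pmatrix} x_1+3x_2 & 2x_1+4x_2 \end{pmatrix} \\
&\dfrac{\partial f(x)}{\partial x_1}=\begin{pmatrix} 1 & 2 \end{pmatrix},\quad
\dfrac{\partial f(x)}{\partial x_2}=\begin{pmatrix} 3 & 4 \end{pmatrix} \\
&\nabla f(x)=\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}=A
\end{aligned}
$$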
\quad(3) If $f(x)=y^TAx$, then $\nabla f(x)=A^Ty$.
\quad(4) If $f(x)=x^TAx$, then $\nabla f(x)=(A^T+A)x$.
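Rule (3) is stated without a derivation, so a short worked instance may help. It assumes that $y$ does not depend on $x$, and the matrix is an arbitrary illustrative choice.
$$
\begin{aligned}
&A=\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix},\quad
y=\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \\
&f(x)=y^TAx=(y_1+3y_2)x_1+(2y_1+4y_2)x_2 \\
&\nabla f(x)=\begin{pmatrix} y_1+3y_2 \\ 2y_1+4y_2 \end{pmatrix}
=\begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}=A^Ty
\end{aligned}
$$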
\quad\quad\quad Method 1: expand the quadratic form and differentiate with respect to each component.
$$
\begin{aligned}
f(x)&=
\begin{pmatrix}
x_1 & x_2 & \cdots & x_n
\end{pmatrix}
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} \\
\end{pmatrix}
\begin{pmatrix}
x_1 \\ x_2 \\ \vdots \\ x_n
\end{pmatrix}
\\&=
\left(\sum_{i=1}^n x_ia_{i1},\sum_{i=1}^n x_ia_{i2},\cdots,\sum_{i=1}^n x_ia_{in}\right)
(x_1,x_2,\cdots,x_n)^T\\
&=\sum_{j=1}^n\sum_{i=1}^n x_i a_{ij} x_j \\
\dfrac{\partial f(x)}{\partial x_k}&=\sum_{i=1}^n x_ia_{ik}+\sum_{j=1}^n a_{kj}x_j
=(A^Tx)_k+(Ax)_k,\quad k=1,2,\cdots,n \\
\nabla f(x)&=(A+A^T)x
\end{aligned}
$$
\quad\quad\quad Method 2: apply the product rule, holding one factor of $x$ fixed in each term.
$$
\nabla f(x)=\frac{d(x^TAx)}{dx}=\frac{(dx^T)Ax}{dx}+\frac{x^TA\,dx}{dx}=Ax+A^Tx=(A+A^T)x
$$
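The result for $f(x)=x^TAx$ can also be checked numerically. The sketch below uses plain Python lists and a deliberately non-symmetric matrix, so that $(A+A^T)x$ differs from $2Ax$. The matrix, the evaluation point, and the helper names `quadratic_form`, `analytic_gradient`, and `finite_difference_gradient` are illustrative assumptions.

```python
def quadratic_form(A, x):
    """Evaluate f(x) = x^T A x as the double sum over x_i * a_ij * x_j."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def analytic_gradient(A, x):
    """Return (A + A^T) x, the gradient of x^T A x."""
    n = len(x)
    return [sum((A[k][j] + A[j][k]) * x[j] for j in range(n)) for k in range(n)]


def finite_difference_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of a scalar function f."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


# Hypothetical test data: A is non-symmetric on purpose.
A = [[1.0, 2.0],
     [3.0, 4.0]]
x = [1.0, -2.0]

exact = analytic_gradient(A, x)
approx = finite_difference_gradient(lambda v: quadratic_form(A, v), x)
print("(A + A^T) x      :", exact)
print("finite difference:", approx)
```

When $A$ is symmetric, $A^T=A$ and the gradient reduces to $2Ax$.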