前言
本文是 模型预测控制(2022春)lecture 1-1 Unconstrained MPC 的学习笔记,非常感谢诸兵老师的精彩课程。
符号说明
针对离散时不变线性系统:
x
k
+
1
=
A
x
k
+
B
u
k
(1)
x_{k+1}=Ax_k+Bu_k \tag{1}
xk+1=Axk+Buk(1)
x
k
x_k
xk为系统状态,
u
k
u_k
uk为系统输入。
MPC需要分析模型预测的状态,将在
k
k
k 时刻预测未来第
i
i
i 步的状体记为
x
(
i
∣
k
)
x_{(i|k)}
x(i∣k),
对应MPC在
k
k
k 时刻优化得到的第
i
i
i 步输入为
u
(
i
∣
k
)
u_{(i|k)}
u(i∣k).
最优的状态、输入和代价使用上标
∗
^*
∗ 标记。
一、无约束MPC求解
1.1 定义代价函数
定义代价函数为:
J
=
∑
i
=
1
N
(
x
(
i
∣
k
)
T
Q
x
(
i
∣
k
)
+
u
(
i
−
1
∣
k
)
T
R
u
(
i
−
1
∣
k
)
)
J=\sum_{i=1}^N \left(x_{(i|k)}^TQx_{(i|k)} + u_{(i-1|k)}^TRu_{(i-1|k)} \right)
J=i=1∑N(x(i∣k)TQx(i∣k)+u(i−1∣k)TRu(i−1∣k))
可以写为矩阵形式:
J
=
X
T
Q
X
+
U
T
R
U
(2)
J=X^T\mathcal{Q}X +U^T\mathcal{R}U \tag{2}
J=XTQX+UTRU(2)
其中
X
=
[
x
(
1
∣
k
)
x
(
2
∣
k
)
⋯
x
(
N
∣
k
)
]
T
X = [x_{(1|k)} ~ x_{(2|k)} ~ \cdots~x_{(N|k)}]^T
X=[x(1∣k) x(2∣k) ⋯ x(N∣k)]T,
U
=
[
u
(
0
∣
k
)
u
(
1
∣
k
)
⋯
u
(
N
−
1
∣
k
)
]
T
U = [u_{(0|k)} ~ u_{(1|k)} ~ \cdots~u_{(N-1|k)}]^T
U=[u(0∣k) u(1∣k) ⋯ u(N−1∣k)]T,
Q
=
d
i
a
g
(
Q
,
Q
,
⋯
,
Q
)
\mathcal{Q}=\mathrm{diag}(Q, Q, \cdots, Q)
Q=diag(Q,Q,⋯,Q),
R
=
d
i
a
g
(
R
,
R
,
⋯
,
R
)
\mathcal{R}=\mathrm{diag}(R, R, \cdots, R)
R=diag(R,R,⋯,R).
1.2 状态预测
系统的未来状态与当前状态和输入有关,故可以根据系统模型预测未来的所有状态,并求解最优的系统输入来使代价函数最小化。
根据式 (1) 进行预测,有:
x
(
1
∣
k
)
=
A
x
(
0
∣
k
)
+
B
u
(
0
∣
k
)
x
(
2
∣
k
)
=
A
x
(
1
∣
k
)
+
B
u
(
1
∣
k
)
=
A
(
A
x
(
0
∣
k
)
+
B
u
(
0
∣
k
)
)
+
B
u
(
1
∣
k
)
=
A
2
x
(
0
∣
k
)
+
A
B
u
(
0
∣
k
)
+
B
u
(
1
∣
k
)
x
(
3
∣
k
)
=
A
x
(
2
∣
k
)
+
B
u
(
2
∣
k
)
=
A
(
A
2
x
(
0
∣
k
)
+
A
B
u
(
0
∣
k
)
+
B
u
(
1
∣
k
)
)
+
B
u
(
2
∣
k
)
=
A
3
x
(
0
∣
k
)
+
A
2
B
u
(
0
∣
k
)
+
A
B
u
(
1
∣
k
)
+
B
u
(
2
∣
k
)
⋮
x
(
N
∣
k
)
=
A
N
x
(
0
∣
k
)
+
A
N
−
1
B
u
(
0
∣
k
)
+
A
N
−
2
B
u
(
1
∣
k
)
+
⋯
+
B
u
(
N
−
1
∣
k
)
\begin{align*} x_{(1|k)} &= Ax_{(0|k)} + Bu_{(0|k)} \\ x_{(2|k)} &= Ax_{(1|k)} + Bu_{(1|k)} \\ &= A(Ax_{(0|k)} + Bu_{(0|k)} ) + Bu_{(1|k)} \\ &= A^2 x_{(0|k)} + ABu_{(0|k)} + Bu_{(1|k)} \\ x_{(3|k)} &= Ax_{(2|k)} + Bu_{(2|k)} \\ &= A( A^2 x_{(0|k)} + ABu_{(0|k)} + Bu_{(1|k)} ) + Bu_{(2|k)} \\ &= A^3 x_{(0|k)} + A^2Bu_{(0|k)} + ABu_{(1|k)} +Bu_{(2|k)} \\ & \hspace{0.2cm} \vdots \\ x_{(N|k)} &= A^N x_{(0|k)} + A^{N-1}Bu_{(0|k)} + A^{N-2}Bu_{(1|k)} + \cdots + Bu_{(N-1|k)} \\ \end{align*}
x(1∣k)x(2∣k)x(3∣k)x(N∣k)=Ax(0∣k)+Bu(0∣k)=Ax(1∣k)+Bu(1∣k)=A(Ax(0∣k)+Bu(0∣k))+Bu(1∣k)=A2x(0∣k)+ABu(0∣k)+Bu(1∣k)=Ax(2∣k)+Bu(2∣k)=A(A2x(0∣k)+ABu(0∣k)+Bu(1∣k))+Bu(2∣k)=A3x(0∣k)+A2Bu(0∣k)+ABu(1∣k)+Bu(2∣k)⋮=ANx(0∣k)+AN−1Bu(0∣k)+AN−2Bu(1∣k)+⋯+Bu(N−1∣k)
写为矩阵形式为:
X
=
G
x
(
0
∣
k
)
+
H
U
(3)
X=\mathcal{G}x_{(0|k)} + \mathcal{H}U \tag{3}
X=Gx(0∣k)+HU(3)
其中,
X
=
[
x
(
1
∣
k
)
x
(
2
∣
k
)
⋯
x
(
N
∣
k
)
]
T
U
=
[
u
(
0
∣
k
)
u
(
1
∣
k
)
⋯
u
(
N
−
1
∣
k
)
]
T
G
=
[
A
A
2
⋯
A
N
]
T
H
=
[
B
0
0
⋯
0
A
B
B
0
⋯
0
A
2
B
A
B
B
⋯
0
⋮
⋮
⋮
⋱
⋮
A
N
−
1
B
A
2
B
A
B
⋯
B
]
\begin{align*} X &= [x_{(1|k)} ~ x_{(2|k)} ~ \cdots~x_{(N|k)}]^T \\ U &= [u_{(0|k)} ~ u_{(1|k)} ~ \cdots~u_{(N-1|k)}]^T \\ \mathcal{G} &= \left[ A ~ A^2 ~\cdots ~ A^N \right]^T \\ \mathcal{H} &= \begin{bmatrix} B & 0 & 0 & \cdots & 0\\ AB & B & 0 & \cdots & 0\\ A^2B & AB & B & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ A^{N-1}B & A^2B & AB & \cdots & B \end{bmatrix} \end{align*}
XUGH=[x(1∣k) x(2∣k) ⋯ x(N∣k)]T=[u(0∣k) u(1∣k) ⋯ u(N−1∣k)]T=[A A2 ⋯ AN]T=
BABA2B⋮AN−1B0BAB⋮A2B00B⋮AB⋯⋯⋯⋱⋯000⋮B
1.3 优化求解
将式 (3) 代入式 (2) 得:
J
=
(
G
x
(
0
∣
k
)
+
H
U
)
T
Q
(
G
x
(
0
∣
k
)
+
H
U
)
+
U
T
R
U
=
(
G
x
(
0
∣
k
)
)
T
Q
G
x
(
0
∣
k
)
+
2
(
H
U
)
T
Q
G
x
(
0
∣
k
)
+
(
H
U
)
T
Q
H
U
+
U
T
R
U
=
(
G
x
(
0
∣
k
)
)
T
Q
G
x
(
0
∣
k
)
+
2
U
T
H
T
Q
G
x
(
0
∣
k
)
+
U
T
(
H
T
Q
H
+
R
)
U
(4)
\begin{align*} J &= (\mathcal{G}x_{(0|k)} + \mathcal{H}U)^T \mathcal{Q} (\mathcal{G}x_{(0|k)} + \mathcal{H}U) + U^T \mathcal{R} U \\ &= (\mathcal{G}x_{(0|k)})^T \mathcal{Q} \mathcal{G}x_{(0|k)} + 2(\mathcal{H}U)^T \mathcal{Q} \mathcal{G}x_{(0|k)} + (\mathcal{H}U)^T \mathcal{Q} \mathcal{H}U + U^T\mathcal{R}U \\ &= (\mathcal{G}x_{(0|k)})^T \mathcal{Q} \mathcal{G}x_{(0|k)} + 2U^T\mathcal{H}^T \mathcal{Q} \mathcal{G}x_{(0|k)} + U^T (\mathcal{H}^T \mathcal{Q} \mathcal{H} + \mathcal{R})U \end{align*} \tag{4}
J=(Gx(0∣k)+HU)TQ(Gx(0∣k)+HU)+UTRU=(Gx(0∣k))TQGx(0∣k)+2(HU)TQGx(0∣k)+(HU)TQHU+UTRU=(Gx(0∣k))TQGx(0∣k)+2UTHTQGx(0∣k)+UT(HTQH+R)U(4)
另导数为 0 求极值,有:
∂
J
∂
U
=
2
H
T
Q
G
x
(
0
∣
k
)
+
2
(
H
T
Q
H
+
R
)
U
=
0
⇒
U
=
−
(
H
T
Q
H
+
R
)
−
1
H
T
Q
G
x
(
0
∣
k
)
\begin{align*} \frac{\partial J}{\partial U} &= 2\mathcal{H}^T \mathcal{Q} \mathcal{G}x_{(0|k)} + 2(\mathcal{H}^T \mathcal{Q} \mathcal{H} + \mathcal{R})U = 0 \\ \Rightarrow \hspace{0.4cm} U &= -(\mathcal{H}^T \mathcal{Q} \mathcal{H} + \mathcal{R})^{-1}\mathcal{H}^T \mathcal{Q} \mathcal{G}x_{(0|k)} \tag{5} \end{align*}
∂U∂J⇒U=2HTQGx(0∣k)+2(HTQH+R)U=0=−(HTQH+R)−1HTQGx(0∣k)(5)
当
∂
2
J
/
∂
U
2
=
2
(
H
T
Q
H
+
R
)
≥
0
\partial^2 J/{\partial U}^2 = 2(\mathcal{H}^T \mathcal{Q} \mathcal{H} + \mathcal{R}) \ge 0
∂2J/∂U2=2(HTQH+R)≥0 ,式 (5) 可求得最优控制序列,使代价函数最小化。
式 (6) 可写成
U
=
−
K
x
(
0
∣
k
)
U=-Kx_{(0|k)}
U=−Kx(0∣k) 的形式,与LQR所得到的最优线性反馈控制律
u
=
−
K
x
u = -Kx
u=−Kx 相似,区别是此处只优化
N
N
N 步,LQR是优化无穷步(
N
→
∞
N \to \infty
N→∞)。
但针对有约束的MPC,式 (4) 的优化求解更复杂,可使用二次规划求解函数进行求解。
二、稳定性分析
最优并不能保证系统可稳定,需要对系统进行稳定性分析。
2.1 离散系统的李雅普诺夫稳定性判据
间接法:自治系统
x
k
+
1
=
A
x
k
x_{k+1} = Ax_k
xk+1=Axk 中,
A
A
A 的所有特征根的模
∣
e
i
g
(
A
)
<
1
∣
|eig(A)<1|
∣eig(A)<1∣
直接法:存在李雅普诺夫函数
V
(
x
k
)
>
0
V(x_k)>0
V(xk)>0,满足
Δ
V
=
V
(
x
k
+
1
)
−
V
(
x
k
)
<
0
\Delta V = V(x_{k+1})-V(x_k)<0
ΔV=V(xk+1)−V(xk)<0
2.2 离散时不变线性系统的稳定性分析
针对系统 (1),根据式 (5),在状态
k
k
k 可求解出最优控制序列
U
=
K
x
(
0
∣
k
)
U=Kx_{(0|k)}
U=Kx(0∣k),其中
K
=
[
k
0
k
1
.
.
.
k
N
−
1
]
K = [k_0 ~ k_1 ~ ... ~ k_{N-1}]
K=[k0 k1 ... kN−1],即
u
(
i
∣
k
)
=
k
i
x
(
0
∣
k
)
u_{(i|k)}=k_ix_{(0|k)}
u(i∣k)=kix(0∣k)。
在 MPC 的思想中,我们只取控制序列的第一个控制输入应用于系统中。
在定常系统中,
K
=
−
(
H
T
Q
H
+
R
)
−
1
H
T
Q
G
K=-(\mathcal{H}^T \mathcal{Q} \mathcal{H} + \mathcal{R})^{-1}\mathcal{H}^T \mathcal{Q} \mathcal{G}
K=−(HTQH+R)−1HTQG是固定的,故实际控制输入固定为
u
(
k
)
=
u
(
0
∣
k
)
=
k
0
x
(
0
∣
k
)
u_{(k)} = u_{(0|k)} = k_0x_{(0|k)}
u(k)=u(0∣k)=k0x(0∣k).
根据间接法,当满足
∣
e
i
g
(
A
−
B
k
0
)
<
1
∣
|eig(A-Bk_0)<1|
∣eig(A−Bk0)<1∣ 时,系统稳定。
但当预测空间
N
N
N 不同时,求解到的
k
0
k_0
k0 也不同,不能保证系统都是稳定的。
2.3 如何保证系统稳定性
2.3.1 优化无限步的预测空间
若系统可控,在
k
k
k 时刻可以找到一组可行的最优解
u
(
i
∣
k
)
∗
,
i
=
0
,
1
,
2
,
⋯
u_{(i|k)}^*, ~i=0,1,2,\cdots
u(i∣k)∗, i=0,1,2,⋯.
此时,MPC优化得到的最优代价为:
J
k
∗
=
∑
i
=
1
N
(
(
x
(
i
∣
k
)
∗
)
T
Q
x
(
i
∣
k
)
∗
+
(
u
(
i
−
1
∣
k
)
∗
)
T
R
u
(
i
−
1
∣
k
)
∗
)
(6)
J_k^* = \sum_{i=1}^N \left( (x_{(i|k)}^*)^TQx_{(i|k)}^* + (u_{(i-1|k)}^*)^TRu_{(i-1|k)}^* \right) \tag{6}
Jk∗=i=1∑N((x(i∣k)∗)TQx(i∣k)∗+(u(i−1∣k)∗)TRu(i−1∣k)∗)(6)
当
k
+
1
k+1
k+1 时刻继续按这组控制序列执行,则系统完全按预测的状态进行,有:
J
k
+
1
=
∑
i
=
2
N
(
(
x
(
i
∣
k
)
∗
)
T
Q
x
(
i
∣
k
)
∗
+
(
u
(
i
−
1
∣
k
)
∗
)
T
R
u
(
i
−
1
∣
k
)
∗
)
J_{k+1} = \sum_{i=2}^N \left( (x_{(i|k)}^*)^TQx_{(i|k)}^* + (u_{(i-1|k)}^*)^TRu_{(i-1|k)}^* \right)
Jk+1=i=2∑N((x(i∣k)∗)TQx(i∣k)∗+(u(i−1∣k)∗)TRu(i−1∣k)∗)
注意,因为此时没有重新优化,代价不是状态
k
+
1
k+1
k+1 时的最优代价,最优代价
J
k
+
1
∗
≤
J
k
+
1
J_{k+1}^* \le J_{k+1}
Jk+1∗≤Jk+1,有:
J
k
+
1
∗
−
J
k
∗
≤
J
k
+
1
−
J
k
∗
=
−
(
x
(
1
∣
k
)
∗
)
T
Q
x
(
1
∣
k
)
∗
+
(
u
(
0
∣
k
)
∗
)
T
R
u
(
0
∣
k
)
∗
)
<
0
(7)
J_{k+1}^* - J_{k}^* \le J_{k+1} - J_{k}^* = -\left( x_{(1|k)}^*)^TQx_{(1|k)}^* + (u_{(0|k)}^*)^TRu_{(0|k)}^* \right) <0 \tag{7}
Jk+1∗−Jk∗≤Jk+1−Jk∗=−(x(1∣k)∗)TQx(1∣k)∗+(u(0∣k)∗)TRu(0∣k)∗)<0(7)
此处认为 J k + 1 J_{k+1} Jk+1 只有 N − 1 N-1 N−1 项,而 J k ∗ J_{k}^* Jk∗ 有 N N N 项,其中 N → ∞ N \to \infty N→∞。
故可选代价函数式 (2) 作为李雅普诺夫函数
V
(
x
k
)
V(x_k)
V(xk),显然有:
V
(
x
k
)
>
0
V(x_k)>0
V(xk)>0,
同时,根据式 (7) ,有
V
(
x
k
+
1
)
−
V
(
x
k
)
<
0
V(x_{k+1}) - V(x_{k})<0
V(xk+1)−V(xk)<0.
故系统渐近稳定。
2.3.2 求 P P P 以代替无限步的总代价
当优化无限步时,可使得系统渐近稳定,但这无法直接实现。
由于系统渐近稳定,总代价是收敛的,故可计算无限项的总代价来进行等效,有:
J
(
k
)
=
∑
i
=
1
∞
(
(
x
(
i
∣
k
)
)
T
Q
x
(
i
∣
k
)
+
(
u
(
i
−
1
∣
k
)
)
T
R
u
(
i
−
1
∣
k
)
)
=
x
(
0
∣
k
)
T
P
x
(
0
∣
k
)
−
x
(
0
∣
k
)
T
Q
x
(
0
∣
k
)
(8)
J(k) = \sum_{i=1}^\infty \left( (x_{(i|k)})^TQx_{(i|k)} + (u_{(i-1|k)})^TRu_{(i-1|k)} \right)=x_{(0|k)}^TPx_{(0|k)} - x_{(0|k)}^TQx_{(0|k)} \tag{8}
J(k)=i=1∑∞((x(i∣k))TQx(i∣k)+(u(i−1∣k))TRu(i−1∣k))=x(0∣k)TPx(0∣k)−x(0∣k)TQx(0∣k)(8)
当选取一个反馈增益
K
K
K 使得系统
x
k
+
1
=
(
A
−
B
K
)
x
k
x_{k+1} = (A-BK)x_{k}
xk+1=(A−BK)xk 渐近稳定时(即满足
∣
e
i
g
(
A
−
B
K
)
∣
<
1
|\mathrm{eig}(A-BK)|<1
∣eig(A−BK)∣<1),
P
P
P 可通过下式求解:
P
−
(
A
−
B
K
)
T
P
(
A
−
B
K
)
=
Q
+
K
T
R
K
(9)
P-(A-BK)^TP(A-BK) = Q + K^TRK \tag{9}
P−(A−BK)TP(A−BK)=Q+KTRK(9)
证明:
式 (9) 两端同乘 x ( k ) x_{(k)} x(k),有:
x k T [ P − ( A − B K ) T P ( A − B K ) ] x k = x k T [ Q + K T R K ] x k ⇒ x k T P x k − x k + 1 T P x k + 1 = x k T Q x k + u k T R u k \begin{align*} & x_{k}^T[P - (A-BK)^T P(A-BK)]x_{k} = x_{k}^T[Q + K^TRK]x_{k} \\ \Rightarrow& \hspace{1cm}x_{k}^TPx_{k} - x_{k+1}^T Px_{k+1} = x_{k}^TQx_{k} + u_{k}^TRu_{k} \end{align*} ⇒xkT[P−(A−BK)TP(A−BK)]xk=xkT[Q+KTRK]xkxkTPxk−xk+1TPxk+1=xkTQxk+ukTRuk
根据上式有:
x k T P x k − x k + 1 T P x k + 1 = x k T Q x k + u k T R u k x k + 1 T P x k + 1 − x k + 2 T P x k + 2 = x k + 1 T Q x k + 1 + u k + 1 T R u k + 1 ⋮ x k + N T P x k + N − x k + N + 1 T P x k + N + 1 = x k + N T Q x k + N + u k + N T R u k + N \begin{align*} x_{k}^TPx_{k} - x_{k+1}^T Px_{k+1} &= x_{k}^TQx_{k} + u_{k}^TRu_{k} \\ x_{k+1}^TPx_{k+1} - x_{k+2}^T Px_{k+2} &= x_{k+1}^TQx_{k+1} + u_{k+1}^TRu_{k+1} \\ \vdots & \\ x_{k+N}^TPx_{k+N} - x_{k+N+1}^T Px_{k+N+1} &= x_{k+N}^TQx_{k+N} + u_{k+N}^TRu_{k+N} \end{align*} xkTPxk−xk+1TPxk+1xk+1TPxk+1−xk+2TPxk+2⋮xk+NTPxk+N−xk+N+1TPxk+N+1=xkTQxk+ukTRuk=xk+1TQxk+1+uk+1TRuk+1=xk+NTQxk+N+uk+NTRuk+N
将以上各式相加得:
x k T P x k − x k + N + 1 T P x k + N + 1 = ∑ i = k k + N ( x i T Q x i + u i T R u i ) \begin{align*} x_{k}^TPx_{k} - x_{k+N+1}^T Px_{k+N+1} = \sum_{i=k}^{k+N} \left( x_{i}^TQx_{i} + u_{i}^TRu_{i} \right) \end{align*} xkTPxk−xk+N+1TPxk+N+1=i=k∑k+N(xiTQxi+uiTRui)
由于系统渐近稳定,当 N → ∞ N \to \infty N→∞ 时, x k + N + 1 T P x k + N + 1 → 0 x_{k+N+1}^T Px_{k+N+1} \to 0 xk+N+1TPxk+N+1→0, u k + N T R u k + N → 0 u_{k+N}^T Ru_{k+N} \to 0 uk+NTRuk+N→0.
有 ∑ i = k N ( x k T Q x k + u k T R u k ) = ∑ i = k + 1 N ( x k T Q x k + u k − 1 T R u k − 1 ) + x k T Q x k \sum_{i=k}^{N} \left( x_{k}^TQx_{k} + u_{k}^TRu_{k} \right) = \sum_{i=k+1}^{N} \left( x_{k}^TQx_{k} + u_{k-1}^TRu_{k-1} \right) + x_{k}^TQx_{k} ∑i=kN(xkTQxk+ukTRuk)=∑i=k+1N(xkTQxk+uk−1TRuk−1)+xkTQxk.
得:
x k T P x k − x k T Q x k = ∑ i = k + 1 ∞ ( x i T Q x i + u i − 1 T R u i − 1 ) x_{k}^TPx_{k} - x_{k}^TQx_{k} = \sum_{i=k+1}^{\infty} \left( x_{i}^TQx_{i} + u_{i-1}^TRu_{i-1} \right) xkTPxk−xkTQxk=i=k+1∑∞(xiTQxi+ui−1TRui−1)
即:
x ( 0 ∣ k ) T P x ( 0 ∣ k ) − x ( 0 ∣ k ) T Q x ( 0 ∣ k ) = ∑ i = 1 ∞ ( ( x ( i ∣ k ) ) T Q x ( i ∣ k ) + ( u ( i − 1 ∣ k ) ) T R u ( i − 1 ∣ k ) ) x_{(0|k)}^TPx_{(0|k)} - x_{(0|k)}^TQx_{(0|k)} = \sum_{i=1}^{\infty} \left( (x_{(i|k)})^TQx_{(i|k)} + (u_{(i-1|k)})^TRu_{(i-1|k)} \right) x(0∣k)TPx(0∣k)−x(0∣k)TQx(0∣k)=i=1∑∞((x(i∣k))TQx(i∣k)+(u(i−1∣k))TRu(i−1∣k))
2.3.3 将无限步优化转为有限步优化
J
(
k
)
=
∑
i
=
1
∞
(
(
x
(
i
∣
k
)
)
T
Q
x
(
i
∣
k
)
+
(
u
(
i
−
1
∣
k
)
)
T
R
u
(
i
−
1
∣
k
)
)
=
∑
i
=
1
N
(
(
x
(
i
∣
k
)
)
T
Q
x
(
i
∣
k
)
+
(
u
(
i
−
1
∣
k
)
)
T
R
u
(
i
−
1
∣
k
)
)
+
∑
i
=
N
+
1
∞
(
(
x
(
i
∣
k
)
)
T
Q
x
(
i
∣
k
)
+
(
u
(
i
−
1
∣
k
)
)
T
R
u
(
i
−
1
∣
k
)
)
=
∑
i
=
1
N
(
(
x
(
i
∣
k
)
)
T
Q
x
(
i
∣
k
)
+
(
u
(
i
−
1
∣
k
)
)
T
R
u
(
i
−
1
∣
k
)
)
+
x
(
N
∣
k
)
T
P
x
(
N
∣
k
)
−
x
(
N
∣
k
)
T
Q
x
(
N
∣
k
)
=
∑
i
=
1
N
−
1
(
(
x
(
i
∣
k
)
)
T
Q
x
(
i
∣
k
)
+
(
u
(
i
−
1
∣
k
)
)
T
R
u
(
i
−
1
∣
k
)
)
+
x
(
N
∣
k
)
T
P
x
(
N
∣
k
)
+
u
(
N
−
1
∣
k
)
T
R
u
(
N
−
1
∣
k
)
\begin{align*} J(k) &= \sum_{i=1}^\infty \left( (x_{(i|k)})^TQx_{(i|k)} + (u_{(i-1|k)})^TRu_{(i-1|k)} \right) \\ &= \sum_{i=1}^{N} \left( (x_{(i|k)})^TQx_{(i|k)} + (u_{(i-1|k)})^TRu_{(i-1|k)} \right) + \sum_{i=N+1}^\infty \left( (x_{(i|k)})^TQx_{(i|k)} + (u_{(i-1|k)})^TRu_{(i-1|k)} \right) \\ &= \sum_{i=1}^{N} \left( (x_{(i|k)})^TQx_{(i|k)} + (u_{(i-1|k)})^TRu_{(i-1|k)} \right) + x_{(N|k)}^TPx_{(N|k)} - x_{(N|k)}^TQx_{(N|k)} \\ &= \sum_{i=1}^{N-1} \left( (x_{(i|k)})^TQx_{(i|k)} + (u_{(i-1|k)})^TRu_{(i-1|k)} \right) + x_{(N|k)}^TPx_{(N|k)} + u_{(N-1|k)}^TRu_{(N-1|k)} \end{align*}
J(k)=i=1∑∞((x(i∣k))TQx(i∣k)+(u(i−1∣k))TRu(i−1∣k))=i=1∑N((x(i∣k))TQx(i∣k)+(u(i−1∣k))TRu(i−1∣k))+i=N+1∑∞((x(i∣k))TQx(i∣k)+(u(i−1∣k))TRu(i−1∣k))=i=1∑N((x(i∣k))TQx(i∣k)+(u(i−1∣k))TRu(i−1∣k))+x(N∣k)TPx(N∣k)−x(N∣k)TQx(N∣k)=i=1∑N−1((x(i∣k))TQx(i∣k)+(u(i−1∣k))TRu(i−1∣k))+x(N∣k)TPx(N∣k)+u(N−1∣k)TRu(N−1∣k)
上式可以写成矩阵形式:
J
=
X
T
Q
′
X
+
U
T
R
U
(10)
J=X^T\mathcal{Q}^\prime X +U^T\mathcal{R}U \tag{10}
J=XTQ′X+UTRU(10)
与式 (2) 唯一不同的是,此处
Q
′
=
d
i
a
g
(
Q
,
Q
,
⋯
,
P
)
\mathcal{Q}^\prime=\mathrm{diag}(Q, Q, \cdots, P)
Q′=diag(Q,Q,⋯,P).
由上式进行优化求解即可保证系统是渐近稳定的。
三、求解流程
列如系统:
x
k
+
1
=
A
x
k
+
B
u
k
x_{k+1} = Ax_{k}+Bu_{k}
xk+1=Axk+Buk
其中,
x
∈
R
2
x \in \mathbb{R}^2
x∈R2,
u
∈
R
u \in \mathbb{R}
u∈R,
A
=
[
1.1
2
0
0.95
]
A = \begin{bmatrix} 1.1 & 2 \\ 0 & 0.95 \end{bmatrix}
A=[1.1020.95],
B
=
[
0
0.079
]
B = \begin{bmatrix} 0 \\ 0.079 \end{bmatrix}
B=[00.079].
3.1 选取预测空间 N N N 和对角权重矩阵 Q Q Q、 R R R
令 N = 4 N=4 N=4 , Q = I 2 × 2 Q=I_{2\times2} Q=I2×2, R = 0.1 R = 0.1 R=0.1。
3.2 选取合适的反馈增益 K K K,注意需满足 ∣ e i g ( A − B K ) ∣ < 1 |\mathrm{eig}(A-BK)|<1 ∣eig(A−BK)∣<1
令 K = [ 1.4 5.76 ] K = [1.4~5.76] K=[1.4 5.76],使用MATLAB计算特征根:
A = [1.1 2;0 0.95];
B = [0; 0.079];
K = [1.4 5.76];
eigSys = eig(A - B*K);
disp(abs(eigSys))
得 ∣ e i g ( A − B K ) ∣ 1 = ∣ e i g ( A − B K ) ∣ 2 = 0.8750 < 1 |\mathrm{eig}(A-BK)|_1 = |\mathrm{eig}(A-BK)|_2 = 0.8750 <1 ∣eig(A−BK)∣1=∣eig(A−BK)∣2=0.8750<1
3.3 通过方程 (9)求解 P
使用MATLAB求解:
Q = eye(2);
R = 0.1;
syms P [2 2] % P 为2*2的矩阵
equ = P - (A - B*K)' * P * (A - B*K) == Q + K'*R*K;
Psol = solve(equ, P);
Psol = [Psol.P1_1, Psol.P2_1; Psol.P2_1, Psol.P2_2];
Psol = double(Psol);
disp(Psol)
3.4 状态预测
计算式 (3) 中的 G \mathcal{G} G, H \mathcal{H} H:
N = 4;
[G, H] = getGH(N, A, B);
function [G, H] = getGH(N, A, B) % N>1
tmp = A;
G = tmp;
for i=2:N
tmp = A*tmp;
G = [G; tmp];
end
r = size(B, 1);
c = size(B, 2);
H = zeros(r * N, c * N);
tmp = B;
for j = N:-1:1
H( (j-1)*r+1:j*r, (j-1)*c+1:j*c ) = tmp;
end
for i = 2:N
tmp = A*tmp;
for j = i:N
H( (j-1)*r+1:j*r, (j-i)*c+1:(j-i+1)*c ) = tmp;
end
end
end
3.5 根据代价函数构建二次规划来求解 U U U
根据式 (4) 有:
J
=
(
G
x
(
0
∣
k
)
)
T
Q
′
G
x
(
0
∣
k
)
+
2
U
T
H
T
Q
′
G
x
(
0
∣
k
)
+
U
T
(
H
T
Q
′
H
+
R
)
U
=
(
G
x
(
0
∣
k
)
)
T
Q
′
G
x
(
0
∣
k
)
+
2
x
(
0
∣
k
)
T
G
T
Q
′
H
U
+
U
T
(
H
T
Q
′
H
+
R
)
U
\begin{align*} J &= (\mathcal{G}x_{(0|k)})^T \mathcal{Q}^\prime \mathcal{G}x_{(0|k)} + 2U^T\mathcal{H}^T \mathcal{Q}^\prime \mathcal{G}x_{(0|k)} + U^T (\mathcal{H}^T \mathcal{Q}^\prime \mathcal{H} + \mathcal{R})U \\ &= (\mathcal{G}x_{(0|k)})^T \mathcal{Q}^\prime \mathcal{G}x_{(0|k)} + 2x_{(0|k)}^T\mathcal{G}^T \mathcal{Q}^\prime \mathcal{H} U + U^T (\mathcal{H}^T \mathcal{Q}^\prime \mathcal{H} + \mathcal{R})U \end{align*}
J=(Gx(0∣k))TQ′Gx(0∣k)+2UTHTQ′Gx(0∣k)+UT(HTQ′H+R)U=(Gx(0∣k))TQ′Gx(0∣k)+2x(0∣k)TGTQ′HU+UT(HTQ′H+R)U
其中,忽略与输入无关的项(第一项),优化问题可写为:
a
r
g
min
U
1
2
U
T
H
U
+
f
T
U
(11)
\mathrm{arg}~ \min_{U} \frac{1}{2}U^THU + f^TU \tag{11}
arg Umin21UTHU+fTU(11)
其中,
H
=
2
(
H
T
Q
′
H
+
R
)
H = 2(\mathcal{H}^T \mathcal{Q}^\prime \mathcal{H} + \mathcal{R})
H=2(HTQ′H+R),
f
T
=
2
x
(
0
∣
k
)
T
G
T
Q
′
H
f^T = 2x_{(0|k)}^T\mathcal{G}^T \mathcal{Q}^\prime \mathcal{H}
fT=2x(0∣k)TGTQ′H.
式 (11) 可通过二次规划求解。
x0 = [1;1]; % 设初始状态为[1;1]
[Qp, Rp] = getQR(N, Q, Psol, R);
Hp = 2 * (H' * Qp * H + Rp);
fp = 2 * x0' * G' * Qp * H;
U = quadprog(Hp, fp);
%%
function [Qp, Rp] = getQR(N, Q, P, R)
Qp = eye(N);
Qp(end) = 0;
Qp = kron(Qp, Q) + kron(eye(N)-Qp, P);
Rp = eye(N);
Rp = kron(Rp, R);
end
3.6 滚动优化
每个周期重复进行优化,只取控制序列
U
U
U 中的第一个动作
u
(
0
∣
k
)
u_{(0|k)}
u(0∣k) 作用于系统.
初始状态
x
0
=
[
1
1
]
T
x_{0} = [1~~~1]^T
x0=[1 1]T,使用MATLAB演示系统动态:
xCur = [2;2]; % 设初始状态为[1;1]
xLog = xCur;
uLog = [];
options = optimoptions('quadprog','MaxIterations', 100, 'Display','none');
step = 0:50;
for i = step
Hp = 2 * (H' * Qp * H + Rp);
fp = 2 * xCur' * G' * Qp * H;
Hp = 0.5 * (Hp + Hp');
U = quadprog(Hp, fp, [], [], [], [], [], [], zeros(4,1), options);
u = U(1);
xCur = A*xCur + B*u;
xLog = [xLog, xCur];
uLog = [uLog, u];
end
figure(1)
subplot(3,1,1)
plot(step, xLog(1,1:end-1))
title('x1')
grid on
subplot(3,1,2)
plot(step, xLog(2,1:end-1))
title('x2')
grid on
subplot(3,1,3)
plot(step, uLog)
title('u')
grid on
得:
可见系统可以在10步的迭代中稳定到
0
\mathbf{0}
0.