Introduction
The goal of this paper is to achieve robust constrained, high-performance path tracking in spite of unknown disturbances.
The approach: simple process model, high model uncertainty $\rightarrow^{learn}$ accurate, low-uncertainty model.
The paper uses VO (visual odometry) for localization.
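The paper differs from traditional constrained NMPC in two respects:
- In traditional methods the process model is designed in advance and never changes. Here, a learned disturbance model augments the process model, so that the process model can predict both the mean and the uncertainty of disturbance effects.
- Traditional constrained NMPC does not account for model uncertainty. This paper applies robust constraints in real time, taking the learned uncertainty into account. The method provides robust constraint satisfaction when uncertainty is high and increased performance as uncertainty is reduced through learning.
The main contributions of the paper are: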
- use learned models
- account for model uncertainty
The figure above is the overall control block diagram of the paper. RC-LB-NMPC consists of two main parts:
- the robust constrained, path-tracking NMPC algorithm based on an a priori process model
- the GP-based disturbance model
Mathematical Formulation
First, a brief overview of NMPC:
At a given sample time, NMPC finds a sequence of control inputs that optimizes the plant behavior over a prediction horizon based on current state. The first input in the optimal sequence is then applied to the system. The entire process is repeated at the next sample time for the new system state.
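The receding-horizon loop described above can be sketched as follows. This is a toy illustration, not the paper's solver: the 1-D plant `step`, the brute-force search in `best_sequence`, the candidate input set, and the cost weights are all assumptions made for the example.

```python
from itertools import product


def step(x, u):
    """Hypothetical 1-D plant: the state moves by the input each step."""
    return x + u


def best_sequence(x0, target, horizon, candidates=(-1.0, 0.0, 1.0)):
    """Search every input sequence of length `horizon` and keep the cheapest.

    A real NMPC would use a nonlinear optimizer; brute force keeps the sketch short.
    """
    best_cost, best_u = float("inf"), None
    for u_seq in product(candidates, repeat=horizon):
        x, cost = x0, 0.0
        for u in u_seq:
            x = step(x, u)
            cost += (target - x) ** 2 + 0.1 * u ** 2  # tracking error + input effort
        if cost < best_cost:
            best_cost, best_u = cost, u_seq
    return best_u


def run_mpc(x0, target, horizon=3, n_steps=6):
    """Receding-horizon loop: optimize, apply the first input, repeat."""
    x, history = x0, [x0]
    for _ in range(n_steps):
        u_seq = best_sequence(x, target, horizon)
        x = step(x, u_seq[0])  # only the first input is applied
        history.append(x)
    return history


history = run_mpc(0.0, 3.0)
```

The rest of each optimized sequence is discarded: the problem is solved again from the newly measured state at every sample time.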
Robust Constrained NMPC
- First, the state transition model
The true system is approximated by the sum of an a priori model and an experience-based, learned model:
$$x_{k+1} = f(x_{k}, u_{k}) + g(a_{k})$$
where:
$f(\cdot)$: a known nonlinear process model representing our knowledge of $f_{true}(\cdot)$
$g(\cdot)$: an (initially unknown) disturbance model representing discrepancies between the a priori model and the actual system behavior. $g(\cdot)$ is modeled as a GP. For simplicity, $a_{k} = (\bar{x}_{k}, u_{k})$
- Next, the cost function
Define the cost function to be minimized over the next $K$ time-steps as:
$$J(\bar{x}, u) = (x_{d} - \bar{x})^{T}Q(x_{d} - \bar{x}) + (u_{d} - u)^{T}R(u_{d} - u)$$
where:
$Q$ is a positive semi-definite matrix and $R$ is a positive definite matrix
$x_{d} = (x_{d, k+1}, ..., x_{d, k+K})$: a sequence of desired states
$x = (x_{k+1}, ..., x_{k+K})$: a sequence of uncertain predicted states; $\bar{x}$ is the corresponding sequence of mean values
$u_{d} = (u_{d, k}, ..., u_{d, k+K-1})$: a sequence of desired inputs
$u = (u_{k}, ..., u_{k+K-1})$: a sequence of inputs
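As a concrete check of the quadratic cost $J(\bar{x}, u)$, here is a minimal NumPy sketch. It assumes each sequence is stacked into one flat vector over the horizon and that $Q$ and $R$ are sized to match; the function name `tracking_cost` and the numbers in the example are hypothetical.

```python
import numpy as np


def tracking_cost(x_bar, u, x_d, u_d, Q, R):
    """J = (x_d - x_bar)^T Q (x_d - x_bar) + (u_d - u)^T R (u_d - u).

    Sequences are flattened into stacked vectors over the horizon.
    """
    ex = np.asarray(x_d, dtype=float).ravel() - np.asarray(x_bar, dtype=float).ravel()
    eu = np.asarray(u_d, dtype=float).ravel() - np.asarray(u, dtype=float).ravel()
    return float(ex @ Q @ ex + eu @ R @ eu)


# Hypothetical example: K = 2 steps, scalar state and scalar input.
K = 2
Q = 2.0 * np.eye(K)   # positive semi-definite state weight
R = 0.5 * np.eye(K)   # positive definite input weight
J = tracking_cost(x_bar=[0.5, 1.0], u=[1.0, 0.0],
                  x_d=[1.0, 1.0], u_d=[0.0, 0.0], Q=Q, R=R)
```

In this example the state error is $(0.5, 0)$ and the input error is $(-1, 0)$, so the two terms contribute $2 \cdot 0.25$ and $0.5 \cdot 1$ respectively.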
- Next, define the robust constraints
They are defined from two perspectives: state and input.
Building on the above, we can formulate the following constrained optimization problem:
$$\{x_{opt}, u_{opt}\} = \underset{x,u}{\arg\min}\; J(\bar{x}, u)$$
subject to
$$\bar{x}_{k+i+1} = f(\bar{x}_{k+i}, u_{k+i}) + g(a_{k+i}), \quad i = 0, ..., K-1$$
$$c_{i}(\bar{x}, u) > 0$$
The overall flow of the algorithm:
Once the algorithm has converged, we apply the first element of the resulting optimal control input sequence for one time-step, and start all over at the next time-step.
Predicting uncertain trajectories
The states are all normally distributed, so the Sigma-Point Transform is used to iteratively predict state sequences.
Define the state $z_{i} = (\bar{x}_{k+i}, \mu(a_{k+i})) \in R^{2n}$ representing the mean state and disturbance at time $k+i$, with uncertainty $P_{i} = diag(\Sigma_{k+i}, \Sigma_{gp}(a_{k+i}))$
Repeating this process $K$ times generates the complete $x$ sequence. In this way, the $3\sigma$ confidence region accounts for uncertainty arising from both localization and modeling.
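One prediction step of this scheme might look like the sketch below. It is a generic sigma-point (unscented) transform under stated assumptions, not the paper's exact formulation: the weighting parameter `kappa`, the function names, and the linear toy model `f` are hypothetical, and the disturbance mean and covariance are passed in as fixed numbers rather than queried from a GP.

```python
import numpy as np


def sigma_points(mean, cov, kappa=1.0):
    """Return 2n+1 sigma points and weights for a Gaussian (mean, cov)."""
    n = mean.size
    L = np.linalg.cholesky((n + kappa) * cov)  # needs a positive definite cov
    pts = [mean]
    pts += [mean + L[:, j] for j in range(n)]
    pts += [mean - L[:, j] for j in range(n)]
    w = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    return np.array(pts), w


def predict_step(x_mean, x_cov, d_mean, d_cov, u, f):
    """Propagate z = (x, disturbance) through x_next = f(x, u) + disturbance."""
    n = x_mean.size
    z = np.concatenate([x_mean, d_mean])
    zero = np.zeros((n, n))
    P = np.block([[x_cov, zero], [zero, d_cov]])  # P = diag(Sigma, Sigma_gp)
    pts, w = sigma_points(z, P)
    nxt = np.array([f(p[:n], u) + p[n:] for p in pts])
    mean = w @ nxt
    diff = nxt - mean
    cov = (w[:, None] * diff).T @ diff
    return mean, cov


# Hypothetical scalar example with a linear a priori model.
f = lambda x, u: 0.9 * x + u
mean, cov = predict_step(np.array([1.0]), np.array([[0.04]]),
                         np.array([0.1]), np.array([[0.01]]), 0.5, f)
three_sigma = 3.0 * np.sqrt(np.diag(cov))  # half-width of the 3-sigma region
```

For a linear model the transform is exact, which makes the example easy to check by hand: the predicted mean is $0.9 \cdot 1.0 + 0.5 + 0.1$ and the variance is $0.9^2 \cdot 0.04 + 0.01$. Calling `predict_step` repeatedly, feeding each output back in as the next input, would produce the sequence over the horizon.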
Gaussian Process Disturbance Model
The learned model depends on disturbance observations collected during previous trials.
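To make the idea concrete, the sketch below runs standard GP regression on a handful of disturbance observations and returns a predictive mean and variance at new inputs $a = (x, u)$. Everything specific here is an assumption for illustration: the squared-exponential kernel, the hyperparameter values, the scalar disturbance, and the training data are hypothetical and are not taken from the paper.

```python
import numpy as np


def sq_exp_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return signal_var * np.exp(-0.5 * d2 / length_scale ** 2)


def gp_predict(A_train, y_train, A_query, noise_var=1e-4):
    """Standard GP posterior mean and variance at the query inputs."""
    K = sq_exp_kernel(A_train, A_train) + noise_var * np.eye(len(A_train))
    K_s = sq_exp_kernel(A_query, A_train)
    K_ss = sq_exp_kernel(A_query, A_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = np.diag(K_ss) - (v ** 2).sum(axis=0)
    return mean, var


# Hypothetical observations from earlier trials: inputs a = (x, u) and the
# disturbance seen at each of them.
A_train = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]])
y_train = np.array([0.1, 0.3, 0.2])

# Query once at a previously visited input and once far from all data.
A_query = np.array([[1.0, 0.5], [10.0, 10.0]])
mean, var = gp_predict(A_train, y_train, A_query)
```

Near an input seen in earlier trials the prediction reproduces the observed disturbance with low variance; far from the data the mean falls back to zero and the variance returns to the prior level. This is the behavior behind the "high uncertainty first, lower uncertainty after learning" idea from the introduction.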