Notes on Conjugate Gradient Method

本文深入探讨共轭梯度法在解决线性方程组与极小化问题的应用,通过最速下降法与共轭梯度法的对比,解释了共轭梯度法的关键思想与实现步骤。从简单情形出发,逐步扩展到正定对称矩阵的一般情形,详细阐述了如何通过坐标变换和子空间结构理解算法的工作原理。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

本文为作者原创,原载于我们的主页,转载于此。

这是一个关于共轭梯度法的笔记。请大家注意的是,这是个笔记,并不是一个教程,因此少不了跳跃和欠解释的地方。对CG方法了解不多的同学请移步这里

线性方程组和极小化问题

一个关于对称矩阵A的线性方程组Ax=b等价于求解如下极小值问题:

  \[ f(x) = \frac{1}{2} \transp{x} A x - \transp{b} x \]

这很容易说明,我们微分目标函数得

(1) \begin{eqnarray*} \td f &=& \frac{1}{2}(\td\transp{x}Ax+\transp{x}A\td x)-\transp{b} \td x\\ &=& \transp{(Ax - b)} \td x \end{eqnarray*}

所以\td f=0意味着Ax=b.

x^*为问题的解,e为偏离极小值点的位移,即

(2) \begin{eqnarray*} Ax^*&=&b\\ e&=&\Delta x=x-x^* \end{eqnarray*}

我们定义残差或负梯度r=b-Ax,容易看出,r正比于偏离梯度零点x^*的位移e:

  \[ r=-Ae \]

最速下降法和共轭梯度法简述

最速下降法:搜索方向为本轮迭代初始点的梯度方向,搜索到梯度与搜索方向正交的位置开始下一轮迭代。

共轭梯度法:搜索方向为本轮迭代初始点的梯度方向(残差方向)用A-内积做正交化,扣除掉之前所有搜索方向的分量给出,搜索到梯度与搜索方向正交的位置开始下一轮迭代。

共轭梯度法的理解

首先考虑一个简单情形,即A=I时。此时f退化,且x的不同分量解耦,f的极小值问题分解为各分量上的极小值问题。我们只需要在各分量方向上找到极小值,问题就得解了。具体的说,我们从任意一个初始点出发,在方向d_0上寻找极小值,然后从这个极小值出发继续在d_1寻找d_1方向的极小,以此类推最后到达全局极小值。

A为正定对称矩阵的一般情形,其实只是上述情形中的x表述在一个新坐标系下的结果(下文详细解释)。为了求解这种情形,我们考察上述方法的求解轨迹在新坐标系里的对应。

首先,A=I时的各个正交搜索方向在新坐标系里对应着一组A-正交的方向,因此我们需要设法找出这样一组正交方向,然后各自求解这些方向上的极小值。

求解特定方向上极小值的方法不变,极小点就是这个方向上梯度正交于搜索方向时的点(原因很简单,这说明搜索方向上梯度分量为零)。

最后一点,如何找到这样一组A-正交方向?

一般方法是找一组现成的完备基,进行A-正交化。问题是计算量比较大。

记迭代到第i步时,所搜索过的这样一组A-正交的搜索方向单位矢量为\{d_0,d_1,...,d_{i-1}\},它们撑起的子空间

(3) \begin{eqnarray*} D_i=\mathrm{span}\{d_0,d_1,...,d_{i-1}\} \end{eqnarray*}

即第i步前已经搜索过的子空间。

共轭梯度法的一个关键之处在于它可以从每次迭代的r_i中快速构造出新的搜索方向d_i,而不需要对r_i进行完整的正交化手续(即不需依次扣除r_id_0,d_1d_{i-1}方向的分量)。原因在于r_i自然的和D_{i-1}正交,因此构造d_i只需要从r_{i}中扣除掉d_{i-1}分量即可。

因此,共轭梯度算法有两处关键,一处为通过利用A-正交关系来把搜索问题转换到一个特殊坐标系,使得各搜索方向上的极小值问题解耦;一是利用r_iD_{i-1}是A-正交的这一规律来充分简化A-正交搜索方向的构造。

接下来我们详细论证这个算法。

第一个关键点,坐标变换和子问题解耦

(4) \begin{eqnarray*} A&=&\transp{T}T\\ y&=&T x\\ b^{\prime}&=&T^{-T} b\\ \end{eqnarray*}

其中T^{-T}表示矩阵T求逆后做转置,很拗口,不过这个不重要。

于是我们有

(5) \begin{eqnarray*} f(x) &=& \frac{1}{2} \transp{y} y - b^{\prime} y\\ &=& \frac{1}{2}\sum_{n}y_n^2 - b^{\prime}_n y_n \equiv \frac{1}{2}\sum_{n} P_n(y_n) \end{eqnarray*}

可见f已经被分解为一系列独立的子问题P_n(y_n)T正是将问题解耦所用的坐标变换。

在新的\{y_n\}坐标系中,整个极小值问题可以通过在一组正交方向上依次寻找极小点来解决。但求解T需要对A做分解,并非一个计算上经济的解决方法。这一个变换的真正意义在于,我们可以通过它来找到求解轨迹在\{x_n\}坐标系中的对应。

T是线性变换,因此\{y_n\}中的一组直线仍被变换为\{x_n\}中的一组直线。\{y_n\}坐标系中,我们依次沿着一组正交方向求解极小值,因此只需要找到这组方向在\{x_n\}坐标系中的对应(一组未必正交的方向),就可以等价的直接在这组方向之上寻找极小值,而不需要经过坐标变换。

考虑这组正交方向单位矢\{y_n\}

(6) \begin{eqnarray*} &&\transp{y_i} y_j = \delta_{ij}\\ &\Leftrightarrow& \transp{(T x_i)} T x_j=\delta_{ij}\\ &\Leftrightarrow& \transp{x_i} \transp{T}T x_j = \delta_{ij}\\ &\Leftrightarrow& \transp{x_i} A x_j = \delta_{ij} \end{eqnarray*}

可见,\{y_n\}这一组正交的方向在变换到\{x_n\}坐标系中后,虽然不是\{x_n\}中通常意义的正交矢量组,却可以在A-内积\langle \bullet,\bullet \rangle_A意义下成为一组A-正交矢量。这里的A-内积,指的是以对称矩阵A定义的内积运算

(7) \begin{eqnarray*} \langle a,b \rangle_A \equiv \transp{a} A b \end{eqnarray*}

需要注意的是,上述结论是可逆的,如果\langle x_i,x_j \rangle_A = 0,同样也可以得出,x_i,x_j对应着\{y_n\}坐标系中两个正交的方向。因此,可以直接在\{x_n\}坐标系下,寻找一组A-正交的方向,然后在这一组方向上依次极小化目标函数,就可以找到整个问题的极小值。

第二个关键点,D_i子空间结构和正交关系

算法的更新规则为

(8) \begin{eqnarray*} e_{i+1}&=&e_i + \alpha_i d_i\\ r_{i+1}&=&-A(e_i+\alpha_i d_i)=r_i-\alpha_i A d_i\\ d_{i}&=&r_i + \sum_{j<i} \beta_{ij} d_{j}\\ \beta_{ij}&=& - \frac{\transp{r_i}d_j}{\transp{d_j}Ad_j}\\ &=& - \frac{\langle r_i,d_j \rangle }{\langle d_j,d_j \rangle _A} \end{eqnarray*}

其中\alpha_i为第i步搜索方向上的位移长度,它的大小可以通过r_{i+1}应该和d_i正交这一要求定出

(9) \begin{eqnarray*} &&\transp{r_{i+1}}d_i=0\\ &\Rightarrow&\transp{(e_i+\alpha_i d_i)} A d_i =0\\ &\Rightarrow&\alpha_i = - \frac{\langle d_i,e_i \rangle_A }{\langle d_i,d_i \rangle_A}= \frac{\langle d_i,r_i \rangle}{\langle d_i,d_i \rangle_A} \end{eqnarray*}

算法将参数空间划分成了一个层级结构,下面我们来看看这个结构的性质。

根据上述更新规则,可以发现D_i可以由另外几个矢量组撑起。首先,由d_i的更新规则可以发现,d_ir_id_{j<i}线性组合而成,另一方面,我们有初始条件d_0=r_0,归纳而知,d_i可以由\{r_{j\leq i}\}线性组合出来。于是我们有

(10) \begin{eqnarray*} D_i=\mathrm{span}\{r_0,r_1,...,r_{i-1}\} \end{eqnarray*}

接着我们考虑r_i的更新规则。可以看到,D_{i+1}=\mathrm{span}\{D_{i},Ad_i\}.于是我们可以从D_1=\{d_0\}D_1=\{r_0\}构造出D_i

(11) \begin{eqnarray*} D_i&=&\mathrm{span}\{d_0,Ad_0,A^2d_0,...,A^{i-1}d_0\} = \mathrm{span}\{d_0,A D_{i-1}\}\\ &=&\mathrm{span}\{r_0,Ar_0,A^2r_0,...,A^{i-1}r_0\} = \mathrm{span}\{r_0,A D_{i-1}\} \end{eqnarray*}

接下来我们再考虑各子空间和向量之间的正交关系。首先,e_iD_i的补空间里。这点是明显的,因为每一个优化步都会优化掉e_j在相应d_j方向上的分量,因而剩余的e_i就只位于D_i的补空间里。于是我们有

(12) \begin{eqnarray*} e_i = \sum_{j \geq i} \delta_j d_j \end{eqnarray*}

这样ed之间的A-正交关系就很明显了。由于\langle d_i,d_j \rangle_A = \delta_{ij},而e_i只包含d_j>i分量,因而

(13) \begin{eqnarray*} \langle e_i,d_{j<i} \rangle_A =0 \end{eqnarray*}

e_i和子空间D_i也是A-正交的。这意味着r_i正交于D_i

(14) \begin{eqnarray*} &&\transp{e_i} A d_{j<i}=0\\ &\Rightarrow&\transp{(A e_i)} d_{j<i}=0\\ &\Rightarrow&\langle r_i,d_{j<i}\rangle =0 \end{eqnarray*}

到这里就可以解释为什么我们有\langle r_i,d_{j<i-1} \rangle_A =0了:

(15) \begin{eqnarray*} &&\langle r_i,D_i \rangle =0\\ &\Rightarrow&\langle r_i,\mathrm{span}\{d_0,A D_{i-1}\} \rangle =0\\ &\Rightarrow&\langle r_i,A D_{i-1} \rangle =0\\ &\Rightarrow&\langle r_i,D_{i-1} \rangle_A =0 \end{eqnarray*}

至此我们完成了第二个关键点的论证,且更清楚的了解了参数空间,尤其是D_i子空间的结构。

/* This file implements FGMRES (a Generalized Minimal Residual) method. Reference: Saad, 1993. Preconditioning: If the preconditioner is constant then this fgmres code is equivalent to RIGHT-PRECONDITIONED GMRES. FGMRES is a modification of gmres that allows the preconditioner to change at each iteration. Restarts: Restarts are basically solves with x0 not equal to zero. */ #include <../src/ksp/ksp/impls/gmres/fgmres/fgmresimpl.h> /*I "petscksp.h" I*/ #define FGMRES_DELTA_DIRECTIONS 10 #define FGMRES_DEFAULT_MAXK 30 static PetscErrorCode KSPFGMRESGetNewVectors(KSP, PetscInt); static PetscErrorCode KSPFGMRESUpdateHessenberg(KSP, PetscInt, PetscBool, PetscReal *); static PetscErrorCode KSPFGMRESBuildSoln(PetscScalar *, Vec, Vec, KSP, PetscInt); static PetscErrorCode KSPSetUp_FGMRES(KSP ksp) { PetscInt max_k, k; KSP_FGMRES *fgmres = (KSP_FGMRES *)ksp->data; PetscFunctionBegin; max_k = fgmres->max_k; PetscCall(KSPSetUp_GMRES(ksp)); PetscCall(PetscMalloc1(max_k + 2, &fgmres->prevecs)); PetscCall(PetscMalloc1(max_k + 2, &fgmres->prevecs_user_work)); /* fgmres->vv_allocated includes extra work vectors, which are not used in the additional block of vectors used to store the preconditioned directions, hence the -VEC_OFFSET term for this first allocation of vectors holding preconditioned directions */ PetscCall(KSPCreateVecs(ksp, fgmres->vv_allocated - VEC_OFFSET, &fgmres->prevecs_user_work[0], 0, NULL)); for (k = 0; k < fgmres->vv_allocated - VEC_OFFSET; k++) fgmres->prevecs[k] = fgmres->prevecs_user_work[0][k]; PetscFunctionReturn(PETSC_SUCCESS); } static PetscErrorCode KSPFGMRESResidual(KSP ksp) { KSP_FGMRES *fgmres = (KSP_FGMRES *)ksp->data; Mat Amat, Pmat; PetscFunctionBegin; PetscCall(PCGetOperators(ksp->pc, &Amat, &Pmat)); /* put A*x into VEC_TEMP */ PetscCall(KSP_MatMult(ksp, Amat, ksp->vec_sol, VEC_TEMP)); /* now put residual (-A*x + f) into vec_vv(0) */ PetscCall(VecWAXPY(VEC_VV(0), -1.0, VEC_TEMP, ksp->vec_rhs)); PetscFunctionReturn(PETSC_SUCCESS); } static PetscErrorCode KSPFGMRESCycle(PetscInt *itcount, KSP ksp) { KSP_FGMRES *fgmres = (KSP_FGMRES *)ksp->data; PetscReal res_norm; PetscReal hapbnd, tt; PetscBool hapend = PETSC_FALSE; /* indicates happy breakdown ending */ PetscInt loc_it; /* local count of # of dir. in Krylov space */ PetscInt max_k = fgmres->max_k; /* max # of directions Krylov space */ Mat Amat, Pmat; PetscFunctionBegin; /* Number of pseudo iterations since last restart is the number of prestart directions */ loc_it = 0; /* note: (fgmres->it) is always set one less than (loc_it) It is used in KSPBUILDSolution_FGMRES, where it is passed to KSPFGMRESBuildSoln. Note that when KSPFGMRESBuildSoln is called from this function, (loc_it -1) is passed, so the two are equivalent */ fgmres->it = (loc_it - 1); /* initial residual is in VEC_VV(0) - compute its norm*/ PetscCall(VecNorm(VEC_VV(0), NORM_2, &res_norm)); KSPCheckNorm(ksp, res_norm); /* The first entry in the right-hand side of the Hessenberg system is just the initial residual norm */ *RS(0) = res_norm; ksp->rnorm = res_norm; PetscCall(KSPLogResidualHistory(ksp, res_norm)); PetscCall(KSPMonitor(ksp, ksp->its, res_norm)); /* check for the convergence - maybe the current guess is good enough */ PetscCall((*ksp->converged)(ksp, ksp->its, res_norm, &ksp->reason, ksp->cnvP)); if (ksp->reason) { if (itcount) *itcount = 0; PetscFunctionReturn(PETSC_SUCCESS); } /* scale VEC_VV (the initial residual) */ PetscCall(VecScale(VEC_VV(0), 1.0 / res_norm)); /* MAIN ITERATION LOOP BEGINNING*/ /* keep iterating until we have converged OR generated the max number of directions OR reached the max number of iterations for the method */ while (!ksp->reason && loc_it < max_k && ksp->its < ksp->max_it) { if (loc_it) { PetscCall(KSPLogResidualHistory(ksp, res_norm)); PetscCall(KSPMonitor(ksp, ksp->its, res_norm)); } fgmres->it = (loc_it - 1); /* see if more space is needed for work vectors */ if (fgmres->vv_allocated <= loc_it + VEC_OFFSET + 1) { PetscCall(KSPFGMRESGetNewVectors(ksp, loc_it + 1)); /* (loc_it+1) is passed in as number of the first vector that should be allocated */ } /* CHANGE THE PRECONDITIONER? */ /* ModifyPC is the callback function that can be used to change the PC or its attributes before its applied */ PetscCall((*fgmres->modifypc)(ksp, ksp->its, loc_it, res_norm, fgmres->modifyctx)); /* apply PRECONDITIONER to direction vector and store with preconditioned vectors in prevec */ PetscCall(KSP_PCApply(ksp, VEC_VV(loc_it), PREVEC(loc_it))); PetscCall(PCGetOperators(ksp->pc, &Amat, &Pmat)); /* Multiply preconditioned vector by operator - put in VEC_VV(loc_it+1) */ PetscCall(KSP_MatMult(ksp, Amat, PREVEC(loc_it), VEC_VV(1 + loc_it))); /* update Hessenberg matrix and do Gram-Schmidt - new direction is in VEC_VV(1+loc_it)*/ PetscCall((*fgmres->orthog)(ksp, loc_it)); /* new entry in hessenburg is the 2-norm of our new direction */ PetscCall(VecNorm(VEC_VV(loc_it + 1), NORM_2, &tt)); KSPCheckNorm(ksp, tt); *HH(loc_it + 1, loc_it) = tt; *HES(loc_it + 1, loc_it) = tt; /* Happy Breakdown Check */ hapbnd = PetscAbsScalar((tt) / *RS(loc_it)); /* RS(loc_it) contains the res_norm from the last iteration */ hapbnd = PetscMin(fgmres->haptol, hapbnd); if (tt > hapbnd) { /* scale new direction by its norm */ PetscCall(VecScale(VEC_VV(loc_it + 1), 1.0 / tt)); } else { /* This happens when the solution is exactly reached. */ /* So there is no new direction... */ PetscCall(VecSet(VEC_TEMP, 0.0)); /* set VEC_TEMP to 0 */ hapend = PETSC_TRUE; } /* note that for FGMRES we could get HES(loc_it+1, loc_it) = 0 and the current solution would not be exact if HES was singular. Note that HH non-singular implies that HES is no singular, and HES is guaranteed to be nonsingular when PREVECS are linearly independent and A is nonsingular (in GMRES, the nonsingularity of A implies the nonsingularity of HES). So we should really add a check to verify that HES is nonsingular.*/ /* Now apply rotations to the new column of Hessenberg (and the right-hand side of the system), calculate new rotation, and get new residual norm at the same time*/ PetscCall(KSPFGMRESUpdateHessenberg(ksp, loc_it, hapend, &res_norm)); if (ksp->reason) break; loc_it++; fgmres->it = (loc_it - 1); /* Add this here in case it has converged */ PetscCall(PetscObjectSAWsTakeAccess((PetscObject)ksp)); ksp->its++; ksp->rnorm = res_norm; PetscCall(PetscObjectSAWsGrantAccess((PetscObject)ksp)); PetscCall((*ksp->converged)(ksp, ksp->its, res_norm, &ksp->reason, ksp->cnvP)); /* Catch error in happy breakdown and signal convergence and break from loop */ if (hapend) { if (!ksp->reason) { PetscCheck(!ksp->errorifnotconverged, PetscObjectComm((PetscObject)ksp), PETSC_ERR_NOT_CONVERGED, "Reached happy break down, but convergence was not indicated. Residual norm = %g", (double)res_norm); ksp->reason = KSP_DIVERGED_BREAKDOWN; break; } } } /* END OF ITERATION LOOP */ if (itcount) *itcount = loc_it; /* Down here we have to solve for the "best" coefficients of the Krylov columns, add the solution values together, and possibly unwind the preconditioning from the solution */ /* Form the solution (or the solution so far) */ /* Note: must pass in (loc_it-1) for iteration count so that KSPFGMRESBuildSoln properly navigates */ PetscCall(KSPFGMRESBuildSoln(RS(0), ksp->vec_sol, ksp->vec_sol, ksp, loc_it - 1)); /* Monitor if we know that we will not return for a restart */ if (ksp->reason == KSP_CONVERGED_ITERATING && ksp->its >= ksp->max_it) ksp->reason = KSP_DIVERGED_ITS; if (loc_it && ksp->reason) { PetscCall(KSPMonitor(ksp, ksp->its, res_norm)); PetscCall(KSPLogResidualHistory(ksp, res_norm)); } PetscFunctionReturn(PETSC_SUCCESS); } static PetscErrorCode KSPSolve_FGMRES(KSP ksp) { PetscInt cycle_its = 0; /* iterations done in a call to KSPFGMRESCycle */ KSP_FGMRES *fgmres = (KSP_FGMRES *)ksp->data; PetscBool diagonalscale; PetscFunctionBegin; PetscCall(PCGetDiagonalScale(ksp->pc, &diagonalscale)); PetscCheck(!diagonalscale, PetscObjectComm((PetscObject)ksp), PETSC_ERR_SUP, "Krylov method %s does not support diagonal scaling", ((PetscObject)ksp)->type_name); PetscCall(PetscObjectSAWsTakeAccess((PetscObject)ksp)); ksp->its = 0; PetscCall(PetscObjectSAWsGrantAccess((PetscObject)ksp)); /* Compute the initial (NOT preconditioned) residual */ if (!ksp->guess_zero) { PetscCall(KSPFGMRESResidual(ksp)); } else { /* guess is 0 so residual is F (which is in ksp->vec_rhs) */ PetscCall(VecCopy(ksp->vec_rhs, VEC_VV(0))); } /* This may be true only on a subset of MPI ranks; setting it here so it will be detected by the first norm computation in the Krylov method */ PetscCall(VecFlag(VEC_VV(0), ksp->reason == KSP_DIVERGED_PC_FAILED)); /* now the residual is in VEC_VV(0) - which is what KSPFGMRESCycle expects... */ PetscCall(KSPFGMRESCycle(&cycle_its, ksp)); while (!ksp->reason) { PetscCall(KSPFGMRESResidual(ksp)); if (ksp->its >= ksp->max_it) break; PetscCall(KSPFGMRESCycle(&cycle_its, ksp)); } /* mark lack of convergence */ if (ksp->its >= ksp->max_it && !ksp->reason) ksp->reason = KSP_DIVERGED_ITS; PetscFunctionReturn(PETSC_SUCCESS); } extern PetscErrorCode KSPReset_FGMRES(KSP); static PetscErrorCode KSPDestroy_FGMRES(KSP ksp) { PetscFunctionBegin; PetscCall(KSPReset_FGMRES(ksp)); PetscCall(PetscObjectComposeFunction((PetscObject)ksp, "KSPFGMRESSetModifyPC_C", NULL)); PetscCall(PetscObjectComposeFunction((PetscObject)ksp, "KSPFlexibleSetModifyPC_C", NULL)); PetscCall(KSPDestroy_GMRES(ksp)); PetscFunctionReturn(PETSC_SUCCESS); } static PetscErrorCode KSPFGMRESBuildSoln(PetscScalar *nrs, Vec vguess, Vec vdest, KSP ksp, PetscInt it) { PetscScalar tt; PetscInt ii, k, j; KSP_FGMRES *fgmres = (KSP_FGMRES *)ksp->data; PetscFunctionBegin; /* Solve for solution vector that minimizes the residual */ /* If it is < 0, no fgmres steps have been performed */ if (it < 0) { PetscCall(VecCopy(vguess, vdest)); /* VecCopy() is smart, exists immediately if vguess == vdest */ PetscFunctionReturn(PETSC_SUCCESS); } /* so fgmres steps HAVE been performed */ /* solve the upper triangular system - RS is the right side and HH is the upper triangular matrix - put soln in nrs */ if (*HH(it, it) != 0.0) { nrs[it] = *RS(it) / *HH(it, it); } else { nrs[it] = 0.0; } for (ii = 1; ii <= it; ii++) { k = it - ii; tt = *RS(k); for (j = k + 1; j <= it; j++) tt = tt - *HH(k, j) * nrs[j]; nrs[k] = tt / *HH(k, k); } /* Accumulate the correction to the soln of the preconditioned prob. in VEC_TEMP - note that we use the preconditioned vectors */ PetscCall(VecMAXPBY(VEC_TEMP, it + 1, nrs, 0, &PREVEC(0))); /* put updated solution into vdest.*/ if (vdest != vguess) { PetscCall(VecCopy(VEC_TEMP, vdest)); PetscCall(VecAXPY(vdest, 1.0, vguess)); } else { /* replace guess with solution */ PetscCall(VecAXPY(vdest, 1.0, VEC_TEMP)); } PetscFunctionReturn(PETSC_SUCCESS); } static PetscErrorCode KSPFGMRESUpdateHessenberg(KSP ksp, PetscInt it, PetscBool hapend, PetscReal *res) { PetscScalar *hh, *cc, *ss, tt; PetscInt j; KSP_FGMRES *fgmres = (KSP_FGMRES *)ksp->data; PetscFunctionBegin; hh = HH(0, it); /* pointer to beginning of column to update - so incrementing hh "steps down" the (it+1)th col of HH*/ cc = CC(0); /* beginning of cosine rotations */ ss = SS(0); /* beginning of sine rotations */ /* Apply all the previously computed plane rotations to the new column of the Hessenberg matrix */ /* Note: this uses the rotation [conj(c) s ; -s c], c= cos(theta), s= sin(theta), and some refs have [c s ; -conj(s) c] (don't be confused!) */ for (j = 1; j <= it; j++) { tt = *hh; *hh = PetscConj(*cc) * tt + *ss * *(hh + 1); hh++; *hh = *cc++ * *hh - (*ss++ * tt); /* hh, cc, and ss have all been incremented one by end of loop */ } /* compute the new plane rotation, and apply it to: 1) the right-hand side of the Hessenberg system (RS) note: it affects RS(it) and RS(it+1) 2) the new column of the Hessenberg matrix note: it affects HH(it,it) which is currently pointed to by hh and HH(it+1, it) (*(hh+1)) thus obtaining the updated value of the residual... */ /* compute new plane rotation */ if (!hapend) { tt = PetscSqrtScalar(PetscConj(*hh) * *hh + PetscConj(*(hh + 1)) * *(hh + 1)); if (tt == 0.0) { ksp->reason = KSP_DIVERGED_NULL; PetscFunctionReturn(PETSC_SUCCESS); } *cc = *hh / tt; /* new cosine value */ *ss = *(hh + 1) / tt; /* new sine value */ /* apply to 1) and 2) */ *RS(it + 1) = -(*ss * *RS(it)); *RS(it) = PetscConj(*cc) * *RS(it); *hh = PetscConj(*cc) * *hh + *ss * *(hh + 1); /* residual is the last element (it+1) of right-hand side! */ *res = PetscAbsScalar(*RS(it + 1)); } else { /* happy breakdown: HH(it+1, it) = 0, therefore we don't need to apply another rotation matrix (so RH doesn't change). The new residual is always the new sine term times the residual from last time (RS(it)), but now the new sine rotation would be zero...so the residual should be zero...so we will multiply "zero" by the last residual. This might not be exactly what we want to do here -could just return "zero". */ *res = 0.0; } PetscFunctionReturn(PETSC_SUCCESS); } static PetscErrorCode KSPFGMRESGetNewVectors(KSP ksp, PetscInt it) { KSP_FGMRES *fgmres = (KSP_FGMRES *)ksp->data; PetscInt nwork = fgmres->nwork_alloc; /* number of work vector chunks allocated */ PetscInt nalloc; /* number to allocate */ PetscInt k; PetscFunctionBegin; nalloc = fgmres->delta_allocate; /* number of vectors to allocate in a single chunk */ /* Adjust the number to allocate to make sure that we don't exceed the number of available slots (fgmres->vecs_allocated)*/ if (it + VEC_OFFSET + nalloc >= fgmres->vecs_allocated) nalloc = fgmres->vecs_allocated - it - VEC_OFFSET; if (!nalloc) PetscFunctionReturn(PETSC_SUCCESS); fgmres->vv_allocated += nalloc; /* vv_allocated is the number of vectors allocated */ /* work vectors */ PetscCall(KSPCreateVecs(ksp, nalloc, &fgmres->user_work[nwork], 0, NULL)); for (k = 0; k < nalloc; k++) fgmres->vecs[it + VEC_OFFSET + k] = fgmres->user_work[nwork][k]; /* specify size of chunk allocated */ fgmres->mwork_alloc[nwork] = nalloc; /* preconditioned vectors */ PetscCall(KSPCreateVecs(ksp, nalloc, &fgmres->prevecs_user_work[nwork], 0, NULL)); for (k = 0; k < nalloc; k++) fgmres->prevecs[it + k] = fgmres->prevecs_user_work[nwork][k]; /* increment the number of work vector chunks */ fgmres->nwork_alloc++; PetscFunctionReturn(PETSC_SUCCESS); } static PetscErrorCode KSPBuildSolution_FGMRES(KSP ksp, Vec ptr, Vec *result) { KSP_FGMRES *fgmres = (KSP_FGMRES *)ksp->data; PetscFunctionBegin; if (!ptr) { if (!fgmres->sol_temp) PetscCall(VecDuplicate(ksp->vec_sol, &fgmres->sol_temp)); ptr = fgmres->sol_temp; } if (!fgmres->nrs) { /* allocate the work area */ PetscCall(PetscMalloc1(fgmres->max_k, &fgmres->nrs)); } PetscCall(KSPFGMRESBuildSoln(fgmres->nrs, ksp->vec_sol, ptr, ksp, fgmres->it)); if (result) *result = ptr; PetscFunctionReturn(PETSC_SUCCESS); } static PetscErrorCode KSPSetFromOptions_FGMRES(KSP ksp, PetscOptionItems PetscOptionsObject) { PetscBool flg; PetscFunctionBegin; PetscCall(KSPSetFromOptions_GMRES(ksp, PetscOptionsObject)); PetscOptionsHeadBegin(PetscOptionsObject, "KSP flexible GMRES Options"); PetscCall(PetscOptionsBoolGroupBegin("-ksp_fgmres_modifypcnochange", "do not vary the preconditioner", "KSPFGMRESSetModifyPC", &flg)); if (flg) PetscCall(KSPFGMRESSetModifyPC(ksp, KSPFGMRESModifyPCNoChange, NULL, NULL)); PetscCall(PetscOptionsBoolGroupEnd("-ksp_fgmres_modifypcksp", "vary the KSP based preconditioner", "KSPFGMRESSetModifyPC", &flg)); if (flg) PetscCall(KSPFGMRESSetModifyPC(ksp, KSPFGMRESModifyPCKSP, NULL, NULL)); PetscOptionsHeadEnd(); PetscFunctionReturn(PETSC_SUCCESS); } static PetscErrorCode KSPFGMRESSetModifyPC_FGMRES(KSP ksp, KSPFlexibleModifyPCFn *fcn, void *ctx, PetscCtxDestroyFn *d) { PetscFunctionBegin; PetscValidHeaderSpecific(ksp, KSP_CLASSID, 1); ((KSP_FGMRES *)ksp->data)->modifypc = fcn; ((KSP_FGMRES *)ksp->data)->modifydestroy = d; ((KSP_FGMRES *)ksp->data)->modifyctx = ctx; PetscFunctionReturn(PETSC_SUCCESS); } PetscErrorCode KSPReset_FGMRES(KSP ksp) { KSP_FGMRES *fgmres = (KSP_FGMRES *)ksp->data; PetscInt i; PetscFunctionBegin; PetscCall(PetscFree(fgmres->prevecs)); if (fgmres->nwork_alloc > 0) { i = 0; /* In the first allocation we allocated VEC_OFFSET fewer vectors in prevecs */ PetscCall(VecDestroyVecs(fgmres->mwork_alloc[i] - VEC_OFFSET, &fgmres->prevecs_user_work[i])); for (i = 1; i < fgmres->nwork_alloc; i++) PetscCall(VecDestroyVecs(fgmres->mwork_alloc[i], &fgmres->prevecs_user_work[i])); } PetscCall(PetscFree(fgmres->prevecs_user_work)); if (fgmres->modifydestroy) PetscCall((*fgmres->modifydestroy)(&fgmres->modifyctx)); PetscCall(KSPReset_GMRES(ksp)); PetscFunctionReturn(PETSC_SUCCESS); } static PetscErrorCode KSPGMRESSetRestart_FGMRES(KSP ksp, PetscInt max_k) { KSP_FGMRES *gmres = (KSP_FGMRES *)ksp->data; PetscFunctionBegin; PetscCheck(max_k >= 1, PetscObjectComm((PetscObject)ksp), PETSC_ERR_ARG_OUTOFRANGE, "Restart must be positive"); if (!ksp->setupstage) { gmres->max_k = max_k; } else if (gmres->max_k != max_k) { gmres->max_k = max_k; ksp->setupstage = KSP_SETUP_NEW; /* free the data structures, then create them again */ PetscCall(KSPReset_FGMRES(ksp)); } PetscFunctionReturn(PETSC_SUCCESS); } static PetscErrorCode KSPGMRESGetRestart_FGMRES(KSP ksp, PetscInt *max_k) { KSP_FGMRES *gmres = (KSP_FGMRES *)ksp->data; PetscFunctionBegin; *max_k = gmres->max_k; PetscFunctionReturn(PETSC_SUCCESS); } /*MC KSPFGMRES - Implements the Flexible Generalized Minimal Residual method, flexible GMRES. [](sec_flexibleksp) Options Database Keys: + -ksp_gmres_restart <restart> - the number of Krylov directions to orthogonalize against . -ksp_gmres_haptol <tol> - sets the tolerance for "happy ending" (exact convergence) . -ksp_gmres_preallocate - preallocate all the Krylov search directions initially (otherwise groups of vectors are allocated as needed) . -ksp_gmres_classicalgramschmidt - use classical (unmodified) Gram-Schmidt to orthogonalize against the Krylov space (fast) (the default) . -ksp_gmres_modifiedgramschmidt - use modified Gram-Schmidt in the orthogonalization (more stable, but slower) . -ksp_gmres_cgs_refinement_type <refine_never,refine_ifneeded,refine_always> - determine if iterative refinement is used to increase the stability of the classical Gram-Schmidt orthogonalization. . -ksp_gmres_krylov_monitor - plot the Krylov space generated . -ksp_fgmres_modifypcnochange - do not change the preconditioner between iterations - -ksp_fgmres_modifypcksp - modify the preconditioner using `KSPFGMRESModifyPCKSP()` Level: beginner Notes: See `KSPFlexibleSetModifyPC()` or `KSPFGMRESSetModifyPC()` for how to vary the preconditioner between iterations GMRES requires that the preconditioner used is a linear operator. Flexible GMRES allows the preconditioner to be a nonlinear operator. This allows, for example, Flexible GMRES to use GMRES solvers or other Krylov solvers (which are nonlinear operators in general) inside the preconditioner used by `KSPFGMRES`. For example, the options `-ksp_type fgmres -pc_type ksp -ksp_ksp_type bcgs -ksp_view -ksp_pc_type jacobi` make the preconditioner (or inner solver) be bi-CG-stab with a preconditioner of `PCJACOBI`. `KSPFCG` provides a flexible version of the preconditioned conjugate gradient method. Only right preconditioning is supported. The following options `-ksp_type fgmres -pc_type ksp -ksp_ksp_type bcgs -ksp_view -ksp_pc_type jacobi` make the preconditioner (or inner solver) be bi-CG-stab with a preconditioner of `PCJACOBI` Developer Note: This object is subclassed off of `KSPGMRES`, see the source code in src/ksp/ksp/impls/gmres for comments on the structure of the code Contributed by: Allison Baker .seealso: [](ch_ksp), [](sec_flexibleksp), `KSPCreate()`, `KSPSetType()`, `KSPType`, `KSP`, `KSPGMRES`, `KSPLGMRES`, `KSPFCG`, `KSPGMRESSetRestart()`, `KSPGMRESSetHapTol()`, `KSPGMRESSetPreAllocateVectors()`, `KSPGMRESSetOrthogonalization()`, `KSPGMRESGetOrthogonalization()`, `KSPGMRESClassicalGramSchmidtOrthogonalization()`, `KSPGMRESModifiedGramSchmidtOrthogonalization()`, `KSPGMRESCGSRefinementType`, `KSPGMRESSetCGSRefinementType()`, `KSPGMRESGetCGSRefinementType()`, `KSPGMRESMonitorKrylov()`, `KSPFGMRESSetModifyPC()`, `KSPFGMRESModifyPCKSP()` M*/ PETSC_EXTERN PetscErrorCode KSPCreate_FGMRES(KSP ksp) { KSP_FGMRES *fgmres; PetscFunctionBegin; PetscCall(PetscNew(&fgmres)); ksp->data = (void *)fgmres; ksp->ops->buildsolution = KSPBuildSolution_FGMRES; ksp->ops->setup = KSPSetUp_FGMRES; ksp->ops->solve = KSPSolve_FGMRES; ksp->ops->reset = KSPReset_FGMRES; ksp->ops->destroy = KSPDestroy_FGMRES; ksp->ops->view = KSPView_GMRES; ksp->ops->setfromoptions = KSPSetFromOptions_FGMRES; ksp->ops->computeextremesingularvalues = KSPComputeExtremeSingularValues_GMRES; ksp->ops->computeeigenvalues = KSPComputeEigenvalues_GMRES; PetscCall(KSPSetSupportedNorm(ksp, KSP_NORM_UNPRECONDITIONED, PC_RIGHT, 3)); PetscCall(KSPSetSupportedNorm(ksp, KSP_NORM_NONE, PC_RIGHT, 1)); PetscCall(PetscObjectComposeFunction((PetscObject)ksp, "KSPGMRESSetPreAllocateVectors_C", KSPGMRESSetPreAllocateVectors_GMRES)); PetscCall(PetscObjectComposeFunction((PetscObject)ksp, "KSPGMRESSetOrthogonalization_C", KSPGMRESSetOrthogonalization_GMRES)); PetscCall(PetscObjectComposeFunction((PetscObject)ksp, "KSPGMRESGetOrthogonalization_C", KSPGMRESGetOrthogonalization_GMRES)); PetscCall(PetscObjectComposeFunction((PetscObject)ksp, "KSPGMRESSetRestart_C", KSPGMRESSetRestart_FGMRES)); PetscCall(PetscObjectComposeFunction((PetscObject)ksp, "KSPGMRESGetRestart_C", KSPGMRESGetRestart_FGMRES)); PetscCall(PetscObjectComposeFunction((PetscObject)ksp, "KSPFGMRESSetModifyPC_C", KSPFGMRESSetModifyPC_FGMRES)); PetscCall(PetscObjectComposeFunction((PetscObject)ksp, "KSPFlexibleSetModifyPC_C", KSPFGMRESSetModifyPC_FGMRES)); PetscCall(PetscObjectComposeFunction((PetscObject)ksp, "KSPGMRESSetCGSRefinementType_C", KSPGMRESSetCGSRefinementType_GMRES)); PetscCall(PetscObjectComposeFunction((PetscObject)ksp, "KSPGMRESGetCGSRefinementType_C", KSPGMRESGetCGSRefinementType_GMRES)); fgmres->haptol = 1.0e-30; fgmres->q_preallocate = 0; fgmres->delta_allocate = FGMRES_DELTA_DIRECTIONS; fgmres->orthog = KSPGMRESClassicalGramSchmidtOrthogonalization; fgmres->nrs = NULL; fgmres->sol_temp = NULL; fgmres->max_k = FGMRES_DEFAULT_MAXK; fgmres->Rsvd = NULL; fgmres->orthogwork = NULL; fgmres->modifypc = KSPFGMRESModifyPCNoChange; fgmres->modifyctx = NULL; fgmres->modifydestroy = NULL; fgmres->cgstype = KSP_GMRES_CGS_REFINE_NEVER; PetscFunctionReturn(PETSC_SUCCESS); }中文注释,转成Fortran
最新发布
07-02
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值