《“GrabCut” — Interactive Foreground Extraction using Iterated Graph Cuts》
Below are my notes on GrabCut.
GrabCut evolved from Graph Cut. The algorithm exploits both the texture (color) information and the boundary (contrast) information in an image, so only a small amount of user interaction is needed to obtain a good segmentation.
一、Introduction to Graph Cut

First, consider the graph shown in Figure 1. It is a special graph: besides the grid of pixel nodes in the middle, it has two extra terminal vertices, T (the sink) and S (the source), representing the background and the foreground (object) respectively, together with edges connecting every pixel to these two terminals. The terminals correspond to the seeds that the user marks during segmentation (interactively indicating which regions are background and which are foreground), as shown in Figure 2.

With the vertices of this graph fixed, how are the edge weights determined?
Here we introduce the key formula: it not only answers this question but also drives the segmentation itself. The image energy is:

Start with formula (1): E(A) = λ·R(A) + B(A). R(A) is the region term: it reflects the probability that each pixel belongs to the background or the foreground, and in Figure 1 it gives the weights of the edges connecting each pixel to the terminals T and S. How do we obtain R(A)? Formula (2) defines it as the negative log of the probability that the pixel belongs to the foreground or the background: R_p(A_p) = −ln Pr(I_p | A_p).

B(A) is the boundary term. It corresponds to the weights of the edges between neighboring pixels in Figure 1 and measures how similar two neighboring pixels are, as given by formula (3). The closer the two pixel values, the larger B(A); the more they differ, the smaller B(A).

λ balances R(A) against B(A); with λ = 0 only B(A) is considered. λ itself is also obtained by computation rather than hand-tuning; see Part 二 for details.
It is then easy to see that a smaller R(A) means a higher probability that the pixel belongs to its assigned class (foreground or background), while a smaller B(A) means two neighboring pixels are dissimilar, likely belong to different classes, and should be separated. So the smaller the total energy E(A), the more accurate the segmentation.
How is the segmentation actually computed? Following the paper “Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images”, the labeling that minimizes E(A) is found with a min-cut/max-flow algorithm. An accessible explanation of min-cut can be found at https://blog.youkuaiyun.com/chinacoy/article/details/45040897.
二、Introduction to GrabCut
Compared with Graph Cut, GrabCut has the following advantages:
1. Only a rectangular box around the target is required.
2. A small amount of additional user interaction yields a cleaner segmentation.
3. Border matting makes the segmentation boundary look more natural.
It differs from Graph Cut in these ways:
1. Graph Cut models the target and background with grayscale histograms; GrabCut uses Gaussian Mixture Models (GMMs) over the three RGB channels.
2. Graph Cut segments in a single pass; GrabCut iterates, alternating segmentation estimation with model-parameter learning.
3. Graph Cut requires explicit foreground and background seeds; GrabCut only needs a box around the target and tolerates incomplete labeling.
The algorithm flowchart makes the code easier to follow. [Flowchart figures not reproduced: an initialization part and a model-iteration part.]
Internal code of the grabcut() function:
#include "precomp.hpp"
#include "gcgraph.hpp"
#include <limits>
using namespace cv;
/*
This is implementation of image segmentation algorithm GrabCut described in
"GrabCut — Interactive Foreground Extraction using Iterated Graph Cuts".
Carsten Rother, Vladimir Kolmogorov, Andrew Blake.
*/
/*
GMM - Gaussian Mixture Model
*/
class GMM
{
public:
static const int componentsCount = 5;
GMM( Mat& _model );
double operator()( const Vec3d color ) const;
double operator()( int ci, const Vec3d color ) const;
int whichComponent( const Vec3d color ) const;
void initLearning();
void addSample( int ci, const Vec3d color );
void endLearning();
private:
void calcInverseCovAndDeterm( int ci );
Mat model;
double* coefs;
double* mean;
double* cov;
double inverseCovs[componentsCount][3][3]; // inverse of each component's covariance matrix
double covDeterms[componentsCount]; // determinant of each component's covariance matrix
double sums[componentsCount][3];
double prods[componentsCount][3][3];
int sampleCounts[componentsCount];
int totalSampleCount;
};
//The background and the foreground each have their own GMM (Gaussian Mixture Model)
GMM::GMM( Mat& _model )
{
//Number of parameters of one Gaussian component (the single component a pixel is assigned to):
//a pixel has 3 RGB channel values, hence 3 means and 3*3 covariance entries, plus one component weight
const int modelSize = 3/*mean*/ + 9/*covariance*/ + 1/*component weight*/;
if( _model.empty() )
{
//One GMM has componentsCount Gaussian components; each component has modelSize parameters
_model.create( 1, modelSize*componentsCount, CV_64FC1 );
_model.setTo(Scalar(0));
}
else if( (_model.type() != CV_64FC1) || (_model.rows != 1) || (_model.cols != modelSize*componentsCount) )
CV_Error( CV_StsBadArg, "_model must have CV_64FC1 type, rows == 1 and cols == 13*componentsCount" );
model = _model;
//Note the parameter storage layout: first the componentsCount coefs, then 3*componentsCount means,
//then 3*3*componentsCount covariance entries.
coefs = model.ptr<double>(0); //pointer to the start of the component weights of this GMM
mean = coefs + componentsCount; //pointer to the start of the means
cov = mean + 3*componentsCount; //pointer to the start of the covariances
for( int ci = 0; ci < componentsCount; ci++ )
if( coefs[ci] > 0 )
//Compute the inverse and the determinant of the covariance of the ci-th Gaussian component,
//needed later to evaluate each pixel's probability under it (i.e. the data energy term)
calcInverseCovAndDeterm( ci );
}
//Compute the probability of a pixel (a 3-element double vector color=(B,G,R)) under this GMM:
//the pixel's probabilities under the componentsCount Gaussian components, multiplied by the
//corresponding weights and summed; see formula (10) in the paper. The result is returned in res.
//Its negative log is the first (data) term of the Gibbs energy.
double GMM::operator()( const Vec3d color ) const
{
double res = 0;
for( int ci = 0; ci < componentsCount; ci++ )
res += coefs[ci] * (*this)(ci, color );
return res;
}
//Compute the probability of a pixel (a 3-element double vector color=(B,G,R)) under the ci-th
//Gaussian component, i.e. evaluate its 3-D Gaussian density; see formula (10) in the paper.
//The result is returned in res.
double GMM::operator()( int ci, const Vec3d color ) const
{
double res = 0;
if( coefs[ci] > 0 )
{
CV_Assert( covDeterms[ci] > std::numeric_limits<double>::epsilon() );
Vec3d diff = color;
double* m = mean + 3*ci;
diff[0] -= m[0]; diff[1] -= m[1]; diff[2] -= m[2];
double mult = diff[0]*(diff[0]*inverseCovs[ci][0][0] + diff[1]*inverseCovs[ci][1][0] + diff[2]*inverseCovs[