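I. Introduction
1 Overview of Gaussian Mixture Models
Gaussian density estimation is a parametric approach. The Gaussian mixture model (GMM) extends the single Gaussian probability density function, and with enough components it can smoothly approximate density distributions of arbitrary shape. Gaussian models come in two kinds: the single Gaussian model (SGM) and the Gaussian mixture model (GMM). As in clustering, each Gaussian, characterized by the parameters of its probability density function (PDF), can be regarded as one class: given a sample x, the PDF is evaluated at x and the value is compared against a threshold to decide whether the sample belongs to that Gaussian. An SGM therefore suits two-class problems (the sample either belongs to the model or it does not), whereas a GMM, which combines several Gaussians, partitions the space more finely, handles multi-class problems, and can be used to model complex objects.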
1.1 Single Gaussian Model
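For reference, a single Gaussian model describes a d-dimensional sample x with one density, parameterized by a mean vector μ and a covariance matrix Σ. The symbols follow the usual textbook convention and are an assumption, since this section fixes no notation of its own:

$$
N(x;\mu,\Sigma)=\frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}\exp\!\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right)
$$

A sample is accepted as belonging to the model when this value exceeds a chosen threshold.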
1.2 Gaussian Mixture Model
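In the standard formulation, a GMM with K components is a weighted sum of Gaussian densities, where component k has mixing weight π_k, mean μ_k, and covariance Σ_k. The symbol names are assumed here for illustration:

$$
p(x)=\sum_{k=1}^{K}\pi_k\,N(x;\mu_k,\Sigma_k),\qquad \pi_k\ge 0,\qquad \sum_{k=1}^{K}\pi_k=1
$$

Here N(x; μ_k, Σ_k) denotes the Gaussian density with mean μ_k and covariance Σ_k. In the source code of this post, K corresponds to `ncentres`, the weights to `mix.priors`, the means to `mix.centres`, and the covariances to `mix.covars`.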
2 Parameter Estimation for Gaussian Mixture Models
2.1 GMM When the Sample Labels Are Known
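When every sample is already assigned to a component, the parameters reduce to per-class counts, means, and covariances. As a sketch under assumed notation (N samples in total, C_k the set of samples assigned to component k, N_k its size):

$$
\pi_k=\frac{N_k}{N},\qquad
\mu_k=\frac{1}{N_k}\sum_{x_i\in C_k}x_i,\qquad
\Sigma_k=\frac{1}{N_k}\sum_{x_i\in C_k}(x_i-\mu_k)(x_i-\mu_k)^{T}
$$

These are the quantities the initialization code in this post computes, with the k-means assignments playing the role of the known labels.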
II. Source Code
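function mix=gmm_init(ncentres,data,kiter,covar_type)
%% Inputs:
% ncentres: number of mixture components
% data: training data, one sample per row
% kiter: number of k-means iterations
% covar_type: covariance structure, 'diag' or 'full'
%% Output:
% mix: struct holding the initial GMM parameters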
[dim,data_sz]=size(data');
mix.priors=ones(1,ncentres)./ncentres;
mix.centres=randn(ncentres,dim);
switch covar_type
case 'diag'
% Store diagonals of covariance matrices as rows in a matrix
mix.covars=ones(ncentres,dim);
case 'full'
% Store covariance matrices in a row vector of matrices
mix.covars=repmat(eye(dim),[1 1 ncentres]);
otherwise
error(['Unknown covariance type ', covar_type]);
end
% Arbitrary width used if variance collapses to zero: make it 'large' so
% that centre is responsible for a reasonable number of points.
GMM_WIDTH=1.0;
% Run k-means to refine the centres and get hard cluster assignments
% [mix.centres,options,post]=k_means(mix.centres,data);
[mix.centres,post]=k_means(mix.centres,data,kiter);
% Set priors depending on number of points in each cluster
cluster_sizes = max(sum(post,1),1); % Make sure that no prior is zero
mix.priors = cluster_sizes/sum(cluster_sizes); % Normalise priors
switch covar_type
case 'diag'
for j=1:ncentres
% Pick out data points belonging to this centre
c=data(find(post(:,j)),:);
diffs=c-(ones(size(c,1),1)*mix.centres(j,:));
mix.covars(j,:)=sum((diffs.*diffs),1)/size(c,1);
% Replace small entries by GMM_WIDTH value
mix.covars(j,:)=mix.covars(j,:)+GMM_WIDTH.*(mix.covars(j,:)<eps);
end
case 'full'
for j=1:ncentres
% Pick out data points belonging to this centre
c=data(find(post(:,j)),:);
diffs=c-(ones(size(c,1),1)*mix.centres(j,:));
mix.covars(:,:,j)=(diffs'*diffs)/(size(c,1)+eps);
% Add GMM_WIDTH*Identity to rank-deficient covariance matrices
if rank(mix.covars(:,:,j))<dim
mix.covars(:,:,j)=mix.covars(:,:,j)+GMM_WIDTH.*eye(dim);
end
end
otherwise
error(['Unknown covariance type ', covar_type]);
end
mix.ncentres=ncentres;
mix.covar_type=covar_type;
%=============================================================
function [centres,post]=k_means(centres,data,kiter)
[dim,data_sz]=size(data');
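ncentres=size(centres,1); % number of clusters
[ignore,perm]=sort(rand(1,data_sz)); % random permutation of the sample indices
perm = perm(1:ncentres); % the first ncentres indices select the initial centres
centres=data(perm,:); % start from randomly chosen data points (overrides the input centres)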
id=eye(ncentres); %Matrix to make unit vectors easy to construct
for n=1:kiter
% Save old centres to check for termination
old_centres=centres; % keep the previous centres for the termination test
% Calculate posteriors based on existing centres
d2=(ones(ncentres,1)*sum((data.^2)',1))'+...
ones(data_sz,1)* sum((centres.^2)',1)-2.*(data*(centres')); % squared Euclidean distance from every point to every centre
% Assign each point to nearest centre
[minvals, index]=min(d2', [], 1);
post=id(index,:);
num_points = sum(post, 1);
% Adjust the centres based on new posteriors
for j = 1:ncentres
if (num_points(j) > 0)
centres(j,:) = sum(data(find(post(:,j)),:), 1)/num_points(j);
end
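end
% Stop early once no centre moves (this exact test is an assumption)
if max(max(abs(centres-old_centres)))<eps
break;
end
end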
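As a minimal sketch of the thresholding idea described in the overview, the script below evaluates a diagonal-covariance mixture density at a few points and compares it with a threshold. The parameter values, the query points, and `threshold` are hypothetical. Only the field layout (`priors`, `centres`, `covars`, `ncentres`) mirrors what `gmm_init` stores for the 'diag' case.

```matlab
% Hypothetical parameters laid out like the 'diag' output of gmm_init
mix.ncentres = 2;
mix.priors   = [0.5 0.5];      % mixing weights, sum to 1
mix.centres  = [0 0; 5 5];     % one centre per row
mix.covars   = [1 1; 1 1];     % per-dimension variances, one row per centre

x = [0 0; 5 5; 20 20];         % hypothetical query points, one per row
n = size(x,1);
dim = size(x,2);

p = zeros(n,1);                % mixture density at each query point
for j = 1:mix.ncentres
    diffs = x - ones(n,1)*mix.centres(j,:);
    % Exponent of a Gaussian with diagonal covariance
    expo  = -0.5*sum((diffs.^2)./(ones(n,1)*mix.covars(j,:)), 2);
    % Normalising constant: (2*pi)^(d/2) * sqrt(det(Sigma))
    normc = (2*pi)^(dim/2)*sqrt(prod(mix.covars(j,:)));
    p = p + mix.priors(j)*exp(expo)/normc;
end

threshold = 1e-3;              % hypothetical decision threshold
is_member = p > threshold;     % true where the point is accepted by the model
```

With these values the two points at the centres are accepted and the distant point is rejected. In practice the threshold has to be tuned to the data and its dimensionality, because density values shrink quickly as `dim` grows.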