1 问题描述
贝叶斯计算的MCMC抽样方法用于解决贝叶斯的参数后验分布估计问题。在实际情况中,有时无法直接得到后验分布,而MCMC方法提供了一种通过模拟数值逼近后验分布的方法。MCMC方法生成从后验分布中抽取的样本序列,即马尔可夫链。通过模拟大量样本,MCMC算法能够在给定观测数据和先验信息时估计参数的后验分布,估计后验分布的各种特征,例如均值、中位数与方差等。
2 方法介绍
随机游动抽样
% 数据
x = [9,13,6,8,10,4,14,8,11,7,9,7,5,14,13,16,10,12,11,14,15,18,7,16,9,9,11,13,15,13,10,11,6,17,14,19,9,11,14,10,16,10,16,14,13,13,9,15,10,11,12,4,14,20];
y = [1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0];
% 参数
mu = [0, 0];
sigma2 = [10000, 10000];
tau2 = [0.01, 0.01];
N = 55000; % 迭代次数
% 初始化
beta = [0, 0];
betas = zeros(N, 2);
accept_count = 0;
alphas=zeros(N,1);
% 随机游动抽样
for t = 2:N
% 生成候选点
beta_candidate = mvnrnd(beta, diag(tau2));
% 计算接受概率
logit_theta = beta * [ones(size(x)); x];
logit_theta_candidate = beta_candidate * [ones(size(x)); x];
theta = 1./ (1 + exp(-(logit_theta)));
theta_candidate = 1./ (1 + exp(-(logit_theta_candidate)));
% 计算似然比的对数
log_likelihood_ratio = sum(y .* (log(theta_candidate) - log(theta)) + (1 - y) .* (log(1 - theta_candidate) - log(1 - theta)));
% 计算先验比的对数
log_prior_ratio = -0.5 * ((beta_candidate(1) - mu(1))^2 / sigma2(1) - (beta(1) - mu(1))^2 / sigma2(1) + ...
(beta_candidate(2) - mu(2))^2 / sigma2(2) - (beta(2) - mu(2))^2 / sigma2(2));
% 计算接受概率
log_alpha = log_likelihood_ratio + log_prior_ratio;
alpha = min([1, exp(log_alpha)]);
% 接受或拒绝
if rand < alpha
beta = beta_candidate;
accept_count = accept_count + 1;
end
% 存储结果
betas(t, :) = beta;
alphas(t,:)=alpha;
end
% 计算接受率
acceptance_rate = accept_count / (N-1);
% 输出结果
disp(['接受率:', num2str(acceptance_rate)]);
disp(['估计的beta:', num2str(mean(betas(5001:end, :)))]); % 然后取后5000个样本的均值作为估计
% 样本路径图 (trace plot)
figure;
subplot(2, 1, 1);
plot(betas(:, 1));
title('\beta_0的路径图');
xlabel('迭代次数');
ylabel('\beta_0');
subplot(2, 1, 2);
plot(betas(:, 2));
title('\beta_1的路径图');
xlabel('迭代次数');
ylabel('\beta_1');
% 遍历均值图 (running mean plot)
figure;
subplot(2, 1, 1);
plot(cumsum(betas(:, 1)) ./ (1:N)');
title('\beta_0的运行平均值图');
xlabel('迭代次数');
ylabel('\beta_0的平均值');
subplot(2, 1, 2);
plot(cumsum(betas(:, 2)) ./ (1:N)');
title('\beta_1的运行平均值图');
xlabel('迭代次数');
ylabel('\beta_1的平均值');
% MC误差 (MC error)
mc_error_beta0 = sqrt(cumsum((betas(:, 1) - mean(betas(:, 1))).^2) ./ (1:N)');
mc_error_beta1 = sqrt(cumsum((betas(:, 2) - mean(betas(:, 2))).^2) ./ (1:N)');
figure;
subplot(2, 1, 1);
plot(mc_error_beta0);
title('\beta_0的MC误差');
xlabel('迭代次数');
ylabel('\beta_0的MC误差');
subplot(2, 1, 2);
plot(mc_error_beta1);
title('\beta_1的MC误差')