Square Loss Function in Frequentist and Bayesian View

This post examines how the square loss function is handled under the frequentist and Bayesian perspectives. In the frequentist view, the risk decomposes into the mean squared error (MSE), namely variance plus squared bias. In the Bayesian view, we treat the parameter θ as random and compute a posterior expectation, arriving at an analogous decomposition. The manipulations differ between the two views, but both reveal the key components of the loss.

Suppose we have $X_1, \dots, X_n \sim N(\theta, \sigma_0^2)$.
Loss Function: Square Loss
$$L(\delta(\vec{x}) - \theta) = (\delta(\vec{x}) - \theta)^2$$
The parameter we want to estimate is $\theta$.

Under the frequentist perspective, the risk function can be written as:

$$R(\delta(\vec{x}), \theta) = E_X[(\delta(\vec{x}) - \theta)^2]$$

(here we take the expectation with respect to $X$). This is exactly the MSE, so we can decompose it into variance + bias²:
$$
\begin{aligned}
\mathrm{MSE} = E_X[(\delta(\vec{x}) - \theta)^2] &= E_X[(\delta(\vec{x}) - E(\delta(\vec{x})) + E(\delta(\vec{x})) - \theta)^2] \\
&= E_X[(\delta(\vec{x}) - E(\delta(\vec{x})))^2] + [E(\delta(\vec{x})) - \theta]^2 \\
&= \mathrm{Var}(\delta(\vec{x})) + [E(\delta(\vec{x})) - \theta]^2 \\
&= \mathrm{Var}(\delta(\vec{x})) + \mathrm{Bias}^2
\end{aligned}
$$

(The cross term vanishes because $E_X[\delta(\vec{x}) - E(\delta(\vec{x}))] = 0$.)
The frequentist derivation above shows that the random variable is the statistic $\delta(\vec{x})$. Once we find its variance and bias, the computation is finished.
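The decomposition can be checked numerically. As a sketch (the shrinkage estimator $\delta(\vec{x}) = c\,\bar{x}$ and all parameter values below are illustrative choices, not from the original post), we simulate many datasets, compute the empirical MSE of the estimator, and compare it against $\mathrm{Var}(\delta) + \mathrm{Bias}^2$ from the closed forms:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, sigma0, n = 2.0, 1.0, 10
c = 0.8  # illustrative shrinkage factor: delta(x) = c * xbar is deliberately biased

# Simulate many datasets X_1,...,X_n ~ N(theta, sigma0^2) and apply the estimator
reps = 200_000
samples = rng.normal(theta, sigma0, size=(reps, n))
delta = c * samples.mean(axis=1)

# Empirical MSE versus the theoretical decomposition Var + Bias^2
mse_empirical = np.mean((delta - theta) ** 2)
var_delta = c**2 * sigma0**2 / n       # Var(c * xbar) = c^2 * sigma0^2 / n
bias_sq = ((c - 1) * theta) ** 2       # E[c * xbar] - theta = (c - 1) * theta
print(mse_empirical, var_delta + bias_sq)  # the two values agree closely
```

Note that the expectation is taken over repeated samples $X$ while $\theta$ stays fixed, which is exactly the frequentist setup.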

Under the Bayesian perspective, the posterior expected loss can be written as:

$$
\begin{aligned}
E_{\theta|X}[(\delta(\vec{x}) - \theta)^2] &= E_{\theta|X}[(\theta - E_{\theta|X}(\theta) + E_{\theta|X}(\theta) - \delta(\vec{x}))^2] \\
&= E_{\theta|X}[(\theta - E_{\theta|X}(\theta))^2] + [E_{\theta|X}(\theta) - \delta(\vec{x})]^2 \\
&= \mathrm{Var}_{\theta|X}(\theta) + [E_{\theta|X}(\theta) - \delta(\vec{x})]^2
\end{aligned}
$$
The Bayesian derivation above shows that this time $\theta$ is treated as the random variable. Once we find its posterior mean and variance, the computation is done.
Here I just want to mention that the manipulation differs between the two scenarios. In the frequentist case, since we treat the statistic $\delta(\vec{x})$ as the random variable, we add and subtract $E(\delta(\vec{x}))$. In the Bayesian case, on the other hand, we add and subtract $E_{\theta|X}(\theta)$.
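The Bayesian decomposition can also be verified numerically. As a sketch under an assumed conjugate normal-normal model (the prior $\theta \sim N(\mu, \tau^2)$ and all parameter values below are illustrative, not from the original post), the posterior of $\theta$ is itself normal with a closed-form mean and variance, so we can compare a Monte Carlo estimate of the posterior expected loss against $\mathrm{Var}_{\theta|X}(\theta) + [E_{\theta|X}(\theta) - \delta(\vec{x})]^2$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative conjugate setup: theta ~ N(mu, tau^2), X_i | theta ~ N(theta, sigma0^2)
mu, tau = 0.0, 2.0
sigma0, n = 1.0, 5
x = rng.normal(1.5, sigma0, size=n)  # some observed data
xbar = x.mean()

# Closed-form posterior for the normal-normal model
prec = 1 / tau**2 + n / sigma0**2
post_mean = (mu / tau**2 + n * xbar / sigma0**2) / prec
post_var = 1 / prec

# Posterior expected square loss of an arbitrary decision delta(x),
# estimated by drawing theta from the posterior
delta = 1.0  # illustrative decision value
draws = rng.normal(post_mean, np.sqrt(post_var), size=500_000)
loss_mc = np.mean((delta - draws) ** 2)
loss_formula = post_var + (post_mean - delta) ** 2
print(loss_mc, loss_formula)  # the two values agree closely
```

This also makes the well-known corollary visible: the posterior expected loss is minimized by choosing $\delta(\vec{x}) = E_{\theta|X}(\theta)$, leaving only the posterior variance.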
