Statistics Summary

This article reviews parameter estimation methods in statistics, including maximum likelihood estimation and Bayesian estimation and their respective strengths and limitations. It also covers how confidence intervals are constructed, the steps of a significance test, and the basics of regression analysis.


7. Parameter Estimation

  • Model and parameters
  • Properties of good estimators
    • Unbiasedness, consistency
    • UMVUE, efficiency
  • MLE
  • Bayesian Estimation
    • why?
    • Prior and Posterior
    • Conjugate distribution
    • Limitations

Reason: statistical estimation is not a general estimation problem; we assume a parametric model f(x; θ) and estimate its unknown parameter from i.i.d. samples.

  • Formulation:
    $$X_1, X_2, \ldots, X_n \ \text{i.i.d.} \sim f(x;\theta), \qquad \theta \in E \text{ unknown}$$
    $$\text{Estimator: } \hat{\theta} = \phi(X), \qquad \phi: \mathbb{R}^{n} \rightarrow E$$
Properties of Good Estimators:
Correctness:
  • Unbiasedness: the expectation of the estimator under the sampling distribution equals the population parameter being estimated:
    $$E[\phi(X)]=\theta \ \text{ for } X \sim f(x;\theta)$$
  • Consistency: as the sample size grows, the estimator converges to the population parameter being estimated:
    $$\phi(X) \rightarrow \theta \ \text{ in probability for } X \sim f(x;\theta)$$
  • Example (see the simulation sketch after this list):
    $$\begin{aligned} s^{2} &=\frac{1}{n-1} \sum_{i=1}^{n}\left(X_{i}-\overline{X}\right)^{2} && \text{unbiased}\\ \hat{\sigma}^{2} &=\frac{1}{n} \sum_{i=1}^{n}\left(X_{i}-\overline{X}\right)^{2} && \text{biased but consistent}\end{aligned}$$
  • Accuracy:
    • UMVUE (uniformly minimum-variance unbiased estimator)
    • Efficiency
  • UMVUE is very restrictive; efficiency is a weaker condition.
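A minimal simulation sketch of the example above (standard normal data and the sample sizes are illustrative assumptions):

```r
# Compare s^2 (unbiased) with sigma_hat^2 (biased but consistent) by simulation.
# Standard normal data (true variance 1) and the sample sizes are illustrative assumptions.
set.seed(1)
for (n in c(5, 50, 500)) {
  reps <- replicate(10000, {
    x <- rnorm(n)
    c(s2 = var(x),                             # divides by n - 1 (unbiased)
      sigma2_hat = sum((x - mean(x))^2) / n)   # divides by n (biased, consistent)
  })
  cat("n =", n,
      " mean of s^2 =", round(mean(reps["s2", ]), 3),
      " mean of sigma_hat^2 =", round(mean(reps["sigma2_hat", ]), 3), "\n")
}
# s^2 averages to about 1 for every n; sigma_hat^2 is biased low but converges as n grows.
```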
Maximum Likelihood Estimation

Why?
MLE is a framework for designing consistent and efficient estimators under very general conditions.

Formulation

  • The likelihood function:
    $$L(X;\theta)=\prod_{i=1}^{n} f\left(X_{i};\theta\right), \qquad X_1,\ldots,X_n \ \text{i.i.d.} \sim f(x;\theta), \quad \theta \in E \text{ unknown}$$
  • MLE: for given data samples X = x,
    $$\hat{\theta}=\underset{\theta \in E}{\operatorname{argmax}}\ L(x;\theta), \qquad \text{i.e. } L(x;\hat{\theta}) = \max_{\theta \in E} L(x;\theta)$$
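As a sketch of how an MLE is actually computed (the exponential model and the simulated data are illustrative assumptions, not part of the notes):

```r
# MLE for the rate of an Exponential(theta) model, in closed form and numerically.
# The exponential model and the true rate 2 are illustrative assumptions.
set.seed(42)
x <- rexp(200, rate = 2)

theta_closed <- 1 / mean(x)   # closed form: solve d/dtheta log L = 0

loglik <- function(theta) sum(dexp(x, rate = theta, log = TRUE))   # log L(x; theta)
theta_numeric <- optimize(loglik, interval = c(1e-6, 100), maximum = TRUE)$maximum

c(closed_form = theta_closed, numerical = theta_numeric)   # both near the true rate 2
```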

Limitations:

  • Solving for the MLE, even numerically, can be very challenging.
  • The MLE does not guarantee good performance in finite samples.
Bayesian Estimation

With Bayesian estimation, we can easily update our estimate as samples are collected sequentially.

Formulation:

  • θ is treated as a random quantity taking values in E.
  • $f_0(\theta)$: the prior distribution of $\theta$.
  • $f_1(\theta)$: the posterior, i.e. the distribution of $\theta$ conditional on the data:
    $$f_{1}(\theta)= f(\theta \mid X)=\frac{L(x;\theta)\, f_{0}(\theta)}{\int_{E} L(x;u)\, f_{0}(u)\, du}$$
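A minimal sketch of the posterior formula above on a grid, with a Bernoulli likelihood and a uniform prior on E = (0, 1) (both are assumptions chosen for illustration):

```r
# Grid approximation of f1(theta) = L(x; theta) f0(theta) / integral of L(x; u) f0(u) du.
# Bernoulli(theta) data and a uniform prior on (0, 1) are illustrative assumptions.
set.seed(7)
x <- rbinom(30, size = 1, prob = 0.3)          # observed 0/1 samples

theta <- seq(0.001, 0.999, length.out = 999)   # grid over E = (0, 1)
d     <- theta[2] - theta[1]                   # grid spacing for the integral
prior <- rep(1, length(theta))                 # f0: uniform prior

lik       <- sapply(theta, function(t) prod(dbinom(x, 1, t)))   # L(x; theta)
posterior <- lik * prior / sum(lik * prior * d)                 # normalized f1(theta)

sum(theta * posterior * d)   # posterior mean as a point estimate of theta
```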

Sequential Bayesian Estimation
Intuitively, if more data Xn+1,…,Xn+m is available, we can take the previous posterior f1 as the new prior and update the belief again using the new data only:

$$f_{2}(\theta)=\frac{L(x_{n+1},\ldots,x_{n+m};\theta)\, f_{1}(\theta)}{\int_{E} L(x_{n+1},\ldots,x_{n+m};u)\, f_{1}(u)\, du}$$
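Continuing the sketch, the sequential update can be checked against the one-shot posterior. With a Beta prior on a Bernoulli parameter (a conjugate pair, as in the outline; the specific numbers are illustrative assumptions) the update stays in closed form:

```r
# Sequential vs. batch Bayesian updating for Bernoulli data with a Beta prior.
# The Beta(1, 1) prior and the simulated data are illustrative assumptions.
set.seed(7)
x_old <- rbinom(30, 1, 0.3)   # X_1, ..., X_n
x_new <- rbinom(20, 1, 0.3)   # X_{n+1}, ..., X_{n+m}

a0 <- 1; b0 <- 1                                   # prior f0 = Beta(1, 1)
a1 <- a0 + sum(x_old); b1 <- b0 + sum(1 - x_old)   # posterior f1 from the old data
a2 <- a1 + sum(x_new); b2 <- b1 + sum(1 - x_new)   # f2: f1 reused as the prior

a_all <- a0 + sum(c(x_old, x_new))                 # one-shot update with all data
b_all <- b0 + sum(1 - c(x_old, x_new))

c(sequential = a2 / (a2 + b2), batch = a_all / (a_all + b_all))   # identical posterior means
```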

Limitations:

  • Dependence on the prior, which can be any distribution on E: a very strong prior could lead to an inconsistent estimate.
    • In the information-based trade example, what will happen if we pick p0 = 1?
    • On the other hand, a weak prior could lead to slow convergence.
  • The computation of the posterior could be very costly when the parameter space E is large.

8. Confidence Interval

  • Three constructions of CI for i.i.d samples:
    • normal
    • t
    • bootstrap
  • When and how?

Central Limit Theorem

  • Theorem: $\{X_i\}$ is a sequence of i.i.d. samples of X with $E[X] = \mu$ and $Var(X) = \sigma^2$. Then,
    $$\frac{\sqrt{n}}{\sigma}\left(\overline{X}_{n}-\mu\right) \Rightarrow N(0,1)$$
  • Therefore, when n is "large", for any a > 0,
    $$P\left(\left|\frac{\sqrt{n}}{\sigma}\left(\overline{X}_{n}-\mu\right)\right|>a\right) \approx P(|Z|>a)$$
    where Z is a standard normal r.v.
Confidence Interval (z-distribution)
  • For any confidence level a, we simply choose $\phi$ such that $P(|Z|>\phi)=1-a$; then the level-a confidence interval is
    $$\left[\overline{X}_{n}-\phi \frac{\sigma}{\sqrt{n}},\ \overline{X}_{n}+\phi \frac{\sigma}{\sqrt{n}}\right]$$
  • A 95% CI means: if we repeated the sampling 100 times, roughly 95 of the resulting intervals would contain the true value and about 5 would not.
    The standard error (s.e.) of the sample mean is $\sigma_{\overline{x}}=\sigma/\sqrt{n}$.
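A short sketch of the z-interval and its coverage interpretation (the known σ, the normal population, and n = 100 are illustrative assumptions):

```r
# Build the 95% z-interval repeatedly and check how often it covers the true mean.
# mu = 5, sigma = 2 (known) and n = 100 are illustrative assumptions.
set.seed(3)
mu <- 5; sigma <- 2; n <- 100
phi <- qnorm(0.975)               # P(|Z| > phi) = 0.05

covered <- replicate(1000, {
  x  <- rnorm(n, mu, sigma)
  se <- sigma / sqrt(n)           # standard error of the sample mean
  mean(x) - phi * se <= mu && mu <= mean(x) + phi * se
})
mean(covered)                     # close to 0.95, matching the interpretation above
```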

The Effect of Sample Size

  • The magnitude of the estimation error, measured by the half-length of the CI, is
    $$\phi \frac{\sigma}{\sqrt{n}}$$
  • In order to have estimation error ≈ ε, we need a sample size of
    $$n \approx \frac{\phi^{2} \sigma^{2}}{\varepsilon^{2}}$$
    Intuitively, to improve the estimation accuracy by a factor of 10, we need to enlarge the sample size by a factor of 100.
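A one-line check of the sample-size formula (σ = 2, ε = 0.1, and 95% confidence are illustrative numbers):

```r
# Sample size so that the half-length of the 95% CI is roughly epsilon.
sigma <- 2; eps <- 0.1
phi <- qnorm(0.975)
ceiling(phi^2 * sigma^2 / eps^2)   # shrinking eps by 10x multiplies n by 100x
```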
CI for Small Samples
  • Theorem (CI via the t-distribution):
    If $X_1, X_2, \ldots, X_n$ are i.i.d. samples of a normal distribution $N(\mu,\sigma^2)$, then
    $$\frac{\sqrt{n}}{s}\left(\overline{X}_{n}-\mu\right) \sim t(n-1),$$
    a t-distribution with n − 1 degrees of freedom.
  • Remark:
    • The t-distribution is more dispersed than the normal.
    • When n → ∞, t(n − 1) ⇒ N(0, 1).
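A small-sample sketch of the t-interval (normal data with n = 12 is an illustrative assumption); the built-in `t.test` reproduces the same interval:

```r
# t-based CI for a small normal sample: s replaces the unknown sigma,
# and the t quantile replaces the normal quantile. n = 12 is an illustrative assumption.
set.seed(11)
x <- rnorm(12, mean = 5, sd = 2)
n <- length(x)

t_crit <- qt(0.975, df = n - 1)                # wider than qnorm(0.975)
mean(x) + c(-1, 1) * t_crit * sd(x) / sqrt(n)  # 95% t-interval by hand

t.test(x, conf.level = 0.95)$conf.int          # same interval from the built-in test
```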
Bootstrap
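The notes leave this heading empty; a minimal sketch of one common construction, the percentile bootstrap CI for the mean (exponential data and B = 5000 resamples are illustrative assumptions):

```r
# Percentile bootstrap: resample the data with replacement, recompute the statistic,
# and take empirical quantiles. Exponential data and B = 5000 are illustrative assumptions.
set.seed(21)
x <- rexp(40, rate = 1)
B <- 5000
boot_means <- replicate(B, mean(sample(x, replace = TRUE)))
quantile(boot_means, c(0.025, 0.975))   # 95% percentile bootstrap CI for the mean
```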

9. Significance Test

  • Formulation of general hypothesis test
    • Parameter space
    • Hypothesis / Alternative
    • Hypothesis testing
  • Significance test
    • 5 steps
    • What is the intuition
    • How to choose the hypothesis and alternative
    • How to interpret the p-value
    • Type I and II errors
Steps of a Significance Test
  1. Assumptions: underlying probability model for population
  2. Hypothesis: Formulate the statement or prediction in your research problem into a statement about the population parameter.
  3. Test Statistic: the test statistic measures how "far" the point estimate of the parameter is from its null-hypothesis value, assuming the null hypothesis is true.
  4. P-Value: the tail probability beyond the observed value of the test statistic, presuming the null hypothesis is true; it measures how implausible the observed result would be under H0.
  5. Conclusion: Report and interpret the p-value in the context of the study. Make a decision about H0 based on p-value.
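A worked sketch of the five steps for a one-sample proportion test (the null value π0 = 0.5 and the simulated data are illustrative assumptions):

```r
# The five steps of a significance test, for a population proportion.
# H0: pi = 0.5 and the simulated 0/1 data are illustrative assumptions.
set.seed(5)
x <- rbinom(200, 1, 0.56)     # Step 1 (assumptions): i.i.d. 0/1 responses, n large
pi0 <- 0.5                    # Step 2 (hypothesis): H0: pi = pi0 vs Ha: pi != pi0

p_hat <- mean(x)
se0   <- sqrt(pi0 * (1 - pi0) / length(x))
z     <- (p_hat - pi0) / se0  # Step 3 (test statistic): distance from H0 in s.e. units

p_value <- 2 * pnorm(-abs(z)) # Step 4 (p-value): two-sided tail probability under H0
c(z = z, p_value = p_value)   # Step 5 (conclusion): reject H0 if p-value < alpha, e.g. 0.05
```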
Type I & Type II Errors & Interpreting the P-Value
  • Type I error: rejecting H0 when it is true. Type II error: failing to reject H0 when it is false.
  • A small p-value means the observed data would be unlikely if H0 were true; it is not the probability that H0 is true.
Inference on Single Variables
Population proportion
  • z-test
  • Difference from CI
  • Small sample: binomial test

Population mean
  • t-test
  • Relation with CI
  • Small sample: bootstrap
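Hedged sketches of the corresponding base-R tests (the simulated data and null values are illustrative assumptions):

```r
# Single-variable inference with base-R tests; all data and null values are illustrative.
set.seed(8)
y <- rbinom(150, 1, 0.6)            # 0/1 responses
w <- rnorm(25, mean = 10, sd = 3)   # a small continuous sample

prop.test(sum(y), length(y), p = 0.5)    # large-sample (z-type) test for a proportion
binom.test(sum(y), length(y), p = 0.5)   # exact binomial test for small samples
t.test(w, mu = 9)                        # t-test for a population mean
```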
Inference on Two Variables
  • Independent samples
    • Population proportion: z-test
    • Population mean: t-test
    • Small sample: permutation test
  • Paired data: t-test for a single variable (applied to the within-pair differences)
  • z statistic for the difference of two proportions (the denominator is its standard error):
    $$z=\frac{\left(p_{1}-p_{2}\right)-\left(\pi_{1}-\pi_{2}\right)}{\sqrt{\frac{p_{1}\left(1-p_{1}\right)}{n_{1}}+\frac{p_{2}\left(1-p_{2}\right)}{n_{2}}}}$$

  • Standard error for the difference of means: deriving it is generally not required; it is given directly.

  • Concluding with a CI:
    Given our estimate of the standard error for the estimated mean or proportion difference, we can construct the confidence interval for the mean or proportion difference:
    $$\left[(\overline{x}-\overline{y})-\phi_{\alpha}\, se,\ (\overline{x}-\overline{y})+\phi_{\alpha}\, se\right]$$
    The coefficient $\phi_{\alpha}$ is determined by α and the model assumptions (normal distribution for proportions, t-distribution for means).
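A sketch implementing the z statistic and the interval above for two independent proportions (the counts and group sizes are illustrative assumptions):

```r
# Two independent proportions: z statistic and 95% CI for pi1 - pi2, following the
# formulas above. The success counts and group sizes are illustrative assumptions.
x1 <- 60; n1 <- 200   # successes / sample size, group 1
x2 <- 45; n2 <- 180   # successes / sample size, group 2
p1 <- x1 / n1; p2 <- x2 / n2

se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # standard error of p1 - p2
z  <- (p1 - p2) / se                                  # H0: pi1 - pi2 = 0
p_value <- 2 * pnorm(-abs(z))

phi <- qnorm(0.975)
ci  <- (p1 - p2) + c(-1, 1) * phi * se                # 95% CI for pi1 - pi2
list(z = z, p_value = p_value, ci = ci)
```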

Permutation Test

Tests whether the two populations are the same, i.e. whether the two samples follow the same distribution.
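A minimal permutation-test sketch (the two simulated groups and 5000 permutations are illustrative assumptions):

```r
# Permutation test: under H0 both groups follow the same distribution, so group labels
# can be reshuffled freely. Simulated data and 5000 permutations are illustrative assumptions.
set.seed(13)
g1 <- rnorm(12, mean = 0.0)
g2 <- rnorm(10, mean = 0.8)
obs_diff <- mean(g1) - mean(g2)

pooled <- c(g1, g2)
perm_diff <- replicate(5000, {
  idx <- sample(length(pooled), length(g1))   # randomly reassign the labels
  mean(pooled[idx]) - mean(pooled[-idx])
})
mean(abs(perm_diff) >= abs(obs_diff))         # two-sided permutation p-value
```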

Paired data

10. Multiple Regression

  • Assumptions
  • Interpretation of estimation results
  • Inference methods:
    • t-test for single coefficient
    • F-test for nested models
  • Residual analysis
Assumptions (linear regression model)

$$y_{i}=\beta_{0}+\sum_{k=1}^{p} \beta_{k}\, g_{k}\left(x_{ik}\right)+\varepsilon_{i}$$

where the functions $g_k$ are known. In addition, we assume the following conditions on the $\varepsilon_i$:

  • Independence: the $\varepsilon_i$ are independent.
  • Zero mean: $E[\varepsilon \mid x] = 0$ for all possible values of $x = (x_1, \ldots, x_m)$.
  • Equal variance: $Var(\varepsilon \mid x) = \sigma^2$.
  • Normality: the $\varepsilon_i$ are normal conditional on x.
T-test & F-test
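The notes leave this heading bare; a minimal sketch with simulated data (an assumption), using `summary()` for single-coefficient t-tests and `anova()` for the nested-model F-test:

```r
# t-tests for single coefficients and an F-test for nested models.
# The simulated predictors and coefficients are illustrative assumptions.
set.seed(17)
n  <- 120
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- 1 + 2 * x1 + 0.5 * x2 + rnorm(n)

full    <- lm(y ~ x1 + x2)
reduced <- lm(y ~ x1)    # nested model: drops x2

summary(full)            # t-test per coefficient, H0: beta_k = 0
anova(reduced, full)     # F-test: does adding x2 significantly improve the fit?
```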
Residual analysis
  • DW test: checks whether the residuals are independent; the null hypothesis is that the residuals are uncorrelated.
  • JB test: checks whether the residuals are normally distributed; the null hypothesis is that they are normal.
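A sketch of the two residual checks above; `lmtest::dwtest` and `tseries::jarque.bera.test` are one possible choice of packages (an assumption; the notes only name the tests):

```r
# Residual analysis: DW test for independence, JB test for normality of the residuals.
# The lmtest/tseries packages and the simulated data are assumptions for illustration.
library(lmtest)    # dwtest()
library(tseries)   # jarque.bera.test()

set.seed(17)
n  <- 120
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- 1 + 2 * x1 + 0.5 * x2 + rnorm(n)
fit <- lm(y ~ x1 + x2)

dwtest(fit)                        # H0: residuals are independent (uncorrelated)
jarque.bera.test(residuals(fit))   # H0: residuals are normally distributed
```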
Assumptions (logistic regression)

The R code below computes and plots the ROC curve of a fitted logistic model; it assumes a data frame `data` with a column `prob` of predicted probabilities and a column `obs` of 0/1 observed labels.

n = nrow(data)
tpr = fpr = rep(0, n)
# compute TPR and FPR for each threshold
for (i in 1:n)
{
  threshold = data$prob[i]
  tp = sum(data$prob >  threshold & data$obs == 1)   # true positives
  fp = sum(data$prob >  threshold & data$obs == 0)   # false positives
  tn = sum(data$prob <= threshold & data$obs == 0)   # true negatives
  fn = sum(data$prob <= threshold & data$obs == 1)   # false negatives
  tpr[i] = tp / (tp + fn)   # true positive rate = TP / (TP + FN)
  fpr[i] = fp / (fp + tn)   # false positive rate = FP / (FP + TN)
}
# plot ROC
plot(fpr, tpr, type = 'l', ylim = c(0, 1), xlim = c(0, 1), main = 'ROC')
abline(a = 0, b = 1)   # reference line for a random classifier