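Statistical Inference in Quantile Regression

Asymptotic Normality of the Classical Quantile Regression Estimator

This article uses the Bahadur representation to walk through a proof of the asymptotic normality of the quantile regression estimator. The proof shows how the Bahadur representation is applied and makes the statistical properties of the estimator explicit.

Preliminaries and Notation

1. Basic idea of the Bahadur representation

The Bahadur representation was introduced by the Indian statistician R. R. Bahadur in 1966 and has become a standard analytical tool in the asymptotic theory of statistics. It was first used to describe the asymptotic behavior of sample quantiles and was later extended to more complex statistical models.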

For a simple sample quantile, the Bahadur representation writes the estimator as a linear statistic plus a higher-order remainder. Specifically, for an i.i.d. sample $X_1, X_2, \ldots, X_n$ with $\tau$-quantile estimator $\hat{q}_n(\tau)$, the Bahadur representation is:

$$\hat{q}_n(\tau) - q(\tau) = \frac{1}{nf(q(\tau))}\sum_{i=1}^n[\tau - I(X_i \leq q(\tau))] + R_n$$

where:

  • $q(\tau)$ is the true $\tau$-quantile of the population distribution
  • $f(\cdot)$ is the probability density function
  • $I(\cdot)$ is the indicator function
  • $R_n$ is the remainder, which satisfies $R_n = o_p(n^{-1/2})$
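The representation can be checked numerically. The following is a minimal sketch, assuming Uniform(0,1) data so that $q(\tau) = \tau$ and $f(q(\tau)) = 1$; the function name `bahadur_remainder`, the seed, and the sample size are illustrative choices, not part of the original argument.

```python
import math
import random

def bahadur_remainder(n, tau=0.5, seed=0):
    """Return (estimation error, linear term, remainder R_n) for Uniform(0,1) data."""
    rng = random.Random(seed)
    xs = sorted(rng.random() for _ in range(n))
    q_true = tau      # tau-quantile of Uniform(0,1)
    density = 1.0     # f(q(tau)) for Uniform(0,1)
    # Sample quantile: the ceil(n * tau)-th order statistic.
    q_hat = xs[math.ceil(n * tau) - 1]
    # Linear term: average of tau - I(X_i <= q(tau)), divided by f(q(tau)).
    linear = sum(tau - (x <= q_true) for x in xs) / (n * density)
    error = q_hat - q_true
    return error, linear, error - linear

n = 10_000
error, linear, remainder = bahadur_remainder(n)
print(f"error={error:.5f}  linear={linear:.5f}  sqrt(n)*R_n={math.sqrt(n) * remainder:.4f}")
```

In this sketch the estimation error and the linear term should nearly coincide, and $\sqrt{n}\,R_n$ should be small compared with $\sqrt{n}(\hat{q}_n(\tau) - q(\tau))$.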

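2. Technical conditions

The Bahadur representation holds under some technical conditions:

  1. Density condition: near the true quantile $q(\tau)$, the density $f$ exists and is strictly positive ($f(q(\tau)) > 0$)
  2. Smoothness condition: the density is Hölder continuous near the quantile
  3. Moment condition: the random variables have finite higher-order moments

For quantile regression, additional conditions are needed:

  1. The design matrix (covariates) satisfies certain regularity conditions
  2. The conditional density satisfies a uniform smoothness condition
  3. The parameter space is compact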

3. Notation

Now consider the classical linear quantile regression model:
$$Q_{Y|X}(\tau|X) = X'\beta(\tau)$$

The quantile regression estimator $\hat{\beta}(\tau)$ is obtained by minimizing the following objective function:
$$\min_{\beta} \sum_{i=1}^n \rho_\tau(Y_i - X_i'\beta)$$

where $\rho_\tau(u) = u(\tau - I(u < 0))$ is the quantile check function (check loss function).
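To make the check function concrete, here is a small sketch for the intercept-only case ($X_i = 1$), where minimizing the objective returns a sample quantile. The helper names `check_loss` and `total_loss` and the toy data are illustrative assumptions.

```python
def check_loss(u, tau):
    """rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def total_loss(q, ys, tau):
    """Objective sum_i rho_tau(y_i - q) for the intercept-only model."""
    return sum(check_loss(y - q, tau) for y in ys)

ys = [2.0, 7.0, 1.0, 9.0, 4.0]
tau = 0.5
# The objective is convex and piecewise linear in q with kinks at the data
# points, so a minimizer can always be found among the observations.
best = min(ys, key=lambda q: total_loss(q, ys, tau))
print(best)  # 4.0, the sample median
```

Positive residuals are weighted by $\tau$ and negative residuals by $1-\tau$, which is what tilts the minimizer from the median toward other quantiles when $\tau \neq 0.5$.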

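Proof Steps

Step 1: Establish the first-order condition

We first analyze the subgradient of the objective function. Up to sign (differentiating with respect to $\beta$ introduces a factor of $-1$, which does not affect the zero condition), it is
$$\partial \sum_{i=1}^n \rho_\tau(Y_i - X_i'\beta) = \sum_{i=1}^n X_i \cdot \psi_\tau(Y_i - X_i'\beta)$$

where $\psi_\tau(u) = \tau - I(u < 0)$ is the derivative of the check function (except at zero).

At the optimum $\hat{\beta}(\tau)$, the subdifferential must contain the zero vector. Because the check function is not differentiable at zero, the sum below is in general only approximately zero in finite samples; when $Y$ is continuously distributed it is $o_p(\sqrt{n})$ under suitable moment conditions on $X_i$, which is all the argument needs. With that understanding we write:
$$\sum_{i=1}^n X_i \cdot \psi_\tau(Y_i - X_i'\hat{\beta}(\tau)) = 0$$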
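A small numerical illustration of the first-order condition, again for the intercept-only case ($X_i = 1$) with toy data chosen for this sketch. In a finite sample the score sum is typically not exactly zero; what holds is that zero lies in the interval obtained by letting each zero residual contribute any value in $[\tau - 1, \tau]$.

```python
def psi(u, tau):
    """psi_tau(u) = tau - I(u < 0), the derivative of the check function away from 0."""
    return tau - (1.0 if u < 0 else 0.0)

ys = [2.0, 7.0, 1.0, 9.0, 4.0]
tau, q_hat = 0.5, 4.0   # q_hat is the sample median of ys

# Score evaluated with the convention psi(0) = tau.
score = sum(psi(y - q_hat, tau) for y in ys)

# Zero residuals may contribute anything in [tau - 1, tau] to the subgradient.
nonzero = sum(psi(y - q_hat, tau) for y in ys if y != q_hat)
zeros = sum(1 for y in ys if y == q_hat)
lower, upper = nonzero + zeros * (tau - 1), nonzero + zeros * tau
print(score, (lower, upper))  # 0.5 (-0.5, 0.5)
```

The score is bounded by the number of zero residuals, which stays fixed as $n$ grows, so it vanishes after dividing by $\sqrt{n}$.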

Step 2: Apply a Taylor expansion

Define the function:
$$Z_n(\beta) = \frac{1}{n}\sum_{i=1}^n X_i \cdot \psi_\tau(Y_i - X_i'\beta)$$

By the first-order condition, $Z_n(\hat{\beta}(\tau)) = 0$, up to a term that is negligible at the $n^{-1/2}$ scale.

Expand $Z_n(\beta)$ around the true parameter $\beta(\tau)$:
$$Z_n(\hat{\beta}(\tau)) = Z_n(\beta(\tau)) + \frac{\partial Z_n(\beta)}{\partial \beta'}\bigg|_{\beta = \tilde{\beta}} \cdot (\hat{\beta}(\tau) - \beta(\tau))$$

where $\tilde{\beta}$ lies between $\hat{\beta}(\tau)$ and $\beta(\tau)$. This expansion is heuristic: $Z_n$ is a step function of $\beta$, so the derivative is to be read as that of its smoothed (conditional-expectation) version, computed in the next step.

Step 3: Compute the derivative matrix

We need the derivative of $Z_n(\beta)$ with respect to $\beta$. Since the indicator in $\psi_\tau$ is not differentiable, we differentiate its conditional expectation given $X_i$, namely $E[\psi_\tau(Y_i - X_i'\beta)|X_i] = \tau - F_{Y|X}(X_i'\beta|X_i)$, where $F_{Y|X}$ is the conditional distribution function. This gives:

$$\frac{\partial Z_n(\beta)}{\partial \beta'} = -\frac{1}{n}\sum_{i=1}^n X_i X_i' \cdot f_{Y|X}(X_i'\beta|X_i)$$

where $f_{Y|X}(\cdot|X_i)$ is the conditional density of $Y$ given $X_i$.

To simplify notation, define:
$$D_n(\beta) = -\frac{\partial Z_n(\beta)}{\partial \beta'} = \frac{1}{n}\sum_{i=1}^n X_i X_i' \cdot f_{Y|X}(X_i'\beta|X_i)$$

As the sample size $n \to \infty$, the law of large numbers implies that $D_n(\beta(\tau))$ converges in probability to:
$$D(\tau) = E[X X' \cdot f_{Y|X}(X'\beta(\tau)|X)]$$
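The derivative formula can be verified numerically for the expected score, which is smooth even though the sample score is a step function. The sketch below assumes a hypothetical scalar model $Y = \beta_0 X + e$ with $e \sim N(0,1)$ and $X \sim \text{Uniform}(1,2)$ at $\tau = 0.5$, so that $D(\tau) = \phi(0)\,E[X^2]$ with $E[X^2] = 7/3$; none of these choices come from the original text.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def expected_score(b, beta0=1.0, tau=0.5, grid=2000):
    """E[X * (tau - P(Y < X*b | X))] for Y = beta0*X + e, e ~ N(0,1), X ~ Uniform(1,2).

    The expectation over X is computed with a midpoint rule on (1, 2).
    """
    total = 0.0
    for k in range(grid):
        x = 1.0 + (k + 0.5) / grid
        total += x * (tau - STD_NORMAL.cdf(x * (b - beta0)))
    return total / grid

h = 1e-5
# Central finite difference of the expected score at the true parameter.
slope = (expected_score(1.0 + h) - expected_score(1.0 - h)) / (2 * h)
# D(tau) = f_e(0) * E[X^2], with E[X^2] = 7/3 for Uniform(1, 2).
d_tau = STD_NORMAL.pdf(0.0) * 7.0 / 3.0
print(f"finite-difference slope={slope:.6f}  -D(tau)={-d_tau:.6f}")
```

The two printed numbers should agree to several decimal places, matching the sign convention $D_n(\beta) = -\partial Z_n(\beta)/\partial\beta'$.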

Step 4: Establish the Bahadur representation

Combine the Taylor expansion with the first-order condition:
$$0 = Z_n(\hat{\beta}(\tau)) = Z_n(\beta(\tau)) - D_n(\tilde{\beta}) \cdot (\hat{\beta}(\tau) - \beta(\tau))$$

Rearranging this equation:
$$\hat{\beta}(\tau) - \beta(\tau) = [D_n(\tilde{\beta})]^{-1} \cdot Z_n(\beta(\tau))$$

This can be rewritten as:
$$\hat{\beta}(\tau) - \beta(\tau) = [D(\tau)]^{-1} \cdot Z_n(\beta(\tau)) + R_n$$

with remainder:
$$R_n = \{[D_n(\tilde{\beta})]^{-1} - [D(\tau)]^{-1}\} \cdot Z_n(\beta(\tau))$$

Under suitable regularity conditions (for example, smoothness and moment conditions on the design matrix and the conditional density), one can show that $R_n = o_p(n^{-1/2})$: $Z_n(\beta(\tau))$ is an average of mean-zero terms (verified in Step 5) and hence $O_p(n^{-1/2})$, while consistency of $\hat{\beta}(\tau)$ and continuity of the conditional density give $[D_n(\tilde{\beta})]^{-1} - [D(\tau)]^{-1} = o_p(1)$.

We therefore obtain the Bahadur representation:
$$\sqrt{n}(\hat{\beta}(\tau) - \beta(\tau)) = [D(\tau)]^{-1} \cdot \frac{1}{\sqrt{n}}\sum_{i=1}^n X_i \cdot \psi_\tau(Y_i - X_i'\beta(\tau)) + o_p(1)$$
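The representation can be checked in a simulation. The sketch below assumes a hypothetical scalar model without intercept, $Y = \beta_0 X + e$ with $e \sim N(0,1)$, $X \sim \text{Uniform}(1,2)$ and $\tau = 0.5$. For positive $X_i$ the minimizer has a closed form as a weighted quantile of the ratios $Y_i/X_i$, which avoids a linear-programming solver; the function name `quantile_slope_through_origin` is an illustrative helper, not a library routine.

```python
import math
import random
from statistics import NormalDist

def quantile_slope_through_origin(xs, ys, tau):
    """Minimize sum_i rho_tau(y_i - x_i * b) over a scalar b, assuming every x_i > 0.

    For x > 0, rho_tau(y - x*b) = x * rho_tau(y/x - b), so the minimizer is a
    weighted tau-quantile of the ratios y_i / x_i with weights x_i.
    """
    pairs = sorted((y / x, x) for x, y in zip(xs, ys))
    target = tau * sum(xs)
    cumulative = 0.0
    for ratio, weight in pairs:
        cumulative += weight
        if cumulative >= target:
            return ratio
    return pairs[-1][0]

rng = random.Random(1)
n, beta0, tau = 20_000, 1.0, 0.5
xs = [rng.uniform(1.0, 2.0) for _ in range(n)]
ys = [beta0 * x + rng.gauss(0.0, 1.0) for x in xs]  # conditional median of Y is beta0 * x

beta_hat = quantile_slope_through_origin(xs, ys, tau)

# D(tau) = E[X^2 * f_{Y|X}(X*beta0 | X)] = phi(0) * E[X^2], E[X^2] = 7/3 for Uniform(1, 2).
d_tau = NormalDist().pdf(0.0) * 7.0 / 3.0
score = sum(x * (tau - (y - x * beta0 < 0)) for x, y in zip(xs, ys)) / math.sqrt(n)

lhs = math.sqrt(n) * (beta_hat - beta0)  # scaled estimation error
rhs = score / d_tau                      # linear term of the Bahadur representation
print(f"lhs={lhs:.3f}  rhs={rhs:.3f}  difference={lhs - rhs:.3f}")
```

In this setup the difference between the two sides should be small relative to either side, and it should shrink further as `n` grows.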

Step 5: Apply the central limit theorem

Next, we study the random term in the representation:
$$\frac{1}{\sqrt{n}}\sum_{i=1}^n X_i \cdot \psi_\tau(Y_i - X_i'\beta(\tau))$$

We can show that:

  1. The summands are i.i.d. random vectors (because the original observations are i.i.d.)

  2. Their expectation is zero:
    $$E[X_i \cdot \psi_\tau(Y_i - X_i'\beta(\tau))] = E[X_i \cdot E[\psi_\tau(Y_i - X_i'\beta(\tau))|X_i]]$$

    Since $X_i'\beta(\tau)$ is the conditional $\tau$-quantile, $P(Y_i \leq X_i'\beta(\tau)|X_i) = \tau$ (the conditional distribution is continuous, so $<$ and $\leq$ give the same probability), which implies:
    $$E[\psi_\tau(Y_i - X_i'\beta(\tau))|X_i] = \tau - P(Y_i \leq X_i'\beta(\tau)|X_i) = \tau - \tau = 0$$

    Hence $E[X_i \cdot \psi_\tau(Y_i - X_i'\beta(\tau))] = 0$

  3. They have finite second moments (under suitable moment conditions)

By the multivariate central limit theorem:
$$\frac{1}{\sqrt{n}}\sum_{i=1}^n X_i \cdot \psi_\tau(Y_i - X_i'\beta(\tau)) \xrightarrow{d} N(0, J(\tau))$$

where the covariance matrix is $J(\tau) = E[X_i X_i' \cdot \psi_\tau(Y_i - X_i'\beta(\tau))^2]$.
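The moment claims behind the central limit theorem can be checked by simulation. This is a minimal sketch under assumed distributions (a scalar covariate $X \sim \text{Uniform}(1,2)$ and a residual built from an $N(0,1)$ error shifted so that $P(u < 0 | X) = \tau$); the seed and sample size are arbitrary.

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(2)
tau, n = 0.25, 100_000
q_tau = NormalDist().inv_cdf(tau)       # tau-quantile of the N(0, 1) error

terms = []
for _ in range(n):
    x = rng.uniform(1.0, 2.0)           # scalar covariate
    u = rng.gauss(0.0, 1.0) - q_tau     # residual Y - X*beta(tau); P(u < 0 | X) = tau
    terms.append(x * (tau - (u < 0)))   # X_i * psi_tau(u_i)

mean = fmean(terms)
second_moment = fmean(t * t for t in terms)
# E[X^2 * psi^2] worked out by hand for this setup: tau(1-tau) * E[X^2], E[X^2] = 7/3.
j_tau = tau * (1 - tau) * 7.0 / 3.0
print(f"mean={mean:.4f}  second moment={second_moment:.4f}  J(tau)={j_tau:.4f}")
```

The sample mean should be close to zero and the sample second moment close to the hand-computed value of $J(\tau)$ for this design.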

Step 6: Derive the covariance matrix

Observe that:
$$\psi_\tau(u)^2 = (\tau - I(u < 0))^2 = \tau^2 \cdot I(u \geq 0) + (1-\tau)^2 \cdot I(u < 0)$$

Given $X_i$, we have:
$$P(Y_i - X_i'\beta(\tau) < 0|X_i) = \tau \quad \text{and} \quad P(Y_i - X_i'\beta(\tau) \geq 0|X_i) = 1-\tau$$

Therefore:
$$E[\psi_\tau(Y_i - X_i'\beta(\tau))^2|X_i] = \tau^2 \cdot (1-\tau) + (1-\tau)^2 \cdot \tau = \tau(1-\tau)$$

This gives:
$$J(\tau) = \tau(1-\tau) \cdot E[X_i X_i']$$
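The arithmetic behind $E[\psi_\tau^2 | X_i] = \tau(1-\tau)$ can be confirmed exactly with rational numbers. The function name below is an illustrative helper.

```python
from fractions import Fraction

def second_moment_of_score(tau):
    """E[psi_tau(u)^2 | X] when P(u < 0 | X) = tau and P(u >= 0 | X) = 1 - tau."""
    p_negative, p_nonnegative = tau, 1 - tau
    # psi^2 equals tau^2 on {u >= 0} and (1 - tau)^2 on {u < 0}.
    return tau**2 * p_nonnegative + (1 - tau) ** 2 * p_negative

for tau in (Fraction(1, 10), Fraction(1, 4), Fraction(1, 2)):
    assert second_moment_of_score(tau) == tau * (1 - tau)

print(second_moment_of_score(Fraction(1, 4)))  # 3/16
```

For $\tau = 1/4$ this is $\tfrac{1}{16}\cdot\tfrac{3}{4} + \tfrac{9}{16}\cdot\tfrac{1}{4} = \tfrac{3}{16}$, which equals $\tfrac{1}{4}\cdot\tfrac{3}{4}$.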

Step 7: Establish asymptotic normality

Combine the central limit theorem with the Bahadur representation:
$$\sqrt{n}(\hat{\beta}(\tau) - \beta(\tau)) = [D(\tau)]^{-1} \cdot \frac{1}{\sqrt{n}}\sum_{i=1}^n X_i \cdot \psi_\tau(Y_i - X_i'\beta(\tau)) + o_p(1)$$

By Slutsky's theorem, and because $D(\tau)$ is symmetric:
$$\sqrt{n}(\hat{\beta}(\tau) - \beta(\tau)) \xrightarrow{d} N(0, [D(\tau)]^{-1} \cdot J(\tau) \cdot [D(\tau)]^{-1})$$

Substituting $J(\tau) = \tau(1-\tau) \cdot E[X_i X_i']$:
$$\sqrt{n}(\hat{\beta}(\tau) - \beta(\tau)) \xrightarrow{d} N\left(0, \tau(1-\tau) \cdot [D(\tau)]^{-1} \cdot E[X_i X_i'] \cdot [D(\tau)]^{-1}\right)$$

where:
$$D(\tau) = E[X_i X_i' \cdot f_{Y|X}(X_i'\beta(\tau)|X_i)]$$
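A useful special case, worked out here as a sketch: suppose the conditional density at the quantile does not depend on the covariates, which holds for example when the errors are i.i.d. and independent of $X$. The symbol $f_\tau$ for this common value is introduced only for this illustration.

```latex
% Assumption: f_{Y|X}(X'\beta(\tau) | X) = f_\tau for every X
% (for example, i.i.d. errors independent of X).
D(\tau) = f_\tau \, E[X X']
\quad\Longrightarrow\quad
[D(\tau)]^{-1} \, J(\tau) \, [D(\tau)]^{-1}
  = \frac{\tau(1-\tau)}{f_\tau^{2}} \, \bigl(E[X X']\bigr)^{-1}
```

Under this assumption the sandwich collapses to a scalar multiple of $(E[XX'])^{-1}$, the same matrix that appears in the asymptotic variance of least squares.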

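Conclusion and Intuition

We have established the asymptotic normality of the quantile regression estimator $\hat{\beta}(\tau)$. The core of the proof is the Bahadur representation, which linearizes a nonlinear estimation problem so that the central limit theorem applies directly.

Intuitively, asymptotic normality means that for large samples the deviation of the quantile regression estimator from the true parameter (scaled by $\sqrt{n}$) is approximately normally distributed. The structure of the asymptotic variance shows:

  1. The variance is proportional to $\tau(1-\tau)$. This factor alone shrinks as $\tau$ approaches 0 or 1; precision at extreme quantiles typically still deteriorates, because the conditional density there is small and enters through $[D(\tau)]^{-1}$.

  2. The conditional density $f_{Y|X}$ enters through $[D(\tau)]^{-1}$, that is, in the denominator: the higher the conditional density near the quantile, the more precise the estimate.

  3. The second moment of the design matrix, $E[X_i X_i']$, affects precision, as in linear regression.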
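The interplay between $\tau(1-\tau)$ and the density can be seen numerically. The sketch below takes the intercept-only case ($X_i = 1$), where the asymptotic variance reduces to $\tau(1-\tau)/f(q(\tau))^2$, and assumes a standard normal population; the function name `asymptotic_sd` is illustrative.

```python
from statistics import NormalDist

def asymptotic_sd(tau, dist=NormalDist()):
    """sqrt(tau * (1 - tau)) / f(q(tau)): asymptotic sd of sqrt(n) * (q_hat - q)."""
    q = dist.inv_cdf(tau)
    return (tau * (1 - tau)) ** 0.5 / dist.pdf(q)

for tau in (0.5, 0.9, 0.99):
    print(tau, round(asymptotic_sd(tau), 3))
```

Although $\tau(1-\tau)$ decreases toward the tails, the density in the denominator decreases faster for the normal distribution, so the asymptotic standard deviation grows as $\tau$ moves toward 1.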

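The Bahadur representation is not only a tool for proving asymptotic normality; it also provides the theoretical basis for constructing confidence intervals and hypothesis tests in quantile regression.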

References

Bahadur, R. Raj. “A Note on Quantiles in Large Samples.” The Annals of Mathematical Statistics, vol. 37, no. 3, 1966, pp. 577-580.
Belloni, Alexandre, et al. “Conditional Quantile Processes Based on Series or Many Regressors.” arXiv, 2011, arXiv:1105.6154.
Chernozhukov, Victor, et al. “Inference on Counterfactual Distributions.” Econometrica, vol. 81, no. 6, 2013, pp. 2205-2268.
Gutenbrunner, Christian, and Jana Jurečková. “Regression Rank Scores and Regression Quantiles.” The Annals of Statistics, vol. 20, no. 1, 1992, pp. 305-330.
He, Xuming, and Qi-Man Shao. “A General Bahadur Representation of M-Estimators and Its Application to Linear Regression with Nonstochastic Designs.” The Annals of Statistics, vol. 24, no. 6, 1996, pp. 2608-2630.
Jurečková, Jana. “Asymptotic Relations of M-Estimates and R-Estimates in Linear Regression Model.” The Annals of Statistics, vol. 5, no. 3, 1977, pp. 464-472.
Kato, Kengo. “Asymptotic Normality of Powell’s Kernel Estimator.” Annals of the Institute of Statistical Mathematics, vol. 64, no. 2, 2012, pp. 255-273.
Knight, Keith. “Limiting Distributions for L1 Regression Estimators Under General Conditions.” The Annals of Statistics, vol. 26, no. 2, 1998, pp. 755-770.
Koenker, Roger. Quantile Regression. Cambridge University Press, 2005.
Koenker, Roger, and Gilbert Bassett. “Regression Quantiles.” Econometrica, vol. 46, no. 1, 1978, pp. 33-50.
Koenker, Roger, and Stephen Portnoy. “L-Estimation for Linear Models.” Journal of the American Statistical Association, vol. 82, no. 399, 1987, pp. 851-857.
Pollard, David. “Asymptotics for Least Absolute Deviation Regression Estimators.” Econometric Theory, vol. 7, no. 2, 1991, pp. 186-199.
Portnoy, Stephen. “Asymptotic Behavior of Regression Quantiles in Non-Stationary, Dependent Cases.” Journal of Multivariate Analysis, vol. 38, no. 1, 1991, pp. 100-113.
van der Vaart, Aad W., and Jon A. Wellner. Weak Convergence and Empirical Processes. Springer, 1996.
Welsh, Alan H. “On M-Processes and M-Estimation.” The Annals of Statistics, vol. 17, no. 1, 1989, pp. 337-361.
