A Brief Look at Generalized Linear Models (with Linear Regression and Logistic Regression as Examples)

Article link: https://blog.youkuaiyun.com/bunuli12138/article/details/104806460

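This article introduces generalized linear models (GLMs), covering their general form, key concepts, and exponential family distributions. A GLM maps the linear prediction to the expected output through a link function and suits data whose distribution belongs to the exponential family. The article discusses the assumptions and regularization methods of linear regression in detail, then introduces logistic regression, with emphasis on the role of the sigmoid function in binary classification. It also covers multi-class strategies and solving convex problems.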


General Form

Introduction

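A generalized linear model (GLM) uses a link function to connect the linear predictor (the direct output of the linear model) to the expected value of the response [1][2]. One identifies the exponential family distribution that the dependent variable follows, chooses a suitable link function, and thereby maps the output of the linear model to the expected output. The loss function is usually derived from the likelihood function of that exponential family distribution [1][2][3]; in other words, the parameters are fitted by maximum likelihood, with the negative log-likelihood serving as the loss.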
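$\begin{aligned} LM:\; &y=W^\top X+\epsilon,\quad \hat{y}=W^\top X\\ GLM:\; &\hat{y}=E[y\mid X]=g^{-1}(W^\top X)\\ \end{aligned}$
Here $g$ is the link function, so that $g(\hat{y})=W^\top X$; the noise term $\epsilon$ belongs to the observation $y$, not to the prediction $\hat{y}$.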
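As a minimal sketch of the two formulas above, the snippet below computes a linear prediction and then passes it through the function that maps the linear output to the expected value (the inverse of the link function in the usual GLM convention). The names `linear_predictor`, `glm_predict`, and `mean_fn`, the weights, and the choice of the sigmoid as the mapping are illustrative assumptions, not something taken from the cited references.

```python
import math

def linear_predictor(w, x, b=0.0):
    """Linear model output: eta = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def sigmoid(eta):
    """Logistic function, mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def glm_predict(w, x, b=0.0, mean_fn=sigmoid):
    """GLM prediction: apply the mean function to the linear predictor."""
    return mean_fn(linear_predictor(w, x, b))

w = [0.5, -1.0]   # hypothetical weights
x = [2.0, 1.0]    # hypothetical feature vector

eta = linear_predictor(w, x)                          # plain linear model
mu_identity = glm_predict(w, x, mean_fn=lambda e: e)  # identity mapping: linear regression
mu_logistic = glm_predict(w, x)                       # sigmoid mapping: logistic regression
print(eta, mu_identity, mu_logistic)
```

With the identity mapping the GLM reduces to ordinary linear regression; swapping in the sigmoid gives the logistic regression case without touching the linear part.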
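A binary classifier can be extended to multi-class problems with strategies such as one-versus-one (OvO), one-versus-rest (OvR), or the softmax function (normalized exponential function) [1][4].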
$softmax:\;S_i=\frac{\exp{z_i}}{\sum_{j=1}^n\exp{z_j}}$
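The softmax formula translates almost directly into code. The sketch below is one possible implementation using only the standard library; subtracting the maximum score before exponentiating is a common numerical-stability trick that is not mentioned in the text, and the input scores are made up for illustration.

```python
import math

def softmax(z):
    """Softmax S_i = exp(z_i) / sum_j exp(z_j).

    Subtracting max(z) leaves the result unchanged but avoids overflow
    in math.exp for large scores.
    """
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

scores = [1.0, 2.0, 3.0]   # hypothetical class scores z_i
probs = softmax(scores)
print(probs)               # three probabilities that sum to 1
```

The largest score always receives the largest probability, so taking the index of the maximum entry of `probs` gives the predicted class.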
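Because a GLM relies on a link function, it assumes that the data follow an exponential family distribution. Preprocessing that brings the data closer to such a distribution can therefore help improve the model score [4].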

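Compared with the essence of maximum likelihood estimation (MLE), a GLM makes the same kind of move: both assume a distribution for the data and both produce a point estimate of the parameters (point-wise estimate), hence [10]: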
$\begin{aligned} {\color{gray}{posterior}} &= {\color{red}{likelihood}} * {\color{blue}{prior}} / {\color{green}{evidence}}\\ Bayes: {\color{gray}{P(\alpha\mid y)}} &= {\color{red}{P(y\mid\alpha)}} * {\color{blue}{P(\alpha)}} / {\color{green}{P(y)}}\\ &={\color{red}{L(\alpha\mid y)}} * {\color{blue}{P(\alpha)}} / {\color{green}{P(y)}}\\ {\color{red}{L(\alpha\mid y)}} &=\prod_i p(y_i\mid\alpha)\\ MLE: \hat{\alpha} &=\arg\max_\alpha{\sum_i\ln{p(y_i\mid\alpha)}} \end{aligned}$
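To make the maximization of $\sum_i\ln p(y_i\mid\alpha)$ concrete, here is a small sketch for one specific case: a Gaussian with known standard deviation, where the parameter $\alpha$ is the mean. The data, the grid, and the function name `gaussian_log_likelihood` are assumptions chosen for illustration. For this case the maximizer is known in closed form (the sample mean), and the grid search is only a sanity check.

```python
import math

def gaussian_log_likelihood(mu, ys, sigma=1.0):
    """Sum of ln p(y_i | mu) for a Gaussian with known sigma."""
    const = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return sum(const - (y - mu) ** 2 / (2.0 * sigma ** 2) for y in ys)

ys = [1.0, 2.0, 4.0, 5.0]       # hypothetical observations
mu_hat = sum(ys) / len(ys)      # closed-form MLE of the mean: the sample mean

# Coarse grid search over candidate means in [0, 6] with step 0.01.
grid = [i / 100.0 for i in range(0, 601)]
mu_grid = max(grid, key=lambda m: gaussian_log_likelihood(m, ys))
print(mu_hat, mu_grid)
```

Both routes select the same value of the mean, which is the point estimate that MLE returns.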

Key Concepts

Properties: parametric model, supervised learning, discriminative model, compatible with kernel methods, and a frequent interview topic.

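Solving: the empirical risk is computed with the likelihood-based loss of the exponential family distribution, and adding a regularization term gives the structural risk. Once the objective has been shown to be convex, one can (1) apply convex optimization, or (2) use an iterative method such as stochastic gradient descent (SGD) or Newton's method.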
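The sketch below illustrates this recipe for logistic regression: the empirical risk is the mean negative log-likelihood, an L2 penalty turns it into a structural risk, and the minimum is found iteratively. For simplicity it uses plain batch gradient descent, not SGD or Newton's method, and the toy data, the penalty strength `lam`, the learning rate `lr`, and all function names are assumptions made for illustration.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def structural_risk(w, b, X, y, lam):
    """Mean negative log-likelihood (empirical risk) plus an L2 penalty."""
    nll = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
        nll -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return nll / len(X) + 0.5 * lam * sum(wj * wj for wj in w)

def gradient_descent(X, y, lam=0.1, lr=0.5, steps=200):
    """Batch gradient descent on the structural risk."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [lam * wj for wj in w], 0.0   # gradient of the L2 penalty
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n           # gradient of the mean NLL
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Hypothetical 1-D toy data: label 1 when the feature is positive.
X = [[-2.0], [-1.0], [1.0], [2.0]]
y = [0, 0, 1, 1]
w, b = gradient_descent(X, y)
print(w, b, structural_risk(w, b, X, y, lam=0.1))
```

Because this objective is convex, gradient descent with a small enough step size heads toward the global minimum; the penalty also keeps the weight finite even though the toy data are linearly separable.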

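Extensions: a GLM on its own cannot handle nonlinear problems, for which kernel methods can be introduced; linear regression has three regularized forms; logistic regression extends to the multi-class case (softmax regression).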

Exponential Family Distributions

An exponential family distribution is one whose probability density function (PDF) can be written in the following form [2][3][5]:
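$\begin{aligned} &f(y;\theta,\phi)=A(\phi)\exp\left[\frac{y\cdot\theta-B(\theta)}{\phi}-C(y,\phi)\right] \\ &X\beta=g(\mu),\quad \mu=E[y]\\ \text{where: } &\theta:\text{natural parameter (real)};\ \phi:\text{dispersion parameter (real)}\\ &A,\,B:\text{real functions of one argument};\ C:\text{real function of two arguments}\\ &g:\text{link function}\\ \end{aligned}$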
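Example: the Gaussian distribution ($x\sim N(\mu,\sigma^2)$), which is also the distribution assumed for $\epsilon$ in linear regression (the data are assumed to scatter around the true regression line according to a Gaussian). It follows that the variance $\sigma^2$ does not interfere with the estimate of the linear regression parameters [2][3][5]: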
$\begin{aligned} \text{PDF: } f_{g}(x;\mu,\sigma^2) &=\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)\\ \text{Exponential family: } f(y;{\color{blue}{\theta}},{\color{orange}{\phi}}) &={\color{red}{A(\phi)}} \exp\left[\frac{y\cdot {\color{blue}{\theta}}-{\color{green}{B(\theta)}}} { {\color{orange}{\phi}}}-{\color{brown}{C(y,\phi)}}\right]\\ f_{g}(x;{\color{blue}{\mu}},{\color{orange}{\sigma^2}}) &={\color{red}{\frac{1}{\sigma\sqrt{2\pi}}}} \exp\left[\frac{x\cdot {\color{blue}{\mu}}-{\color{green}{\frac{1}{2}\mu^2}}} { {\color{orange}{\sigma^2}}}-{\color{brown}{\frac{x^2}{2\sigma^2}}}\right] \\ \text{Link function: } g(\mu) &=X\beta=\mu \\ \end{aligned}$
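A quick numerical check of the Gaussian case, as a sketch. Expanding the square gives $-\frac{(x-\mu)^2}{2\sigma^2}=\frac{x\mu-\frac{1}{2}\mu^2}{\sigma^2}-\frac{x^2}{2\sigma^2}$, so the code assumes $\theta=\mu$, $\phi=\sigma^2$, $A(\phi)=1/\sqrt{2\pi\phi}$, $B(\theta)=\frac{1}{2}\theta^2$ and $C(x,\phi)=\frac{x^2}{2\phi}$. The function names and test points are arbitrary choices for illustration.

```python
import math

def gaussian_pdf(x, mu, sigma2):
    """Gaussian density with mean mu and variance sigma2."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def exp_family_pdf(x, theta, phi):
    """A(phi) * exp[(x*theta - B(theta)) / phi - C(x, phi)] with the Gaussian choices."""
    A = 1.0 / math.sqrt(2.0 * math.pi * phi)
    B = 0.5 * theta ** 2
    C = x ** 2 / (2.0 * phi)
    return A * math.exp((x * theta - B) / phi - C)

# Compare the two forms at a few arbitrary (x, mu, sigma^2) points.
for x, mu, sigma2 in [(0.3, 1.0, 2.0), (-1.5, 0.0, 0.5), (2.0, 2.0, 1.0)]:
    assert math.isclose(gaussian_pdf(x, mu, sigma2), exp_family_pdf(x, mu, sigma2))
print("both forms agree")
```

Since the natural parameter is the mean itself, the link function that relates $\mu$ to the linear predictor is the identity, which is why linear regression needs no transformation of its output.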