2021.7.28 Devil Training Report

This blog post details one session of devil training: working with mathematical expressions (summation, product, and integration), an application of triple summation, a definite integral computed by hand and verified by a program, a small example verifying the least squares method, the derivation of the linear regression formula, and the derivation of the logistic regression loss function together with an analysis of its characteristics.


Devil Training on Mathematical Expressions

Assignments

  1. Sum the components of a vector whose indices are even ($x_2, x_4, \dots$); write the corresponding expression.
    $\sum_{i \bmod 2 = 0} x_i$
  2. Write one exercise each for a summation, a product, and an integral, and give the standard answers.
    (1) Sum the numbers within 100 that satisfy $i \bmod 3 = 0$.
    $\sum_{1 \leq i \leq 100,\ i \bmod 3 = 0} i$
    (2) Write the product of the reciprocals of $1, 2, \dots, 10$.
    $\prod_{i=1}^{10} \frac{1}{i}$
    (3) Find the area of the circle of radius $R$ centered at the origin.
    $\int_{-R}^{+R} 2\sqrt{R^2 - x^2}\,\mathrm{d}x = \pi R^2$
  3. Have you used triple summation? Describe an application.
    A typical application is summing every element of a three-dimensional array, e.g. all entries $x_{ijk}$ of a $100 \times 100 \times 100$ array (see the code sketch after this list):
    $\sum_{1 \leq i \leq 100}\sum_{1 \leq j \leq 100}\sum_{1 \leq k \leq 100} x_{ijk}$
  4. Give a common definite integral and compare the hand-computed result with the program result.
    Definite integral: $\int_{1}^{2} x\,\mathrm{d}x$
    By hand: $\int_{1}^{2} x\,\mathrm{d}x = \frac{1}{2}x^2 \Big|_{1}^{2} = \frac{3}{2}$
    Program:

from sympy import symbols, integrate

x = symbols('x')
print(integrate(x, (x, 1, 2)))  # prints 3/2
(Screenshot of the program output: 3/2, matching the hand-computed result.)
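As mentioned in assignment 3, here is a minimal triple-summation sketch. It assumes NumPy is available; the array contents and the size n are made-up illustrative choices, not part of the original assignment.

import numpy as np

n = 20  # use 100 to match the formula exactly; 20 keeps the pure-Python loops fast
x = np.random.rand(n, n, n)  # made-up values standing in for x_ijk

# Triple summation written as three nested loops
total = 0.0
for i in range(n):
    for j in range(n):
        for k in range(n):
            total += x[i, j, k]

# Equivalent vectorised form
print(np.isclose(total, x.sum()))  # True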

  5. Write a small example yourself to verify the least squares method.

For the model $y = \alpha + \beta x$, the least squares solution is
$$\left[\begin{array}{c}\alpha \\ \beta\end{array}\right]=\left(\left[\begin{array}{cc}1 & x_{1} \\ 1 & x_{2} \\ \vdots & \vdots \\ 1 & x_{n}\end{array}\right]^{\mathrm{T}}\left[\begin{array}{cc}1 & x_{1} \\ 1 & x_{2} \\ \vdots & \vdots \\ 1 & x_{n}\end{array}\right]\right)^{-1}\left[\begin{array}{cc}1 & x_{1} \\ 1 & x_{2} \\ \vdots & \vdots \\ 1 & x_{n}\end{array}\right]^{\mathrm{T}}\left[\begin{array}{c}y_{1} \\ y_{2} \\ \vdots \\ y_{n}\end{array}\right]$$

With $\mathbf{X} = [1, 2, 3]$ and $\mathbf{Y} = [2, 3, 7]$, this gives
$$\left[\begin{array}{c}\alpha \\ \beta\end{array}\right]=\left[\begin{array}{c}-1 \\ 2.5\end{array}\right]$$
that is, $y = 2.5x - 1$.
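A minimal numerical check of this example (a sketch assuming NumPy; the variable names are illustrative):

import numpy as np

# Toy data from the example above
X = np.array([1.0, 2.0, 3.0])
Y = np.array([2.0, 3.0, 7.0])

# Design matrix with a column of ones for the intercept alpha
A = np.column_stack([np.ones_like(X), X])

# Normal-equation solution (A^T A)^{-1} A^T Y
alpha, beta = np.linalg.inv(A.T @ A) @ A.T @ Y
print(alpha, beta)  # approximately -1.0 and 2.5, i.e. y = 2.5x - 1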
6. Derivation of the linear regression formula
The derivation follows Prof. Min's lectures from the 2020 devil training.
Loss function: $\sum_{i=1}^{m}\left(\mathbf{x}_{i} \mathbf{w}-y_{i}\right)^{2}$
Matrix form: $\|\mathbf{X} \mathbf{w}-\mathbf{Y}\|^{2}$
Expanded matrix form: $L(\mathbf{X}, \mathbf{Y}, \mathbf{w})=(\mathbf{X} \mathbf{w}-\mathbf{Y})^{\mathrm{T}}(\mathbf{X} \mathbf{w}-\mathbf{Y})$
Derivation:
$$\begin{aligned} L(\mathbf{X}, \mathbf{Y}, \mathbf{w}) &=(\mathbf{X} \mathbf{w}-\mathbf{Y})^{\mathrm{T}}(\mathbf{X} \mathbf{w}-\mathbf{Y}) \\ &=\left(\mathbf{w}^{\mathrm{T}} \mathbf{X}^{\mathrm{T}}-\mathbf{Y}^{\mathrm{T}}\right)(\mathbf{X} \mathbf{w}-\mathbf{Y}) \\ &=\mathbf{w}^{\mathrm{T}} \mathbf{X}^{\mathrm{T}} \mathbf{X} \mathbf{w}-\mathbf{w}^{\mathrm{T}} \mathbf{X}^{\mathrm{T}} \mathbf{Y}-\mathbf{Y}^{\mathrm{T}} \mathbf{X} \mathbf{w}+\mathbf{Y}^{\mathrm{T}} \mathbf{Y} \end{aligned}$$
Take the derivative with respect to $\mathbf{w}$ and set it to zero. The matrix differentiation rules used are
$$\begin{aligned} \frac{\partial A \mathbf{w}}{\partial \mathbf{w}}&=A \\ \frac{\partial \mathbf{w}^{\mathrm{T}} A}{\partial \mathbf{w}}&=A^{\mathrm{T}} \\ \frac{\partial \mathbf{w}^{\mathrm{T}} A \mathbf{w}}{\partial \mathbf{w}}&=2 \mathbf{w}^{\mathrm{T}} A \end{aligned}$$
(the last rule holds for symmetric $A$, which is the case here since $A=\mathbf{X}^{\mathrm{T}} \mathbf{X}$).
Hence:
$$\begin{aligned} \frac{\partial L(\mathbf{X}, \mathbf{Y}, \mathbf{w})}{\partial \mathbf{w}} &=\frac{\partial \mathbf{w}^{\mathrm{T}} \mathbf{X}^{\mathrm{T}} \mathbf{X} \mathbf{w}}{\partial \mathbf{w}}-\frac{\partial \mathbf{w}^{\mathrm{T}} \mathbf{X}^{\mathrm{T}} \mathbf{Y}}{\partial \mathbf{w}}-\frac{\partial \mathbf{Y}^{\mathrm{T}} \mathbf{X} \mathbf{w}}{\partial \mathbf{w}}+\frac{\partial \mathbf{Y}^{\mathrm{T}} \mathbf{Y}}{\partial \mathbf{w}} \\ &=2 \mathbf{w}^{\mathrm{T}} \mathbf{X}^{\mathrm{T}} \mathbf{X}-\mathbf{Y}^{\mathrm{T}} \mathbf{X}-\mathbf{Y}^{\mathrm{T}} \mathbf{X}+0 \\ &=2 \mathbf{w}^{\mathrm{T}} \mathbf{X}^{\mathrm{T}} \mathbf{X}-2 \mathbf{Y}^{\mathrm{T}} \mathbf{X} \end{aligned}$$

Setting the derivative to zero,
$$2 \hat{\mathbf{w}}^{\mathrm{T}} \mathbf{X}^{\mathrm{T}} \mathbf{X}-2 \mathbf{Y}^{\mathrm{T}} \mathbf{X}=0$$
gives
$$\hat{\mathbf{w}}^{\mathrm{T}} \mathbf{X}^{\mathrm{T}} \mathbf{X}=\mathbf{Y}^{\mathrm{T}} \mathbf{X}$$
Transposing both sides,
$$\mathbf{X}^{\mathrm{T}} \mathbf{X} \hat{\mathbf{w}}=\mathbf{X}^{\mathrm{T}} \mathbf{Y}$$
and finally
$$\hat{\mathbf{w}}=\left(\mathbf{X}^{\mathrm{T}} \mathbf{X}\right)^{-1} \mathbf{X}^{\mathrm{T}} \mathbf{Y}$$
(assuming $\mathbf{X}^{\mathrm{T}} \mathbf{X}$ is invertible, i.e. $\mathbf{X}$ has full column rank).
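A quick numerical sanity check of this closed form (a sketch assuming NumPy; the random data and variable names are made up for illustration):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))                  # 20 samples, 3 features
w_true = np.array([1.0, -2.0, 0.5])
Y = X @ w_true + 0.01 * rng.normal(size=20)   # targets with small noise

# Closed form: w_hat = (X^T X)^{-1} X^T Y
w_hat = np.linalg.inv(X.T @ X) @ X.T @ Y

# Compare against NumPy's least squares solver
w_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.allclose(w_hat, w_lstsq))  # True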
7. Derive logistic regression yourself and describe the characteristics of this method (at least 5).
Treat the loss as a probability problem: the larger the following quantity, the better,
$$P\left(y_{i} \mid \mathbf{x}_{i} ; \mathbf{w}\right)=\left(\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)\right)^{y_{i}}\left(1-\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)\right)^{1-y_{i}}$$
where $\sigma(z)=\frac{1}{1+e^{-z}}$ is the sigmoid function; for $y_i = 1$ this reduces to $\sigma(\mathbf{x}_{i}\mathbf{w})$, and for $y_i = 0$ to $1-\sigma(\mathbf{x}_{i}\mathbf{w})$.
Likelihood function: assume the training samples are independent and equally important.

To obtain a global optimum, multiply the probabilities of the individual samples to get the likelihood:
$$\begin{aligned} L(\mathbf{w}) &=P(\mathbf{Y} \mid \mathbf{X} ; \mathbf{w}) \\ &=\prod_{i=1}^{m} P\left(y_{i} \mid \mathbf{x}_{i} ; \mathbf{w}\right) \\ &=\prod_{i=1}^{m}\left(\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)\right)^{y_{i}}\left(1-\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)\right)^{1-y_{i}} \end{aligned}$$
Since the logarithm is monotonic, maximize the log-likelihood instead:
$$\begin{aligned} l(\mathbf{w}) &=\log L(\mathbf{w}) \\ &=\log \prod_{i=1}^{m} P\left(y_{i} \mid \mathbf{x}_{i} ; \mathbf{w}\right) \\ &=\sum_{i=1}^{m} y_{i} \log \sigma\left(\mathbf{x}_{i} \mathbf{w}\right)+\left(1-y_{i}\right) \log \left(1-\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)\right) \end{aligned}$$
Loss function (average loss), which is also the optimization objective:
$$\min _{\mathbf{w}} \frac{1}{m} \sum_{i=1}^{m}-y_{i} \log \sigma\left(\mathbf{x}_{i} \mathbf{w}\right)-\left(1-y_{i}\right) \log \left(1-\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)\right)$$

Gradient derivation for the gradient descent update:
Since
$$l(\mathbf{w})=\sum_{i=1}^{m} y_{i} \log \sigma\left(\mathbf{x}_{i} \mathbf{w}\right)+\left(1-y_{i}\right) \log \left(1-\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)\right)$$
and $\sigma'(z)=\sigma(z)(1-\sigma(z))$, we have
$$\begin{aligned} \frac{\partial l(\mathbf{w})}{\partial w_{j}} &=\sum_{i=1}^{m}\left(\frac{y_{i}}{\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)}-\frac{1-y_{i}}{1-\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)}\right) \frac{\partial \sigma\left(\mathbf{x}_{i} \mathbf{w}\right)}{\partial w_{j}} \\ &=\sum_{i=1}^{m}\left(\frac{y_{i}}{\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)}-\frac{1-y_{i}}{1-\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)}\right) \sigma\left(\mathbf{x}_{i} \mathbf{w}\right)\left(1-\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)\right) \frac{\partial \mathbf{x}_{i} \mathbf{w}}{\partial w_{j}} \\ &=\sum_{i=1}^{m}\left(\frac{y_{i}}{\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)}-\frac{1-y_{i}}{1-\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)}\right) \sigma\left(\mathbf{x}_{i} \mathbf{w}\right)\left(1-\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)\right) x_{i j} \\ &=\sum_{i=1}^{m}\left(y_{i}-\sigma\left(\mathbf{x}_{i} \mathbf{w}\right)\right) x_{i j} \end{aligned}$$
Logistic regression can be implemented from scratch (a sketch follows) or by calling a library directly.
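A minimal from-scratch sketch based on the gradient derived above. It assumes NumPy; the toy data, learning rate, and iteration count are illustrative choices, not part of the original post.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=1000):
    # Gradient ascent on the log-likelihood l(w);
    # the gradient of l(w) with respect to w is X^T (y - sigma(Xw)).
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (y - sigmoid(X @ w))
        w += lr * grad / len(y)
    return w

# Toy data: a bias column plus one feature
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.5]])
y = np.array([0, 0, 1, 1])
w = fit_logistic(X, y)
print(sigmoid(X @ w))  # predicted probabilities, rising with the feature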

Library approach: sklearn implementation
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(X, y)  # X: feature matrix, y: 0/1 labels

Characteristics of logistic regression:
(1) Because of the sigmoid function, the prediction is a probability in [0, 1].
(2) The prediction divides samples into 2 classes (binary classification).
(3) It is easy to understand and can be derived analytically, so it is highly interpretable.
(4) Its accuracy is limited: the decision boundary is linear, so it performs only moderately compared with more complex machine learning algorithms.
(5) The model is simple, with few parameters and fast training.
