Machine Learning (Notes on 李宏毅's Open Course) - Machine Learning and Deep Learning

This post covers the basics of machine learning and deep learning, including functions for regression, classification, and other tasks, with examples such as PM2.5 prediction and chess. It then walks through the procedure for finding such functions, including gradient descent, and the construction of linear and more sophisticated models. It also introduces activation functions such as the hard sigmoid and ReLU, and shows how a new model can be built by adding features. Finally, it describes how parameters are updated during optimization, with an example of iterative updates.


Machine Learning and Deep Learning

1. Functions

  1. Regression: e.g., predicting PM2.5 values
  2. Classification: e.g., chess (choosing the next move)
  3. Others: structured learning

2. The procedure for finding the function

  1. Functions with unknown parameters
  2. Define loss from training data
  3. Optimization
    1. gradient descent (a minimal sketch follows this list)
      A) randomly set an initial value $w^0$
      B) compute the gradient $\partial L/\partial w$
      C) update $w$ iteratively: $w^{t+1} = w^t - \eta \, \partial L/\partial w$
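
The sketch below runs steps A) to C) on a toy quadratic loss; the loss function, the initial value, and the learning rate $\eta$ are illustrative assumptions, not values from the lecture.

```python
# Minimal gradient descent on a single parameter w (toy example).

def loss(w):
    return (w - 3.0) ** 2      # assumed toy loss, minimized at w = 3

def grad(w):
    return 2.0 * (w - 3.0)     # dL/dw, derived by hand from the toy loss

w = 0.0                        # A) (pseudo-)randomly chosen initial value
eta = 0.1                      # learning rate, a hyperparameter
for _ in range(100):           # C) update w iteratively
    g = grad(w)                # B) compute dL/dw
    w -= eta * g               # move against the gradient
print(w)                       # approaches 3.0
```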

3. Models

  1. linear model
  2. sophisticated model: piecewise linear curves
    all piecewise linear curves = constant + sum of a set of (hard) sigmoids
    activation functions (sketched in code after this list):
    1. hard sigmoid: can be represented by the sum of two ReLUs

    2. rectified linear unit (ReLU): $\max(0, wx+b)$
    3. soft sigmoid: $\cfrac{c}{1+e^{-(wx+b)}} = c \cdot \mathrm{sigmoid}(wx+b)$
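
A minimal NumPy sketch of the three activations above; $w$, $b$, $c$ are free parameters, and the default values here are illustrative assumptions.

```python
import numpy as np

def relu(x, w=1.0, b=0.0):
    """Rectified linear unit: max(0, wx + b)."""
    return np.maximum(0.0, w * x + b)

def hard_sigmoid(x, w=1.0, b=0.0, c=1.0):
    """Hard sigmoid as the sum of two ReLUs (the second with weight -1):
    flat at 0, then a ramp of slope w, then flat at height c."""
    return relu(x, w, b) - relu(x, w, b - c)

def soft_sigmoid(x, w=1.0, b=0.0, c=1.0):
    """Soft sigmoid: c * sigmoid(wx + b)."""
    return c / (1.0 + np.exp(-(w * x + b)))

x = np.linspace(-3.0, 3.0, 7)
print(hard_sigmoid(x))   # 0 below the ramp, c above it
print(soft_sigmoid(x))   # smooth version of the same shape
```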
  3. Beyond piecewise curves
    approximate a continuous curve by a piecewise linear curve;
    to get a good approximation, we need sufficiently many pieces (see the sketch below)
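
A toy illustration of this claim, assuming $\sin(x)$ as the target curve: `np.interp` builds the piecewise linear curve through evenly spaced knots, and the worst-case error shrinks as the number of pieces grows.

```python
import numpy as np

# More pieces -> better piecewise-linear approximation (toy demo on sin).
x = np.linspace(0.0, np.pi, 1001)
for n_pieces in (2, 4, 8, 16):
    knots = np.linspace(0.0, np.pi, n_pieces + 1)   # piece boundaries
    approx = np.interp(x, knots, np.sin(knots))     # piecewise linear curve
    print(n_pieces, np.max(np.abs(approx - np.sin(x))))  # max error drops
```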
  4. New model: more features (a vectorized sketch follows this list)
    $y = b + \sum_{i}{c_i \cdot \mathrm{sigmoid}\left(\sum_{j}w_{ij}x_j+b_i\right)}$
    $r_i = W_i X + b_i, \quad a_i = \mathrm{sigmoid}(r_i)$
    $y = b + CA$
    optimization of the new model:
    $\varTheta = [W \; B \; C]$
    $gradient = \begin{bmatrix} \cfrac{\partial L}{\partial\varTheta_1} \\ \cfrac{\partial L}{\partial\varTheta_2} \\ \vdots \\ \cfrac{\partial L}{\partial\varTheta_n} \end{bmatrix}$
    $g = \nabla{L(\varTheta^0)}$
    $\begin{bmatrix}\varTheta_1^1 \\ \varTheta_2^1 \\ \vdots \\ \varTheta_n^1 \end{bmatrix} = \begin{bmatrix} \varTheta_1^0 \\ \varTheta_2^0 \\ \vdots \\ \varTheta_n^0 \end{bmatrix} - \eta \, g$
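
A minimal NumPy sketch of this model and one update $\varTheta^1 = \varTheta^0 - \eta g$; the sizes, the squared-error loss, and the finite-difference gradient (used instead of an analytic one, for brevity) are illustrative assumptions, not the lecture's implementation.

```python
import numpy as np

def sigmoid(r):
    return 1.0 / (1.0 + np.exp(-r))

n_feat, n_sig = 3, 4                          # j features, i sigmoids (assumed sizes)

def predict(theta, x):
    # Unpack the flat parameter vector theta = [W, B, C, b].
    W = theta[:n_sig * n_feat].reshape(n_sig, n_feat)
    B = theta[n_sig * n_feat : n_sig * n_feat + n_sig]
    C = theta[n_sig * n_feat + n_sig : -1]
    b = theta[-1]
    a = sigmoid(W @ x + B)                    # r_i = W_i x + b_i, a_i = sigmoid(r_i)
    return b + C @ a                          # y = b + C A

def loss(theta, x, y_true):
    return (predict(theta, x) - y_true) ** 2  # assumed squared-error loss

def numerical_grad(theta, x, y_true, eps=1e-6):
    # g = gradient of L at theta, approximated by central differences.
    g = np.zeros_like(theta)
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d[i] = eps
        g[i] = (loss(theta + d, x, y_true) - loss(theta - d, x, y_true)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
theta = rng.normal(size=n_sig * n_feat + 2 * n_sig + 1)   # Theta^0, random init
x, y_true = rng.normal(size=n_feat), 1.0                  # one toy training example
eta = 0.1
print(loss(theta, x, y_true))
theta = theta - eta * numerical_grad(theta, x, y_true)    # Theta^1 = Theta^0 - eta * g
print(loss(theta, x, y_true))                             # loss should decrease
```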
  5. epoch | batch | update | iteration (counted in the sketch after this list)
    number of samples: 1000
    batch size: 10
    updates (iterations) per epoch: 1000 / 10 = 100
    epochs: 1
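
The loop below just counts updates to make the terminology concrete; the numbers match the example above.

```python
# One epoch = one pass over all samples; one update (iteration) per batch.
n_samples, batch_size, n_epochs = 1000, 10, 1

updates = 0
for epoch in range(n_epochs):
    for start in range(0, n_samples, batch_size):
        # a real trainer would compute the gradient on this batch here
        updates += 1
print(updates)   # -> 100 updates in 1 epoch
```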

video link: https://speech.ee.ntu.edu.tw/~hylee/ml/2021-spring.html
