Deep Learning in One Day (一天搞懂深度学习)


Lecture I: Introduction of Deep Learning

Three Steps for Deep Learning
  1. define a set of functions (a Neural Network)
  2. evaluate the goodness of a function
  3. pick the best function
Use a softmax layer as the output layer.
FAQ: How many layers? How many neurons for each layer?

Trial and Error + Intuition

Gradient Descent
  1. pick an initial value for w
    • random (usually good enough)
    • RBM pre-train
  2. compute ∂L/∂w
    w ← w − η·∂L/∂w, where η is called the "learning rate"
  3. repeat until ∂L/∂w is approximately zero

But gradient descent never guarantees a global minimum.
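The update rule above can be sketched in plain Python; the quadratic loss and learning rate here are illustrative choices, not from the slides:

```python
def gradient_descent(grad, w0, eta=0.1, steps=100, tol=1e-8):
    """Repeat w <- w - eta * dL/dw until the gradient is approximately zero."""
    w = w0
    for _ in range(steps):
        g = grad(w)
        if abs(g) < tol:   # gradient approximately zero: stop
            break
        w = w - eta * g
    return w

# Toy example: L(w) = (w - 3)^2, so dL/dw = 2(w - 3); minimum at w = 3.
w_star = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
```

Starting from a different w0 can land in a different local minimum, which is exactly why gradient descent never guarantees a global minimum.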

Modularization

Deep Modularization

  • Each basic classifier can have sufficient training examples

  • Basic classifiers are shared by the following classifiers as modules

  • The modularization is automatically learned from data

Lecture II: Tips for Training DNN

  1. Do not always blame overfitting

    • Reason for overfitting: Training data and testing data can be different
    • Panacea for Overfitting: Have more training data or Create more training data
  2. Different approaches for different problems

  3. Choosing a proper loss

    • square error (mse)
      • ∑ᵢ (ŷᵢ − yᵢ)²
    • cross entropy (categorical_crossentropy)
      • −∑ᵢ ŷᵢ ln yᵢ
      • When using a softmax output layer, choose cross entropy
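A minimal NumPy sketch of the two losses behind a softmax output (variable names are my own; ŷ is the target, y the prediction, matching the notation above):

```python
import numpy as np

def softmax(z):
    """Turn raw outputs into probabilities that sum to 1 (numerically stable)."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])        # raw network outputs
y = softmax(z)                       # predicted distribution
y_hat = np.array([1.0, 0.0, 0.0])    # one-hot target

mse = np.sum((y_hat - y) ** 2)       # square error
ce = -np.sum(y_hat * np.log(y))      # cross entropy
```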
  4. Mini-batch

    • Mini-batch training is faster
      1. Randomly initialize network parameters
      2. Pick the 1st batch, update parameters once
      3. Pick the 2nd batch, update parameters once
      4. Continue until all mini-batches have been picked (one epoch finished)
      5. Repeat the above process (steps 2-4)
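The epoch loop above, sketched with NumPy; the `update` callback stands in for one gradient step, and everything here is illustrative:

```python
import numpy as np

def run_epoch(X, Y, update, batch_size=32, seed=0):
    """One epoch: shuffle, then pick each mini-batch once,
    updating the parameters once per batch."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))      # random batch assignment
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        update(X[idx], Y[idx])           # one parameter update

# 100 examples in batches of 32 -> 4 updates per epoch (32, 32, 32, 4).
X, Y = np.zeros((100, 4)), np.zeros(100)
sizes = []
run_epoch(X, Y, lambda xb, yb: sizes.append(len(xb)))
```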
  5. New activation function

    • Vanishing Gradient Problem: with sigmoid layers, gradients shrink as they are back-propagated, so the earlier layers train very slowly
    • RBM pre-training was an early workaround
    • Rectified Linear Unit (ReLU), preferred because it is:
      • Fast to compute
      • Biologically motivated
      • Equivalent to infinitely many sigmoids with different biases
      • Effectively a thinner linear network, which eases the vanishing gradient problem
      • A special case of Maxout
    • ReLU variants (e.g. Leaky ReLU)
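ReLU and one common variant, as tiny NumPy functions:

```python
import numpy as np

def relu(z):
    """max(0, z): linear for positive input, zero otherwise."""
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    """A ReLU variant with a small slope alpha on the negative side."""
    return np.where(z > 0, z, alpha * z)
```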
  6. Adaptive Learning Rate

    • Popular & Simple Idea: Reduce the learning rate by some factor every few epochs

      • e.g. ηᵗ = η / √(t+1)
    • Adagrad

      • Original: w ← w − η·∂L/∂w
      • Adagrad: w ← w − η_w·∂L/∂w, where η_w = η / √(Σ_{i=0}^{t} (gⁱ)²)
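The Adagrad rule as a sketch on a toy quadratic problem (η and the problem are illustrative choices):

```python
import numpy as np

def adagrad_step(w, g, sum_sq, eta=1.0, eps=1e-8):
    """w <- w - (eta / sqrt(sum of all past squared gradients)) * g."""
    sum_sq = sum_sq + g ** 2
    return w - eta / (np.sqrt(sum_sq) + eps) * g, sum_sq

# Minimize L(w) = (w - 3)^2 from w = 0.
w, s = 0.0, 0.0
for _ in range(200):
    g = 2 * (w - 3)          # dL/dw
    w, s = adagrad_step(w, g, s)
```

Note how the accumulated squared gradients make the effective learning rate shrink over time, so early steps are large and later steps are fine-grained.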
  7. Momentum

    • Movement = negative of ∂L/∂w + momentum (a fraction of the previous movement)
    • Adam = RMSProp (advanced Adagrad) + Momentum
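Momentum as a sketch on the same kind of toy problem (β = 0.9 is a conventional choice, not from the slides):

```python
def momentum_step(w, g, v, eta=0.1, beta=0.9):
    """movement = beta * previous movement - eta * gradient; then move."""
    v = beta * v - eta * g
    return w + v, v

# Minimize L(w) = (w - 3)^2 from w = 0.
w, v = 0.0, 0.0
for _ in range(300):
    g = 2 * (w - 3)          # dL/dw
    w, v = momentum_step(w, g, v)
```

The accumulated movement lets the parameter roll through small local bumps instead of stopping at the first flat spot.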
  8. Early Stopping

  9. Weight Decay

    • Original: w ← w − η·∂L/∂w
    • Weight Decay: w ← 0.99·w − η·∂L/∂w
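The two rules side by side in code; the 0.99 factor is the one from the notes:

```python
def sgd_step(w, g, eta=0.1):
    """Original: w <- w - eta * dL/dw."""
    return w - eta * g

def weight_decay_step(w, g, eta=0.1, decay=0.99):
    """Weight decay: w <- 0.99 * w - eta * dL/dw,
    so weights that receive no gradient shrink toward 0."""
    return decay * w - eta * g

# With zero gradient, a weight decays geometrically: after 10 steps, w = 0.99^10.
w = 1.0
for _ in range(10):
    w = weight_decay_step(w, g=0.0)
```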
  10. Dropout

    • Training:
      • Each neuron has probability p% of being dropped out
      • The structure of the network is changed
      • Use the new, thinner network for training
    • Testing:
      • No neurons are dropped; if the dropout rate at training is p%, multiply all the weights by (1 − p%)
    • Dropout is a kind of ensemble

Lecture III: Variants of Neural Networks

Convolutional Neural Network (CNN)


  • Convolutional layers are not fully connected
  • Convolutional layers share weights
  • Learning: gradient descent
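Why a convolutional layer has far fewer parameters: one small kernel slides over the image, and its weights are reused at every position. A minimal NumPy sketch (no padding or stride; like most frameworks, it skips flipping the kernel):

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' convolution: each output pixel applies the same shared
    kernel weights to one patch of the image."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

# A horizontal-difference kernel responds to vertical edges.
edge = conv2d(np.eye(4), np.array([[1.0, -1.0]]))
```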

Recurrent Neural Network (RNN)


Long Short-term Memory (LSTM)


  • Gated Recurrent Unit (GRU): simpler than LSTM
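The defining feature of a plain RNN in one function: the hidden state feeds back into the next step, so the network carries memory across the sequence. A sketch with made-up sizes (LSTM and GRU add gates on top of this):

```python
import numpy as np

def rnn_step(x, h, Wx, Wh, b):
    """One recurrent step: the new hidden state depends on the current
    input AND on the previous hidden state."""
    return np.tanh(x @ Wx + h @ Wh + b)

rng = np.random.default_rng(0)
Wx, Wh, b = rng.normal(size=(3, 4)), rng.normal(size=(4, 4)), np.zeros(4)
h = np.zeros(4)                       # initial memory
for x in rng.normal(size=(5, 3)):     # a sequence of 5 input vectors
    h = rnn_step(x, h, Wx, Wh, b)     # memory is carried forward
```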

Lecture IV: Next Wave

Supervised Learning

Ultra Deep Network
  • Worry about training first!

  • These ultra deep networks have a special structure

  • An ultra deep network is an ensemble of many networks with different depths

  • Ensemble: 6 layers, 4 layers, or 2 layers

  • FractalNet


  • Residual Network


  • Highway Network

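The idea these architectures share, sketched for the residual case: a shortcut lets the input skip the layer, which keeps very deep stacks trainable. This is a toy illustration, not any paper's exact block:

```python
import numpy as np

def residual_block(x, layer):
    """Output = layer(x) + x: the identity shortcut lets the signal
    (and the gradient) bypass the layer entirely."""
    return layer(x) + x

# Even if the layer contributes nothing, the input still passes through.
x = np.array([1.0, 2.0])
y = residual_block(x, lambda v: 0.0 * v)
```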

Attention Model
  • Attention-based Model


  • Attention-based Model v2


Reinforcement Learning

  • Agent learns to take actions to maximize expected reward.
  • Difficulties of Reinforcement Learning
    • It may be better to sacrifice immediate reward to gain more long-term reward
    • Agent’s actions affect the subsequent data it receives

Unsupervised Learning

  • Image: Realizing what the World Looks Like
  • Text: Understanding the Meaning of Words
  • Audio: Learning human language without supervision
