一天搞懂深度学习 (Deep Learning in One Day)

Lecture I: Introduction of Deep Learning

Three Steps for Deep Learning
  1. define a set of functions (Neural Network)
  2. goodness of function
  3. pick the best function
Softmax layer as the output layer.
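
A minimal NumPy sketch of what a softmax output layer computes (the max-shift is a standard numerical-stability trick, not something from the slides):

```python
import numpy as np

def softmax(z):
    """Turn raw outputs z into a probability distribution."""
    z = z - np.max(z)   # shift so exp() cannot overflow
    e = np.exp(z)
    return e / e.sum()

print(softmax(np.array([3.0, 1.0, -3.0])))  # entries sum to 1
```
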
FAQ: How many layers? How many neurons for each layer?

Trial and Error + Intuition

Gradient Descent
  1. pick an initial value for w
    • random (good enough)
    • RBM pre-train
  2. compute $\partial L/\partial w$
    $w \leftarrow w - \eta\,\partial L/\partial w$, where $\eta$ is called the “learning rate”
  3. repeat until $\partial L/\partial w$ is approximately zero

But gradient descent never guarantees a global minimum.
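
The three steps above as a minimal NumPy sketch, using a toy loss $L(w) = (w - 3)^2$ that I picked purely for illustration:

```python
import numpy as np

def grad_L(w):
    # dL/dw for the toy loss L(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

w = np.random.randn()       # 1. pick an initial value for w (random)
eta = 0.1                   # learning rate
for _ in range(1000):       # 3. repeat ...
    g = grad_L(w)           # 2. compute dL/dw
    if abs(g) < 1e-6:       # ... until dL/dw is approximately zero
        break
    w -= eta * g            # w <- w - eta * dL/dw

print(w)                    # close to the minimum at w = 3
```

On this convex toy loss the local minimum happens to be global; the caveat above is about real, non-convex network losses.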

Modularization

Deep Modularization

  • Each basic classifier can have sufficient training examples

  • Shared by the following classifiers as modules

  • The modularization is automatically learned from data

Lecture II: Tips for Training DNN

  1. Do not always blame overfitting

    • Reason for overfitting: Training data and testing data can be different
    • Panacea for Overfitting: Have more training data or Create more training data
  2. Different approaches for different problems

  3. Choosing a proper loss

    • square error (MSE)
      • $\sum_i (y_i - \hat{y}_i)^2$
    • cross entropy (categorical_crossentropy)
      • $-\sum_i \hat{y}_i \ln y_i$
      • When using a softmax output layer, choose cross entropy
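
Both losses as small NumPy functions, with $y$ the network output and $\hat{y}$ the one-hot target, matching the formulas above:

```python
import numpy as np

def square_error(y, y_hat):
    # sum_i (y_i - y_hat_i)^2
    return np.sum((y - y_hat) ** 2)

def cross_entropy(y, y_hat):
    # -sum_i y_hat_i * ln(y_i); y should come from a softmax layer
    return -np.sum(y_hat * np.log(y))

y = np.array([0.7, 0.2, 0.1])      # softmax output
y_hat = np.array([1.0, 0.0, 0.0])  # one-hot target
print(square_error(y, y_hat), cross_entropy(y, y_hat))
```
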
  4. Mini-batch

    • Mini-batch is Faster
      1. Randomly initialize network parameters
      2. Pick the 1st batch, update parameters once
      3. Pick the 2nd batch, update parameters once
      4. Until all mini-batches have been picked (one epoch finished)
      5. Repeat the above process from step 2
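
The loop above as a sketch; `update` stands for a hypothetical function that performs one parameter update from a batch:

```python
import numpy as np

def train(x, y, update, batch_size=100, epochs=20):
    n = len(x)
    for _ in range(epochs):
        idx = np.random.permutation(n)   # reshuffle every epoch
        for s in range(0, n, batch_size):
            b = idx[s:s + batch_size]    # pick the next mini-batch
            update(x[b], y[b])           # update parameters once
```
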
  5. New activation function

    • Vanishing gradient problem: gradients in the layers near the input are much smaller, so those layers learn very slowly
      • Historically handled with RBM pre-training
    • Rectified Linear Unit (ReLU)
      • Fast to compute
      • Biological reason
      • Equivalent to infinitely many sigmoids with different biases
      • Gives a thinner, linear network, which handles the vanishing gradient problem
      • A special case of Maxout
    • ReLU variants (e.g. Leaky ReLU, Parametric ReLU)
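
ReLU and one of its common variants (Leaky ReLU) in NumPy; the slope 0.01 is a typical default, not a value from the slides:

```python
import numpy as np

def relu(z):
    # max(0, z): linear for positive inputs, exactly zero otherwise
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    # variant: a small slope alpha for negative inputs instead of a hard zero
    return np.where(z > 0, z, alpha * z)
```
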
  6. Adaptive Learning Rate

    • Popular & Simple Idea: Reduce the learning rate by some factor every few epochs

      • $\eta^t = \eta/\sqrt{t+1}$
    • Adagrad

      • Original: $w \leftarrow w - \eta\,\partial L/\partial w$
      • Adagrad: $w \leftarrow w - \eta_w\,\partial L/\partial w$, where $\eta_w = \eta \big/ \sqrt{\sum_{i=0}^{t}(g^i)^2}$
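
One Adagrad step as a NumPy sketch; `eps` is a small stabilizer I added to avoid division by zero, not part of the slide formula:

```python
import numpy as np

def adagrad_step(w, g, acc, eta=0.1, eps=1e-8):
    """w: parameters, g: current gradient, acc: running sum of squared gradients."""
    acc = acc + g ** 2                       # accumulate (g^i)^2 over all steps
    w = w - eta / (np.sqrt(acc) + eps) * g   # per-parameter learning rate
    return w, acc
```
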
  7. Momentum

    • Movement = negative of $\partial L/\partial w$ + momentum
    • Adam = RMSProp (Advanced Adagrad) + Momentum
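
A momentum step as a sketch; the coefficient 0.9 is a conventional choice, not from the slides:

```python
def momentum_step(w, g, v, eta=0.01, mu=0.9):
    # movement = momentum (a fraction of the last movement) minus the gradient
    v = mu * v - eta * g
    return w + v, v
```
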
  8. Early Stopping

  9. Weight Decay

    • Original: $w \leftarrow w - \eta\,\partial L/\partial w$
    • Weight decay: $w \leftarrow 0.99\,w - \eta\,\partial L/\partial w$
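
The same rule as code, with the 0.99 shrink factor from the formula above:

```python
def weight_decay_step(w, g, eta=0.01, decay=0.99):
    # shrink the weights toward zero, then take the usual gradient step
    return decay * w - eta * g
```
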
  10. Dropout

    • Training:
      • Each neuron has a p% chance to drop out
      • The structure of the network is changed
      • Use the new, thinner network for training
    • Testing:
      • No dropout; if the dropout rate at training is p%, all the weights are multiplied by (1 − p)%
    • Dropout is a kind of ensemble
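
A sketch of both phases, with `p` the dropout rate:

```python
import numpy as np

def dropout_train(a, p=0.5):
    # training: each neuron's activation is dropped (zeroed) with probability p
    mask = np.random.rand(*a.shape) >= p
    return a * mask

def dropout_test(w, p=0.5):
    # testing: no neurons are dropped; weights are scaled by (1 - p)
    return w * (1 - p)
```
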

Lecture III: Variants of Neural Networks

Convolutional Neural Network (CNN)


  • The convolutional layer is not fully connected
  • The convolutional layer shares weights
  • Learning: gradient descent
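
A minimal Keras-style sketch of such a network; the filter counts and input shape are illustrative assumptions:

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    # not fully connected: each filter only sees a 3x3 patch,
    # and the same filter weights are shared across the whole image
    Conv2D(25, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(50, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(10, activation='softmax'),   # softmax output layer
])
# learning: gradient descent on cross entropy
model.compile(optimizer='adam', loss='categorical_crossentropy')
```
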

Recurrent Neural Network (RNN)

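In an RNN the hidden state from the previous time step is fed back in together with the current input. A minimal NumPy sketch of a vanilla RNN (the weight names are my own):

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    # the previous hidden state is fed back in at every time step
    return np.tanh(x_t @ W_x + h_prev @ W_h + b)

def rnn(xs, W_x, W_h, b):
    h = np.zeros(W_h.shape[0])   # initial hidden state
    for x_t in xs:               # walk the sequence, carrying h forward
        h = rnn_step(x_t, h, W_x, W_h, b)
    return h
```
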

Long Short-term Memory (LSTM)


  • Gated Recurrent Unit (GRU): simpler than LSTM

Lecture IV: Next Wave

Supervised Learning

Ultra Deep Network
  • Worry about training first!

  • These ultra deep networks have a special structure

  • An ultra deep network is an ensemble of many networks with different depths

  • Ensemble: 6 layers, 4 layers or 2 layers

  • FractalNet


  • Residual Network (see the sketch after this list)


  • Highway Network

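A minimal Keras-style sketch of the residual idea (output = F(x) + x): because the skip connection can bypass F entirely, stacking such blocks behaves like an ensemble of paths of different depths. Layer sizes are illustrative:

```python
from keras.layers import Input, Dense, Add
from keras.models import Model

inputs = Input(shape=(256,))
h = Dense(256, activation='relu')(inputs)  # F(x), the residual branch
h = Dense(256)(h)
outputs = Add()([h, inputs])               # skip connection: F(x) + x
model = Model(inputs, outputs)
```
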

Attention Model
  • Attention-based Model


  • Attention-based Model v2


Reinforcement Learning

  • Agent learns to take actions to maximize expected reward.
  • Difficulties of Reinforcement Learning
    • It may be better to sacrifice immediate reward to gain more long-term reward
    • Agent’s actions affect the subsequent data it receives

Unsupervised Learning

  • Image: Realizing what the World Looks Like
  • Text: Understanding the Meaning of Words
  • Audio: Learning human language without supervision