Compute Process
Forward Propagation
Layer-l:
- Input: $A^{[l-1]}$
- Compute Process:
  $Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$
  $A^{[l]} = g(Z^{[l]})$
- Output: $A^{[l]}$
- Cache: $Z^{[l]}, W^{[l]}, b^{[l]}$
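A minimal NumPy sketch of this forward step, assuming $g$ is ReLU (the function names are illustrative, and $A^{[l-1]}$ is cached alongside $Z^{[l]}, W^{[l]}, b^{[l]}$ since the backward step needs it):

```python
import numpy as np

def relu(Z):
    """ReLU activation, an assumed choice for g."""
    return np.maximum(0, Z)

def forward_layer(A_prev, W, b, g=relu):
    """One forward step for layer l: Z = W A_prev + b, A = g(Z)."""
    Z = W @ A_prev + b           # shape (n_l, m)
    A = g(Z)                     # shape (n_l, m)
    cache = (Z, W, b, A_prev)    # A_prev is also kept: dW needs it
    return A, cache
```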
Backward Propagation
Layer-l:
- Input: $dA^{[l]}$
- Compute Process:
  $dZ^{[l]} = dA^{[l]} * g'(Z^{[l]})$
  $dW^{[l]} = \frac{1}{m}\, dZ^{[l]} A^{[l-1]T}$
  $db^{[l]} = \frac{1}{m}\,\mathrm{np.sum}(dZ^{[l]},\ \mathrm{axis{=}1},\ \mathrm{keepdims{=}True})$
  $dA^{[l-1]} = W^{[l]T} dZ^{[l]}$
- Output: $dA^{[l-1]}$
- Update:
  $W^{[l]} = W^{[l]} - \alpha\, dW^{[l]}$
  $b^{[l]} = b^{[l]} - \alpha\, db^{[l]}$
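A matching sketch of the backward step, under the same assumptions as the forward sketch (`relu_prime` and `backward_layer` are illustrative names; the cache layout follows `forward_layer` above):

```python
import numpy as np

def relu_prime(Z):
    """Derivative of ReLU, an assumed choice for g'."""
    return (Z > 0).astype(Z.dtype)

def backward_layer(dA, cache, g_prime=relu_prime):
    """One backward step for layer l, following the formulas above."""
    Z, W, b, A_prev = cache
    m = A_prev.shape[1]
    dZ = dA * g_prime(Z)                          # (n_l, m)
    dW = (dZ @ A_prev.T) / m                      # (n_l, n_{l-1})
    db = np.sum(dZ, axis=1, keepdims=True) / m    # (n_l, 1)
    dA_prev = W.T @ dZ                            # (n_{l-1}, m)
    return dA_prev, dW, db

# Gradient-descent update for this layer:
#   W -= alpha * dW
#   b -= alpha * db
```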
Matrix Dimensions
Layer-l:
$$
\begin{aligned}
dW^{[l]} = W^{[l]} &: (n^{[l]}, n^{[l-1]}) \\
db^{[l]} = b^{[l]} &: (n^{[l]}, 1) \\
dZ^{[l]} = Z^{[l]} &: (n^{[l]}, m) \\
dA^{[l]} = A^{[l]} &: (n^{[l]}, m)
\end{aligned}
$$
Here $dW^{[l]} = W^{[l]}$ means each gradient has the same shape as the quantity it differentiates.
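These shapes are worth asserting while debugging a network; a small hypothetical helper:

```python
def check_layer_shapes(W, b, Z, A, n_l, n_prev, m):
    """Assert the layer-l shapes listed above.
    The gradients dW, db, dZ, dA share the shapes of W, b, Z, A."""
    assert W.shape == (n_l, n_prev)
    assert b.shape == (n_l, 1)
    assert Z.shape == (n_l, m)
    assert A.shape == (n_l, m)
```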
Parameters vs Hyperparameters
Definition
In machine learning, a hyperparameter is a parameter whose value is set before the learning process begins. By contrast, the values of other parameters are derived via training.
1. Parameters: W, b
2. Hyperparameters:
- Learning rate $\alpha$: a proper value can be chosen by plotting cost against iterations for several candidate rates (see the sketch under Tuning Hyperparameters)
- Number of iterations
- Network architecture
- Activation functions
- …
Tuning Hyperparameters
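A toy sketch of the cost-vs-iterations comparison mentioned above, using logistic regression trained by gradient descent on synthetic data (the data, the `train` helper, and the candidate rates are all illustrative, not from the original notes):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 200))            # (n_x, m) toy inputs
Y = (X[0:1] + X[1:2] > 0).astype(float)      # (1, m) toy labels

def train(alpha, num_iterations=500):
    """Logistic regression by gradient descent; returns cost per iteration."""
    W, b, m = np.zeros((1, 2)), np.zeros((1, 1)), X.shape[1]
    costs = []
    for _ in range(num_iterations):
        A = 1 / (1 + np.exp(-(W @ X + b)))   # sigmoid forward
        costs.append(float(-np.mean(Y * np.log(A + 1e-8)
                                    + (1 - Y) * np.log(1 - A + 1e-8))))
        dZ = A - Y                           # sigmoid + cross-entropy gradient
        W -= alpha * (dZ @ X.T) / m
        b -= alpha * np.sum(dZ, axis=1, keepdims=True) / m
    return costs

# Plot convergence for several candidate learning rates and pick
# the largest one whose cost curve still decreases smoothly.
for alpha in (0.01, 0.1, 1.0):
    plt.plot(train(alpha), label=f"alpha = {alpha}")
plt.xlabel("iterations")
plt.ylabel("cost")
plt.legend()
plt.show()
```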
