1 Helper Functions
In terms of a single iteration:
input → compute the linear part and cache it as linear_cache → compute the activation and cache it as activation_cache → backpropagate → update the parameters, as sketched below.
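A rough sketch of how one such iteration could be wired together is shown next. Only the forward-propagation pieces are built in this section; compute_cost appears in section 1.3, while L_model_backward and update_parameters are assumed helpers from later parts of the assignment, not functions defined here.

# Hypothetical driver for a single iteration (the backward-pass helper
# names are assumptions, not functions defined in this section).
def one_iteration(X, Y, parameters, learning_rate=0.0075):
    AL, caches = L_model_forward(X, parameters)       # forward pass (section 1.2)
    cost = compute_cost(AL, Y)                        # cross-entropy cost (section 1.3)
    grads = L_model_backward(AL, Y, caches)           # backward pass (assumed, later section)
    parameters = update_parameters(parameters, grads, learning_rate)  # gradient step (assumed)
    return parameters, cost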
1.1 Initialize Parameters
# GRADED FUNCTION: initialize_parameters_deep

import numpy as np

def initialize_parameters_deep(layer_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the dimensions of each layer in our network

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                    Wl -- weight matrix of shape (layer_dims[l], layer_dims[l-1])
                    bl -- bias vector of shape (layer_dims[l], 1)
    """
    np.random.seed(3)
    parameters = {}
    L = len(layer_dims)  # number of layers in the network, including the input layer

    for l in range(1, L):
        ### START CODE HERE ### (≈ 2 lines of code)
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) * 0.01
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))
        ### END CODE HERE ###

        assert(parameters['W' + str(l)].shape == (layer_dims[l], layer_dims[l-1]))
        assert(parameters['b' + str(l)].shape == (layer_dims[l], 1))

    return parameters
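For instance, with a hypothetical layer-size list [5, 4, 3]:

parameters = initialize_parameters_deep([5, 4, 3])
print(parameters["W1"].shape, parameters["b1"].shape)   # (4, 5) (4, 1)
print(parameters["W2"].shape, parameters["b2"].shape)   # (3, 4) (3, 1)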
1.2 Forward Propagation
1.2.1 Forward propagation: compute Z and cache the corresponding A, W, b
# GRADED FUNCTION: linear_forward

def linear_forward(A, W, b):
    """
    Implement the linear part of a layer's forward propagation.

    Arguments: A, W, b

    Returns:
    Z -- the input of the activation function, also called the pre-activation parameter
    cache -- a python tuple containing "A", "W" and "b"; stored for computing the backward pass efficiently
    """
    ### START CODE HERE ### (≈ 1 line of code)
    Z = np.dot(W, A) + b
    ### END CODE HERE ###

    assert(Z.shape == (W.shape[0], A.shape[1]))
    cache = (A, W, b)

    return Z, cache
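A quick sanity check with hypothetical shapes (a layer of 2 units fed by 3 inputs, over 4 examples):

A = np.random.randn(3, 4)    # activations from the previous layer: (3 units, 4 examples)
W = np.random.randn(2, 3)    # current layer weights: (2 units, 3 inputs)
b = np.random.randn(2, 1)    # current layer biases
Z, linear_cache = linear_forward(A, W, b)
print(Z.shape)               # (2, 4)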
1.2.2 Forward propagation: compute the activation value and cache it
linear_cache is the linear cache: for each Z it stores the corresponding A, W, b.
activation_cache is the activation cache: for each layer it stores the pre-activation value Z, which the backward pass of the activation needs.
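The sigmoid and relu helpers called below are provided by the assignment's utility module; a minimal sketch of the behaviour assumed here (each returns the post-activation value A plus an activation_cache holding Z) looks like this:

def sigmoid(Z):
    # Sigmoid activation; the returned cache is Z, needed by the backward pass.
    A = 1 / (1 + np.exp(-Z))
    return A, Z

def relu(Z):
    # ReLU activation; the returned cache is Z, needed by the backward pass.
    A = np.maximum(0, Z)
    return A, Z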
# GRADED FUNCTION: linear_activation_forward

def linear_activation_forward(A_prev, W, b, activation):
    """
    Implement the forward propagation for the LINEAR->ACTIVATION layer

    Arguments:
    A_prev -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)
    activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

    Returns:
    A -- the output of the activation function, also called the post-activation value
    cache -- a python tuple containing "linear_cache" and "activation_cache";
             stored for computing the backward pass efficiently
    """
    if activation == "sigmoid":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        ### START CODE HERE ### (≈ 2 lines of code)
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = sigmoid(Z)
        ### END CODE HERE ###

    elif activation == "relu":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        ### START CODE HERE ### (≈ 2 lines of code)
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = relu(Z)
        ### END CODE HERE ###

    assert (A.shape == (W.shape[0], A_prev.shape[1]))
    cache = (linear_cache, activation_cache)

    return A, cache
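Continuing the hypothetical shapes from the linear_forward example:

A_prev = np.random.randn(3, 4)
W = np.random.randn(2, 3)
b = np.random.randn(2, 1)
A, cache = linear_activation_forward(A_prev, W, b, activation="relu")
print(A.shape)    # (2, 4); cache == (linear_cache, activation_cache)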
1.2.3 Forward propagation: the full L-layer network
# GRADED FUNCTION: L_model_forward

def L_model_forward(X, parameters):
    """
    Implement forward propagation for the whole model: the first L-1 layers use relu,
    the L-th layer uses sigmoid. Every layer's cache is stored.

    Returns:
    AL -- the activation value of the last layer
    caches -- the list of cached values, L of them, indexed 0 to L-1
    """
    caches = []
    A = X
    L = len(parameters) // 2  # number of layers in the neural network

    # Compute and cache the activations of the first L-1 layers; range(1, L) excludes layer L
    for l in range(1, L):
        A_prev = A
        ### START CODE HERE ### (≈ 2 lines of code)
        A, cache = linear_activation_forward(A_prev, parameters['W' + str(l)], parameters['b' + str(l)], activation='relu')
        caches.append(cache)
        ### END CODE HERE ###

    # Compute and cache the sigmoid activation of layer L
    ### START CODE HERE ### (≈ 2 lines of code)
    AL, cache = linear_activation_forward(A, parameters['W' + str(L)], parameters['b' + str(L)], activation='sigmoid')
    caches.append(cache)
    ### END CODE HERE ###

    assert(AL.shape == (1, X.shape[1]))

    return AL, caches
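Putting the forward pass together on hypothetical input with 5 features and 4 examples (the last layer must be a single sigmoid unit to satisfy the assert):

X = np.random.randn(5, 4)                            # 5 features, 4 examples
parameters = initialize_parameters_deep([5, 4, 1])   # network ends in one sigmoid unit
AL, caches = L_model_forward(X, parameters)
print(AL.shape, len(caches))                         # (1, 4) 2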
1.3 Compute the Cost Function
Now that forward propagation is in place (backward propagation follows), the cost also needs to be computed, so we can check whether the model is actually learning.
Compute the cross-entropy cost $J$, using the following formula:

$$J = -\frac{1}{m} \sum\limits_{i = 1}^{m} \left( y^{(i)}\log\left(a^{[L](i)}\right) + (1-y^{(i)})\log\left(1- a^{[L](i)}\right) \right) \tag{7}$$
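A minimal sketch of the cost computation implied by formula (7); the np.squeeze call just makes sure the result is a plain scalar rather than a 1x1 array:

def compute_cost(AL, Y):
    # AL -- probability vector from L_model_forward, shape (1, number of examples)
    # Y  -- true label vector, shape (1, number of examples)
    m = Y.shape[1]
    cost = -np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)) / m
    cost = np.squeeze(cost)   # e.g. [[17]] would become 17
    assert(cost.shape == ())
    return cost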