Deep Learning Study Notes on Neural Networks (4): Multi-Hidden-Layer Neural Networks

This post works through multi-hidden-layer neural networks in deep learning, starting from the helper functions: parameter initialization, the forward-propagation pass (computing the Z values, the activations, and their caches), the cost function, and then the individual steps of backpropagation, such as the LINEAR and LINEAR-ACTIVATION backward passes with their gradient formulas, ending with the backward pass of the L-layer model and the parameter update.


1 Helper Functions

For a single iteration, the flow is:
input → compute the linear part and cache it as linear_cache → compute the activation and cache it as activation_cache → backward propagation → update the parameters (sketched in code below).
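In code, one iteration roughly chains together the functions built in this note; a rough sketch only, where L_model_backward and update_parameters are placeholders for the backward pass and parameter update discussed later:

AL, caches = L_model_forward(X, parameters)                       # forward pass, caching linear/activation values
cost = compute_cost(AL, Y)                                        # monitor whether the model is learning
grads = L_model_backward(AL, Y, caches)                           # backward pass driven by the caches
parameters = update_parameters(parameters, grads, learning_rate)  # gradient-descent update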

1.1 Initialize the Parameters

import numpy as np

# GRADED FUNCTION: initialize_parameters_deep

def initialize_parameters_deep(layer_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the dimensions of each layer in our network
    
    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                    Wl -- weight matrix of shape (layer_dims[l], layer_dims[l-1])
                    bl -- bias vector of shape (layer_dims[l], 1)
    """
    
    np.random.seed(3)
    parameters = {}
    L = len(layer_dims)            # number of layers in the network, including the input layer

    for l in range(1, L):
        ### START CODE HERE ### (≈ 2 lines of code)
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) * 0.01
        parameters['b' + str(l)] = np.zeros((layer_dims[l],1))
        ### END CODE HERE ###
        
        assert(parameters['W' + str(l)].shape == (layer_dims[l], layer_dims[l-1]))
        assert(parameters['b' + str(l)].shape == (layer_dims[l], 1))

        
    return parameters
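As a quick sanity check, a small usage sketch (the layer sizes [5, 4, 3] are just illustrative):

parameters = initialize_parameters_deep([5, 4, 3])
print(parameters["W1"].shape)  # (4, 5)
print(parameters["b1"].shape)  # (4, 1)
print(parameters["W2"].shape)  # (3, 4)
print(parameters["b2"].shape)  # (3, 1)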

1.2 Forward Propagation

1.2.1 Forward propagation: compute Z and cache the corresponding A, W, b

# GRADED FUNCTION: linear_forward

def linear_forward(A, W, b):
    """
    实现前向传播的线性部分——linear部分

    参数::A , W, b
    返回:
    Z -- 激活函数的输入,也称为预激活参数
    cache -- 包含“A”、“W”和“b”的python元组;存储用于有效地计算向后传递
    """
    
    ### START CODE HERE ### (≈ 1 line of code)
    Z = np.dot(W,A)+b
    ### END CODE HERE ###
    
    assert(Z.shape == (W.shape[0], A.shape[1]))
    cache = (A, W, b)
    
    return Z, cache
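A small usage sketch with made-up shapes (5 units in the previous layer, 4 in the current one, 3 examples):

np.random.seed(1)
A = np.random.randn(5, 3)
W = np.random.randn(4, 5)
b = np.random.randn(4, 1)
Z, cache = linear_forward(A, W, b)
print(Z.shape)  # (4, 3): one pre-activation column per example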

1.2.2 Forward propagation: compute the activation value and cache it

linear_cache is the linear cache: it stores the A, W, b that were used to compute Z.
activation_cache is the activation cache: it stores what the activation's backward pass needs (in these helpers, each layer's Z value).
# GRADED FUNCTION: linear_activation_forward

def linear_activation_forward(A_prev, W, b, activation):
    """
    Implement the forward propagation for the LINEAR->ACTIVATION layer

    Arguments:
    A_prev -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)
    activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

    Returns:
    A -- the output of the activation function, also called the post-activation value 
    cache -- a python tuple containing "linear_cache" and "activation_cache";
             stored for computing the backward pass efficiently
    """
    
    if activation == "sigmoid":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        ### START CODE HERE ### (≈ 2 lines of code)
        Z, linear_cache = linear_forward(A_prev,W,b)
        A, activation_cache = sigmoid(Z)
        ### END CODE HERE ###
    
    elif activation == "relu":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        ### START CODE HERE ### (≈ 2 lines of code)
        Z, linear_cache = linear_forward(A_prev,W,b)
        A, activation_cache = relu(Z)
        ### END CODE HERE ###
    
    assert (A.shape == (W.shape[0], A_prev.shape[1]))
    cache = (linear_cache, activation_cache)

    return A, cache
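linear_activation_forward relies on the sigmoid and relu helpers provided by the course, which return both the activation and the activation_cache. They are not shown in this note; a minimal sketch of the assumed behaviour (the cache is simply Z) would be:

import numpy as np

def sigmoid(Z):
    # Element-wise sigmoid; Z itself is returned as the activation_cache
    # so the backward pass can recompute the local derivative.
    A = 1 / (1 + np.exp(-Z))
    return A, Z

def relu(Z):
    # Element-wise ReLU; again Z serves as the activation_cache.
    A = np.maximum(0, Z)
    return A, Z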

1.2.3 Forward propagation: the whole network

# GRADED FUNCTION: L_model_forward

def L_model_forward(X, parameters):
    """
# 本示例中实现了前向传播,前L-1层为relu函数,第L层为sigmoid函数。
# 每一层都要缓存cache
# 返回参数:
## AL:最后一层的激励值
## caches:缓存的参数,L个,下标为0-(L-1)
    """
caches = []
    A = X
    L = len(parameters) // 2                  # number of layers in the neural network
    
    # Compute and cache the activations of the first L-1 layers; range(1, L) does not include layer L
    for l in range(1, L):
        A_prev = A 
        ###---------- START CODE HERE -------------### (≈ 2 lines of code)
        A, cache = linear_activation_forward(A_prev, parameters['W'+str(l)], parameters['b'+str(l)], activation = 'relu')
        caches.append(cache)
        ###---------- END CODE HERE ---------- ###
    
    # Compute the sigmoid activation of layer L and cache it
    ### ----------START CODE HERE---------- ### (≈ 2 lines of code)
    AL, cache = linear_activation_forward(A, parameters['W'+str(L)], parameters['b'+str(L)], activation = 'sigmoid')
    caches.append(cache)
    ### ----------END CODE HERE---------- ###
    
    assert(AL.shape == (1,X.shape[1]))
            
    return AL, caches
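A small end-to-end sketch of the forward pass, with made-up sizes (4 input features, 2 examples, layer layout [4, 3, 1]):

np.random.seed(1)
X = np.random.randn(4, 2)
parameters = initialize_parameters_deep([4, 3, 1])
AL, caches = L_model_forward(X, parameters)
print(AL.shape)     # (1, 2): one prediction per example
print(len(caches))  # 2: one cache per layer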

1.3 Compute the Cost Function

Now we start implementing forward and backward propagation and also compute the cost, because we want to know whether the model is really learning.

Compute the cross-entropy cost $J$, using the following formula:

$$J = -\frac{1}{m} \sum\limits_{i = 1}^{m} \left( y^{(i)}\log\left(a^{[L](i)}\right) + (1-y^{(i)})\log\left(1- a^{[L](i)}\right) \right) \tag{7}$$
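A minimal sketch of how formula (7) could be implemented, assuming AL and Y are row vectors of shape (1, m) as in the forward pass above:

def compute_cost(AL, Y):
    # Cross-entropy cost of equation (7).
    # AL -- predictions from L_model_forward, shape (1, number of examples)
    # Y  -- true labels, shape (1, number of examples)
    m = Y.shape[1]
    cost = -(1.0 / m) * np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL))
    cost = np.squeeze(cost)  # make sure the cost is a scalar rather than a 1x1 array
    return cost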
