<think>We are given a task to implement batch normalization in a neural network when the flag is set. The key points are:
- Insert batch normalization before each ReLU nonlinearity.
- Do not normalize the outputs of the last layer.
- We are to implement this in a network that likely has multiple layers.
We are also given a hint to look at `cs231n/layer_utils.py` for helper layers.
Since the problem is about implementing batch norm in a network, we assume we are working within a framework that defines layers (like in the CS231n assignments). We are to modify the network to include batch norm layers appropriately.
We are not provided with the existing network structure, but typically, a network might be built using fully connected layers and ReLU activations. We are to insert a batch norm layer before each ReLU. This means the order of operations would be:
Original: Linear -> ReLU
With BatchNorm: Linear -> BatchNorm -> ReLU
But note: the last layer (output layer) should not have batch norm.
We are also to run gradient checks after implementation.
However, the user's question is not a specific coding problem but a description of a task. Therefore, we will provide a general approach and then show a code snippet that illustrates the insertion of batch norm in a network.
Let's assume we are building a network as a sequence of layers. We can define a helper function that creates a "fully connected -> batch norm -> ReLU" block, and for the last (output) layer we just use a fully connected layer without batch norm and without ReLU.
Example for a two-layer network:
Input -> FC1 -> ReLU -> FC2 (output)
With batch norm (for the hidden layer):
Input -> FC1 -> BatchNorm -> ReLU -> FC2 (output)
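To make the ordering concrete, here is a minimal forward-pass sketch for this two-layer case. It assumes the standard cs231n layer functions (`affine_forward`, `batchnorm_forward`, `relu_forward`) and already-initialized parameters `W1, b1, gamma1, beta1, W2, b2`; the returned caches are ignored for brevity.
```python
# Hidden layer: affine -> batch norm -> ReLU
h, _ = affine_forward(X, W1, b1)
h, _ = batchnorm_forward(h, gamma1, beta1, {'mode': 'train'})
h, _ = relu_forward(h)
# Output layer: affine only (no batch norm, no ReLU)
scores, _ = affine_forward(h, W2, b2)
```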
We can create a class that takes parameters including the normalization flag.
We'll write a simplified version of such a network class.
Note: The following code is for illustration and might need to be adapted to the specific assignment structure.
```python
import numpy as np

from cs231n.layers import *
from cs231n.layer_utils import *


class FullyConnectedNet(object):
    """
    A fully-connected neural network with an arbitrary number of hidden layers,
    ReLU nonlinearities, and a softmax loss function. This will also implement
    dropout and batch/layer normalization as options. For a network with L layers,
    the architecture will be:

    {affine - [batch/layer norm] - relu - [dropout]} x (L - 1) - affine - softmax

    where batch/layer normalization and dropout are optional, and the {...} block is
    repeated L - 1 times.

    Learnable parameters are stored in the self.params dictionary and will be learned
    using the Solver class.
    """

    def __init__(self, hidden_dims, input_dim=3*32*32, num_classes=10,
                 dropout=1, normalization=None, reg=0.0,
                 weight_scale=1e-2, dtype=np.float32, seed=None):
        """
        Initialize a new FullyConnectedNet.

        Inputs:
        - hidden_dims: A list of integers giving the size of each hidden layer.
        - input_dim: An integer giving the size of the input.
        - num_classes: An integer giving the number of classes to classify.
        - dropout: Scalar between 0 and 1 giving dropout strength. If dropout=1 then
          no dropout is implemented.
        - normalization: What type of normalization the network should use. Valid values
          are "batchnorm", "layernorm", or None for no normalization (default).
        - reg: Scalar giving L2 regularization strength.
        - weight_scale: Scalar giving the standard deviation for random
          initialization of the weights.
        - dtype: A numpy datatype object; all computations will be performed using
          this datatype. float32 is faster but less accurate, so use float64 for
          numeric gradient checking.
        - seed: If not None, then pass this random seed to the dropout layers. This
          will make the dropout layers deterministic so we can gradient check the
          model.
        """
        self.normalization = normalization
        self.use_dropout = dropout != 1
        self.reg = reg
        self.num_layers = 1 + len(hidden_dims)
        self.dtype = dtype
        self.params = {}

        # Initialize the parameters of the network
        dims = [input_dim] + hidden_dims + [num_classes]
        for i in range(self.num_layers):
            # For layer i, the weight matrix has shape (dims[i], dims[i+1])
            # and the bias has shape (dims[i+1],)
            self.params['W' + str(i+1)] = weight_scale * np.random.randn(dims[i], dims[i+1])
            self.params['b' + str(i+1)] = np.zeros(dims[i+1])
            # If using batch normalization and this is not the last layer
            if normalization == 'batchnorm' and i < self.num_layers - 1:
                # For each hidden layer, we have gamma and beta parameters for batch norm
                self.params['gamma' + str(i+1)] = np.ones(dims[i+1])
                self.params['beta' + str(i+1)] = np.zeros(dims[i+1])

        # When using dropout we need to pass a dropout_param dictionary to each
        # dropout layer so that the layer knows the dropout probability and the mode
        # (train / test). You can pass the same dropout_param to each dropout layer.
        self.dropout_param = {}
        if self.use_dropout:
            self.dropout_param = {'mode': 'train', 'p': dropout}
            if seed is not None:
                self.dropout_param['seed'] = seed

        # With batch normalization we need to keep track of running means and
        # variances, so we need to pass a special bn_param object to each batch
        # normalization layer. You should pass self.bn_params[0] to the forward pass
        # of the first batch normalization layer, self.bn_params[1] to the forward
        # pass of the second, and so on.
        self.bn_params = []
        if self.normalization == 'batchnorm':
            self.bn_params = [{'mode': 'train'} for i in range(self.num_layers - 1)]
        if self.normalization == 'layernorm':
            self.bn_params = [{} for i in range(self.num_layers - 1)]

        # Cast all parameters to the correct datatype
        for k, v in self.params.items():
            self.params[k] = v.astype(dtype)

    def loss(self, X, y=None):
        """
        Compute loss and gradient for the fully-connected net.

        Input / output: Same as TwoLayerNet above.
        """
        X = X.astype(self.dtype)
        mode = 'test' if y is None else 'train'

        # Set train/test mode for batchnorm params and dropout param since they
        # behave differently during training and testing.
        if self.use_dropout:
            self.dropout_param['mode'] = mode
        if self.normalization == 'batchnorm':
            for bn_param in self.bn_params:
                bn_param['mode'] = mode

        scores = None

        # Forward pass: for each layer
        cache = {}
        out = X
        # For layers 0 to L-2: affine -> batch norm (if applicable) -> relu -> dropout (if applicable)
        for i in range(self.num_layers - 1):
            W = self.params['W' + str(i+1)]
            b = self.params['b' + str(i+1)]
            # Affine layer
            out, cache_affine = affine_forward(out, W, b)
            cache['affine' + str(i+1)] = cache_affine
            # Batch normalization
            if self.normalization == 'batchnorm':
                gamma = self.params['gamma' + str(i+1)]
                beta = self.params['beta' + str(i+1)]
                out, cache_bn = batchnorm_forward(out, gamma, beta, self.bn_params[i])
                cache['bn' + str(i+1)] = cache_bn
            # ReLU
            out, cache_relu = relu_forward(out)
            cache['relu' + str(i+1)] = cache_relu
            # Dropout
            if self.use_dropout:
                out, cache_dropout = dropout_forward(out, self.dropout_param)
                cache['dropout' + str(i+1)] = cache_dropout

        # Last layer: affine forward only (no batch norm, no ReLU, no dropout)
        W = self.params['W' + str(self.num_layers)]
        b = self.params['b' + str(self.num_layers)]
        scores, cache_affine = affine_forward(out, W, b)
        cache['affine' + str(self.num_layers)] = cache_affine

        # If test mode return early
        if mode == 'test':
            return scores

        loss, grads = 0.0, {}
        # Compute the loss with softmax and add L2 regularization over all weights
        loss, dscores = softmax_loss(scores, y)
        for i in range(self.num_layers):
            W = self.params['W' + str(i+1)]
            loss += 0.5 * self.reg * np.sum(W * W)

        # Backward pass: start from the last layer
        dout = dscores
        # Backprop the last affine layer
        dout, dW, db = affine_backward(dout, cache['affine' + str(self.num_layers)])
        grads['W' + str(self.num_layers)] = dW + self.reg * self.params['W' + str(self.num_layers)]
        grads['b' + str(self.num_layers)] = db

        # Backprop through the remaining layers in reverse order
        for i in range(self.num_layers - 2, -1, -1):
            # If dropout was used, backprop through dropout
            if self.use_dropout:
                dout = dropout_backward(dout, cache['dropout' + str(i+1)])
            # Backprop through ReLU
            dout = relu_backward(dout, cache['relu' + str(i+1)])
            # Backprop through batch norm if used
            if self.normalization == 'batchnorm':
                dout, dgamma, dbeta = batchnorm_backward(dout, cache['bn' + str(i+1)])
                grads['gamma' + str(i+1)] = dgamma
                grads['beta' + str(i+1)] = dbeta
            # Backprop through affine layer
            dout, dW, db = affine_backward(dout, cache['affine' + str(i+1)])
            grads['W' + str(i+1)] = dW + self.reg * self.params['W' + str(i+1)]
            grads['b' + str(i+1)] = db

        return loss, grads
```
This code is a simplified version of a fully connected network that supports batch normalization. The key points are:
1. In the constructor, we initialize weights, biases, and if batch norm is used, we also initialize gamma and beta for each hidden layer (not for the output layer).
2. In the forward pass, for each hidden layer we do:
- Affine transformation
- Batch normalization (if enabled)
- ReLU activation
- Dropout (if enabled)
3. The last layer is only an affine transformation.
4. In the backward pass, we backpropagate gradients in reverse order, handling each layer appropriately.
After implementing, we should run gradient checks to ensure correctness.
Note: The actual implementation might require functions from `cs231n/layers.py` and `cs231n/layer_utils.py` (like `affine_forward`, `affine_backward`, `batchnorm_forward`, `batchnorm_backward`, etc.) to be correctly implemented.
Now, we must also provide 3-5 related questions at the end as per the instructions.</think>

### Answer
When `normalization="batchnorm"` is set for the network, a batch normalization (BatchNorm) layer must be inserted **before** each ReLU nonlinearity, but **the outputs of the last layer are not normalized**. The implementation steps are as follows:
1. **Modify the network structure**
For each hidden layer, change the original `Affine → ReLU` structure to:
`Affine → BatchNorm → ReLU`
**Exception**: the last layer keeps only `Affine` (no BatchNorm and no ReLU).
2. **Key points of the code implementation**
Taking fully connected layers as an example, the pseudocode is as follows:
```python
# Hidden layers (with BatchNorm); the real layer functions also return a cache,
# which is ignored here for brevity
for layer in layers[:-1]:
    out, _ = affine_forward(out, W, b)                      # fully connected layer
    out, _ = batchnorm_forward(out, gamma, beta, bn_param)  # batch normalization
    out, _ = relu_forward(out)                              # ReLU activation

# Last layer (no BatchNorm)
scores, _ = affine_forward(out, W_last, b_last)
```
3. **Gradient check**
After implementing, verify the gradients numerically:
```python
from cs231n.gradient_check import grad_check_sparse
# Gradient-check the BatchNorm parameters (gamma/beta) as well as the weights
```
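For reference, a fuller gradient-check sketch might look like the following. It assumes the `FullyConnectedNet` class above, the standard `eval_numerical_gradient` helper from `cs231n/gradient_check.py`, and hypothetical small input sizes chosen only so the numeric check runs quickly.
```python
import numpy as np
from cs231n.gradient_check import eval_numerical_gradient

def rel_error(a, b):
    # Relative error between the numeric and analytic gradients.
    return np.max(np.abs(a - b) / (np.maximum(1e-8, np.abs(a) + np.abs(b))))

# Hypothetical toy problem sizes for a quick check.
np.random.seed(231)
N, D, H1, H2, C = 2, 15, 20, 30, 10
X = np.random.randn(N, D)
y = np.random.randint(C, size=(N,))

# float64 makes the numeric gradients accurate enough for checking.
model = FullyConnectedNet([H1, H2], input_dim=D, num_classes=C,
                          normalization='batchnorm', reg=0.0,
                          weight_scale=5e-2, dtype=np.float64)
loss, grads = model.loss(X, y)

for name in sorted(grads):
    f = lambda _: model.loss(X, y)[0]
    grad_num = eval_numerical_gradient(f, model.params[name], verbose=False, h=1e-5)
    print('%s relative error: %.2e' % (name, rel_error(grad_num, grads[name])))
```
Relative errors around 1e-6 or smaller for the weights and around 1e-8 for `gamma`/`beta` are typically a good sign that the implementation is correct.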
4. **Helper layer design**
Following `affine_relu_forward` in `cs231n/layer_utils.py`, a new helper layer can be created:
```python
def affine_bn_relu_forward(x, W, b, gamma, beta, bn_param):
    a, fc_cache = affine_forward(x, W, b)
    bn, bn_cache = batchnorm_forward(a, gamma, beta, bn_param)
    out, relu_cache = relu_forward(bn)
    cache = (fc_cache, bn_cache, relu_cache)
    return out, cache
```
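A matching backward helper could look like this. It is a sketch assuming the standard `affine_backward`, `batchnorm_backward`, and `relu_backward` signatures from `cs231n/layers.py`.
```python
def affine_bn_relu_backward(dout, cache):
    # Unpack the caches saved by affine_bn_relu_forward.
    fc_cache, bn_cache, relu_cache = cache
    dbn = relu_backward(dout, relu_cache)
    da, dgamma, dbeta = batchnorm_backward(dbn, bn_cache)
    dx, dw, db = affine_backward(da, fc_cache)
    return dx, dw, db, dgamma, dbeta
```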
**Notes**:
- During training, the BatchNorm layers update `running_mean/running_var`; at test time they use these accumulated statistics instead of the batch statistics (see the mode-switching sketch below).
- During backpropagation, the gradient chain through `batchnorm_backward` must be handled, including the gradients for `gamma` and `beta`.
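The train/test switch happens through the `mode` entry of each `bn_param` dictionary, which also stores the running statistics. A minimal sketch, assuming the standard cs231n `batchnorm_forward` signature and hypothetical toy shapes:
```python
import numpy as np
from cs231n.layers import batchnorm_forward

# Hypothetical sizes and parameters purely for illustration.
N, D = 4, 5
x = np.random.randn(N, D)
gamma, beta = np.ones(D), np.zeros(D)

# Training mode: normalize with the batch statistics and update
# running_mean / running_var stored inside bn_param.
bn_param = {'mode': 'train'}
out_train, cache = batchnorm_forward(x, gamma, beta, bn_param)

# Test mode: reuse the accumulated running statistics instead of batch statistics.
bn_param['mode'] = 'test'
out_test, _ = batchnorm_forward(x, gamma, beta, bn_param)
```
This is why `FullyConnectedNet.loss` sets `bn_param['mode'] = mode` for every entry of `self.bn_params` before running the forward pass.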