cs224n naive softmax: derivation and implementation

StarLib

Published 2019-09-09 17:50:34


Category: NLP

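Copyright notice: this is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.

Article link: https://blog.youkuaiyun.com/StarLib/article/details/100669485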

This blog covers the derivation and implementation of naive softmax in the context of CS224n (2019). It starts with an introduction, followed by a detailed explanation of the formula derivation using the chain rule and the gradient of the softmax function. The post concludes with a brief mention of the provided code for the implementation.


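0. Introduction

This post covers the naive softmax parts of Assignment 2 from cs224n (2019): written problem (b), which derives the gradient formula, and coding problem (a), which implements it. The written derivation comes first; the coding part is then a straightforward implementation of those formulas.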

1. Deriving the formulas

First, the background and the problem.

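Part (a) has already shown that the loss function is $J = CE(y, \hat{y})$, and we have $\hat{y} = softmax(U^Tv_c)$. Let $\theta = U^Tv_c$, so that $\hat{y} = softmax(\theta)$. We then use the chain rule to derive the partial derivative of $J$ with respect to $v_c$.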
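As a worked sketch of where this chain rule leads, assume that $y$ is the one-hot vector for the outside word $o$ and that $U$ has shape $d \times |V|$ with the outside vectors as its columns (assumptions consistent with $\theta = U^Tv_c$, but not spelled out in the text above). Under those assumptions:

$$J = -\log \hat{y}_o = -\theta_o + \log\sum_{w}\exp(\theta_w), \qquad \frac{\partial J}{\partial \theta} = \hat{y} - y, \qquad \frac{\partial J}{\partial v_c} = U(\hat{y} - y)$$

The NumPy snippet below is one possible implementation of this result, not the assignment's reference code. The function names `softmax`, `naive_softmax_loss_and_grad`, and `numerical_grad_v_c` are hypothetical, and the finite-difference check is only there to confirm the analytic gradient on random data.

```python
import numpy as np


def softmax(theta):
    """Numerically stable softmax of a 1-D score vector."""
    shifted = theta - np.max(theta)  # subtracting the max does not change the result
    exp = np.exp(shifted)
    return exp / exp.sum()


def naive_softmax_loss_and_grad(v_c, o, U):
    """Loss J = -log(y_hat[o]) and its gradient with respect to v_c.

    Assumed shapes (hypothetical, chosen to match theta = U^T v_c):
      v_c: (d,)   center word vector
      o:   int    index of the true outside word
      U:   (d, V) outside vectors stored as columns
    """
    theta = U.T @ v_c          # scores, shape (V,)
    y_hat = softmax(theta)     # predicted distribution, shape (V,)
    loss = -np.log(y_hat[o])
    delta = y_hat.copy()
    delta[o] -= 1.0            # y_hat - y, with y one-hot at index o
    grad_v_c = U @ delta       # dJ/dv_c = U (y_hat - y), shape (d,)
    return loss, grad_v_c


def numerical_grad_v_c(v_c, o, U, eps=1e-6):
    """Central-difference estimate of dJ/dv_c, used only as a sanity check."""
    grad = np.zeros_like(v_c)
    for i in range(v_c.size):
        step = np.zeros_like(v_c)
        step[i] = eps
        loss_plus = naive_softmax_loss_and_grad(v_c + step, o, U)[0]
        loss_minus = naive_softmax_loss_and_grad(v_c - step, o, U)[0]
        grad[i] = (loss_plus - loss_minus) / (2 * eps)
    return grad


# Small random example: d = 4 dimensions, V = 6 words, outside word index 2.
rng = np.random.default_rng(0)
v_c = rng.normal(size=4)
U = rng.normal(size=(4, 6))
loss, grad_v_c = naive_softmax_loss_and_grad(v_c, 2, U)
max_err = np.max(np.abs(grad_v_c - numerical_grad_v_c(v_c, 2, U)))
print("loss:", loss)
print("max abs difference vs numerical gradient:", max_err)
```

If the outside vectors are stored as rows instead (shape $|V| \times d$), the same formula applies with `U` replaced by its transpose.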
