import torch.nn as nn
import torch.nn.functional as F


class PositionwiseFeedForward(nn.Module):
    "Implements FFN equation."
    def __init__(self, d_model, d_ff, dropout=0.1):
        super(PositionwiseFeedForward, self).__init__()
        self.w_1 = nn.Linear(d_model, d_ff)   # analysis point 1
        self.w_2 = nn.Linear(d_ff, d_model)
        self.dropout = nn.Dropout(dropout)    # analysis point 2

    def forward(self, x):
        return self.w_2(self.dropout(F.relu(self.w_1(x))))
Source code analysis
1. Analysis point 1: self.w_1 = nn.Linear(d_model, d_ff)
Here d_model is the embedding dimension, typically 512.
d_ff is the inner-layer dimension: 2048.
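As a quick check of these dimensions (a minimal sketch; the batch size and sequence length below are arbitrary choices, not from the original), the sub-layer projects each position from d_model=512 up to d_ff=2048 and back down to 512:

```python
import torch
import torch.nn as nn

d_model, d_ff = 512, 2048            # values quoted in the analysis above

w_1 = nn.Linear(d_model, d_ff)       # expansion: 512 -> 2048
w_2 = nn.Linear(d_ff, d_model)       # projection back: 2048 -> 512

x = torch.randn(2, 10, d_model)      # (batch, seq_len, d_model), arbitrary sizes
hidden = torch.relu(w_1(x))          # inner layer: (2, 10, 2048)
out = w_2(hidden)                    # restored shape: (2, 10, 512)
print(hidden.shape, out.shape)
```

Because nn.Linear acts on the last dimension only, the same FFN is applied independently at every position, which is exactly what "position-wise" means here.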
2. Analysis point 2: nn.Dropout(dropout)
Reference: https://blog.youkuaiyun.com/weixin_42979152/article/details/113769291
Note the distinction between nn.Dropout(dr
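The note above is cut off, so independently of whatever contrast it intended, here is a minimal sketch (my own example, not from the linked post) of one property of nn.Dropout worth knowing: as a module it follows train/eval mode, zeroing elements with probability p and rescaling survivors by 1/(1-p) during training, while acting as the identity after .eval():

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)     # each element zeroed with probability 0.5
x = torch.ones(8)

drop.train()                 # training mode: dropout is active
y_train = drop(x)            # survivors scaled by 1/(1-p) = 2.0, rest set to 0.0

drop.eval()                  # eval mode: dropout is a no-op
y_eval = drop(x)

print(y_train)               # mix of 0.0 and 2.0
print(y_eval)                # unchanged input
```

This mode-awareness is why the layer is stored as self.dropout in __init__ rather than being called functionally: model.eval() then disables it automatically at inference time.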

This article takes an in-depth look at how the PositionwiseFeedForward layer works in PyTorch, covering its internal structure, the role of the Linear layers, and the implementation details of the Dropout layer.