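Problem
A recurrent neural network (RNN) is a neural network designed for sequence data such as time series: it processes the input one step at a time and carries a hidden state from each step to the next.
An RNN proceeds in the following steps: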
- Compute the hidden state update:
  $$h_t = \tanh(W_x \cdot x_t + W_h \cdot h_{t-1} + b)$$
- Compute the output:
  $$y_t = W_y \cdot h_t + b_y$$
- Compute the loss:
  $$\mathrm{loss} = \sum_{t=1}^{T} (y_t - \hat{y}_t)^2$$
- Backpropagate:
  $$\frac{\partial\, \mathrm{loss}}{\partial W_h} = \sum_{t=1}^{T} \frac{\partial\, \mathrm{loss}}{\partial y_t} \cdot \frac{\partial y_t}{\partial h_t} \cdot \frac{\partial h_t}{\partial W_h}$$
This problem only asks you to implement the forward pass; backpropagation is not required.
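
Although the problem does not ask for them, the output and loss steps listed above can be sketched in NumPy as follows. This is only an illustration: the function name `rnn_outputs_and_loss` and the parameters `Wy`, `by`, and `targets` are assumptions, not part of the problem's interface. The sketch also assumes that $\hat{y}_t$ is the target at step $t$ and that the squared error is summed over all vector components.

```python
import numpy as np

def rnn_outputs_and_loss(inputs, h0, Wx, Wh, b, Wy, by, targets):
    """Hypothetical helper: forward pass plus per-step outputs and squared-error loss."""
    h = np.asarray(h0, dtype=float)
    Wx = np.asarray(Wx, dtype=float)
    Wh = np.asarray(Wh, dtype=float)
    b = np.asarray(b, dtype=float)
    Wy = np.asarray(Wy, dtype=float)
    by = np.asarray(by, dtype=float)

    outputs = []
    loss = 0.0
    for x, target in zip(inputs, targets):
        # Hidden state update: h_t = tanh(W_x x_t + W_h h_{t-1} + b)
        h = np.tanh(Wx @ np.asarray(x, dtype=float) + Wh @ h + b)
        # Output: y_t = W_y h_t + b_y
        y = Wy @ h + by
        outputs.append(y)
        # Loss contribution: squared difference to the (assumed) target
        loss += float(np.sum((y - np.asarray(target, dtype=float)) ** 2))
    return outputs, loss
```

The hidden state is reused across iterations, so each output depends on every earlier input in the sequence.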
The reference solution is as follows:
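import numpy as np

def rnn_forward(input_sequence, initial_hidden_state, Wx, Wh, b):
    # Convert the list inputs to NumPy arrays
    h = np.array(initial_hidden_state)
    Wx = np.array(Wx)
    Wh = np.array(Wh)
    b = np.array(b)
    # Apply the hidden state update once per time step
    for x in input_sequence:
        x = np.array(x)
        h = np.tanh(np.dot(Wx, x) + np.dot(Wh, h) + b)
    # Round to 4 decimal places and return a plain Python list
    final_hidden_state = np.round(h, 4)
    return final_hidden_state.tolist()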
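
To see how the hidden state evolves step by step, a variant that records every intermediate state can help when debugging. The sketch below is not part of the reference solution: the name `rnn_hidden_states` and the one-dimensional sample values are assumptions chosen so the result can be checked by hand.

```python
import numpy as np

def rnn_hidden_states(input_sequence, initial_hidden_state, Wx, Wh, b):
    """Hypothetical variant: return the rounded hidden state after every time step."""
    h = np.asarray(initial_hidden_state, dtype=float)
    Wx = np.asarray(Wx, dtype=float)
    Wh = np.asarray(Wh, dtype=float)
    b = np.asarray(b, dtype=float)

    states = []
    for x in input_sequence:
        h = np.tanh(Wx @ np.asarray(x, dtype=float) + Wh @ h + b)
        # Round only the recorded copy; the unrounded h feeds the next step
        states.append(np.round(h, 4).tolist())
    return states

# One-dimensional example: 3 time steps, hidden size 1 (illustrative values)
states = rnn_hidden_states(
    input_sequence=[[1.0], [2.0], [3.0]],
    initial_hidden_state=[0.0],
    Wx=[[0.5]],
    Wh=[[0.8]],
    b=[0.0],
)
print(states)
```

The last entry of `states` corresponds to what a final-state-only forward pass would return for the same inputs.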