RNN interface parameters in PyTorch

weixin_42924890

Last modified 2024-03-21 09:27:57

Licensed under CC 4.0 BY-SA

Tags: pytorch, rnn, deep learning

First published 2024-03-08 16:42:45

Article link: https://blog.youkuaiyun.com/weixin_42924890/article/details/136565982

A detailed look at the interface parameters of the RNN module in torch

rnn = torch.nn.RNN(
    input_size: int,
    hidden_size: int,
    num_layers: int = 1,
    nonlinearity: str = 'tanh',
    bias: bool = True,
    batch_first: bool = False,
    dropout: float = 0.0,
    bidirectional: bool = False,
)

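input_size (int): the feature dimension of each time step in the input sequence (in NLP, the word-embedding dimension).

hidden_size (int): the dimension of the hidden state (the memory unit).

num_layers (int, default 1): the number of stacked RNN layers.

nonlinearity (str, default 'tanh'): the activation function, either 'tanh' or 'relu'. Standard RNNs usually use 'tanh'.

bias (bool, default True): whether to include bias terms in the computation.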
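To make the roles of input_size, hidden_size, num_layers and bias concrete, the following minimal sketch lists the learnable parameters of an nn.RNN. The sizes (8, 16, 2 layers) are arbitrary values chosen for illustration and do not come from the original article.

```python
import torch.nn as nn

# Illustrative sizes only (assumed, not from the article).
rnn = nn.RNN(input_size=8, hidden_size=16, num_layers=2, bias=True)

# Map each parameter name to its shape.
shapes = {name: tuple(p.shape) for name, p in rnn.named_parameters()}
for name, shape in shapes.items():
    print(name, shape)

# With bias=False the bias_ih_* / bias_hh_* tensors are not created at all.
rnn_no_bias = nn.RNN(input_size=8, hidden_size=16, num_layers=2, bias=False)
names_no_bias = [name for name, _ in rnn_no_bias.named_parameters()]
print(names_no_bias)
```

The first layer's input weight has shape (hidden_size, input_size), while deeper layers take the previous layer's hidden state as input, so their input weight is (hidden_size, hidden_size).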

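batch_first (bool, default False): if True, the first dimension of the input and output tensors is the batch size rather than the time step, i.e. the layout is (batch_size, seq_len, input_size) instead of (seq_len, batch_size, input_size). This does not apply to the hidden state tensor.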
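A small shape check can illustrate the two layouts. This is a sketch with made-up sizes; the variable names are chosen here for illustration only.

```python
import torch
import torch.nn as nn

# Illustrative sizes only (assumed, not from the article).
batch_size, seq_len, input_size, hidden_size = 4, 10, 8, 16

# The same data in time-major and batch-major layout.
x_time_major = torch.randn(seq_len, batch_size, input_size)
x_batch_major = x_time_major.transpose(0, 1).contiguous()

rnn_tm = nn.RNN(input_size, hidden_size)                    # batch_first=False
rnn_bm = nn.RNN(input_size, hidden_size, batch_first=True)  # batch_first=True

out_tm, h_tm = rnn_tm(x_time_major)
out_bm, h_bm = rnn_bm(x_batch_major)

print(out_tm.shape)  # (seq_len, batch_size, hidden_size)
print(out_bm.shape)  # (batch_size, seq_len, hidden_size)
print(h_tm.shape, h_bm.shape)  # both (1, batch_size, hidden_size)
```

Only the input and output tensors change layout; the returned hidden state keeps the (num_layers * num_directions, batch_size, hidden_size) shape in both cases.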

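dropout (float, default 0.0): the dropout probability applied to the output of every RNN layer except the last, used as regularization against overfitting. It acts between stacked layers, not on the hidden-to-hidden recurrence inside a layer, so it only takes effect when num_layers > 1.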
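The behaviour of dropout can be probed with a short experiment. The sketch below assumes arbitrary sizes and checks three things: that PyTorch warns when dropout is non-zero with a single layer, that outputs vary between passes in training mode, and that they are reproducible in eval mode.

```python
import warnings
import torch
import torch.nn as nn

# A non-zero dropout with num_layers=1 has nothing to act on; PyTorch warns.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    nn.RNN(input_size=8, hidden_size=16, num_layers=1, dropout=0.5)
single_layer_warned = len(caught) > 0

# With two layers, dropout sits between layer 0 and layer 1.
rnn = nn.RNN(input_size=8, hidden_size=16, num_layers=2, dropout=0.5)
x = torch.randn(10, 4, 8)  # (seq_len, batch_size, input_size), illustrative

rnn.train()  # dropout active: two passes use different random masks
train_differs = not torch.equal(rnn(x)[0], rnn(x)[0])

rnn.eval()   # dropout disabled: passes are deterministic
with torch.no_grad():
    eval_same = torch.equal(rnn(x)[0], rnn(x)[0])

print(single_layer_warned, train_differs, eval_same)
```

Calling rnn.eval() before inference matters for the same reason: otherwise the inter-layer dropout would still be applied.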

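bidirectional (bool, default False): if True, a bidirectional RNN is created, so the model can use both past and future context.
Note that torch.nn.RNN supports bidirectional mode directly through this flag. PyTorch has no torch.nn.Bidirectional wrapper (a Bidirectional wrapper layer exists in Keras, not in torch.nn).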
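As a quick check, bidirectional=True can be passed straight to nn.RNN. The sketch below uses illustrative sizes and splits the output into its forward and backward halves to show where each direction's final hidden state sits.

```python
import torch
import torch.nn as nn

# Illustrative sizes only (assumed, not from the article).
batch_size, seq_len, input_size, hidden_size = 4, 10, 8, 16

rnn = nn.RNN(input_size, hidden_size, batch_first=True, bidirectional=True)
x = torch.randn(batch_size, seq_len, input_size)
outputs, hn = rnn(x)

print(outputs.shape)  # (batch_size, seq_len, 2 * hidden_size)
print(hn.shape)       # (2, batch_size, hidden_size)

# The last dimension is [forward states | backward states].
fwd = outputs[..., :hidden_size]
bwd = outputs[..., hidden_size:]

# The forward direction finishes at the last time step,
# the backward direction finishes at the first time step.
fwd_matches = torch.allclose(hn[0], fwd[:, -1, :])
bwd_matches = torch.allclose(hn[1], bwd[:, 0, :])
print(fwd_matches, bwd_matches)
```

In the bidirectional case, outputs[:, -1, :] therefore mixes a fully processed forward state with a backward state that has seen only one time step.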

outputs, hn = rnn(...)

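outputs: Tensor of shape (batch_size, seq_len, num_directions * hidden_size) if batch_first=True, otherwise (seq_len, batch_size, num_directions * hidden_size).
The output of the last RNN layer at every time step. num_directions is 2 for a bidirectional RNN and 1 otherwise; in the bidirectional case the output is the concatenation of the forward and backward hidden states.

hn: Tensor of shape (num_layers * num_directions, batch_size, hidden_size), regardless of batch_first.
The final hidden state of each layer (for a bidirectional RNN, one per direction). For a unidirectional RNN with batch_first=True, the last layer's entry hn[-1] equals outputs[:, -1, :]; the final states of lower layers do not appear in outputs.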
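The relation between hn and the per-step outputs can be verified numerically. This is a minimal sketch for a two-layer unidirectional RNN with batch_first=True; the sizes are assumed for illustration.

```python
import torch
import torch.nn as nn

# Two stacked unidirectional layers; sizes are illustrative.
rnn = nn.RNN(input_size=8, hidden_size=16, num_layers=2, batch_first=True)
x = torch.randn(4, 10, 8)  # (batch_size, seq_len, input_size)
outputs, hn = rnn(x)

last_step = outputs[:, -1, :]  # top layer's output at the final time step

# hn[-1] is the top layer's final hidden state, so it matches last_step.
top_matches = torch.allclose(hn[-1], last_step)

# hn[0] belongs to the first layer and is not part of outputs.
bottom_matches = torch.allclose(hn[0], last_step)

print(hn.shape, top_matches, bottom_matches)
```

With batch_first=False the same comparison would use outputs[-1] instead of outputs[:, -1, :].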

Instantiating a unidirectional RNN

import torch.nn as nn
import torch

batch_size
