LSTNet

Code

https://github.com/laiguokun/LSTNet

Paper

Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks.

LSTNet


Explanation of selected parameters

Parameter              Default            Description
model (str)            'LSTNet'
hidCNN (int)           100                number of CNN hidden units
hidRNN (int)           100                number of RNN hidden units
window (int)           24 * 7             window size
CNN_kernel (int)       6                  kernel size of the CNN layers
highway_window (int)   24                 window size of the highway component
clip (float)           10.                gradient clipping
epochs (int)           100                upper epoch limit
batch_size (int)       32                 batch size
dropout (float)        0.2                dropout applied to layers (0 = no dropout)
seed (int)             54321              random seed
gpu (int)              None
log_interval (int)     2000               report interval
save (str)             'model/model.pt'   path to save the final model
cuda (str)             True
optim (str)            'adam'
lr (float)             0.001
horizon (int)          12
skip (float)           24
hidSkip (int)          5
L1Loss (bool)          True
normalize (int)        2
output_fun (str)       'sigmoid'
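
A few of these defaults could be declared with argparse roughly as follows. This is only a sketch covering a subset of the table: `build_parser` is a hypothetical helper name, and the argument parsing in the repository may be organized differently.

```python
import argparse


def build_parser():
    # Hypothetical helper: names, types, and defaults are copied from the table.
    parser = argparse.ArgumentParser(description="LSTNet hyperparameters (sketch)")
    parser.add_argument("--hidCNN", type=int, default=100,
                        help="number of CNN hidden units")
    parser.add_argument("--hidRNN", type=int, default=100,
                        help="number of RNN hidden units")
    parser.add_argument("--window", type=int, default=24 * 7,
                        help="window size")
    parser.add_argument("--CNN_kernel", type=int, default=6,
                        help="kernel size of the CNN layers")
    parser.add_argument("--highway_window", type=int, default=24,
                        help="window size of the highway component")
    # The table lists skip as a float, so values given on the command line
    # are parsed as float (for example "12" becomes 12.0).
    parser.add_argument("--skip", type=float, default=24)
    parser.add_argument("--hidSkip", type=int, default=5)
    parser.add_argument("--dropout", type=float, default=0.2,
                        help="dropout applied to layers (0 = no dropout)")
    return parser


# An empty argument list means only the defaults are used.
args = build_parser().parse_args([])
print(args.window, args.CNN_kernel, args.hidSkip)
```

Because `skip` is declared as a float, code that uses it as a tensor dimension may need an integer conversion first.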

model

This is the implementation provided by the authors.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self, args, data):
        super(Model, self).__init__()
        self.use_cuda = args.cuda
        self.P = args.window  # input window size
        self.m = data.m  # number of columns (variables)
        self.hidR = args.hidRNN
        self.hidC = args.hidCNN  # number of convolution filters
        self.hidS = args.hidSkip
        self.Ck = args.CNN_kernel  # convolution kernel size
        self.skip = int(args.skip)  # cast to int: the argument is declared as float
        self.pt = (self.P - self.Ck) // self.skip  # time steps per skip sequence
        self.hw = args.highway_window
        self.conv1 = nn.Conv2d(1, self.hidC, kernel_size=(self.Ck, self.m))
        self.GRU1 = nn.GRU(self.hidC, self.hidR)
        self.dropout = nn.Dropout(p=args.dropout)
        if self.skip > 0:
            self.GRUskip = nn.GRU(self.hidC, self.hidS)
            self.linear1 = nn.Linear(self.hidR + self.skip * self.hidS, self.m)
        else:
            self.linear1 = nn.Linear(self.hidR, self.m)
        if self.hw > 0:
            self.highway = nn.Linear(self.hw, 1)
        self.output = None
        # torch.sigmoid / torch.tanh replace the deprecated F.sigmoid / F.tanh
        if args.output_fun == 'sigmoid':
            self.output = torch.sigmoid
        if args.output_fun == 'tanh':
            self.output = torch.tanh
 
    def forward(self, x):
        batch_size = x.size(0)  # x: [batch, window, n_val]
        
        # CNN
        c = x.view(-1, 1, self.P, self.m)  # c: [batch, 1, window, n_val]
        c = F.relu(self.conv1(c))  # c: [batch, hidCNN, window-kernelsize+1, 1]
        c = self.dropout(c)
        c = torch.squeeze(c, 3)  # c: [batch, hidCNN, window-kernelsize+1]
        
        # RNN 
        r = c.permute(2, 0, 1).contiguous()  # r: [window-kernelsize+1, batch, hidCNN]
        _, r = self.GRU1(r)  # r: [1, batch, hidRNN]
        r = self.dropout(torch.squeeze(r,0))  # r: [batch, hidRNN]

        
        # skip-rnn
        
        if (self.skip > 0):
            s = c[:,:, int(-self.pt * self.skip):].contiguous()  # s: [batch, hidCNN, pt*skip]
            s = s.view(batch_size, self.hidC, self.pt, self.skip)  # s: [batch, hidCNN, pt, skip]
            s = s.permute(2,0,3,1).contiguous()  # s: [pt, batch, skip, hidCNN]
            s = s.view(self.pt, batch_size * self.skip, self.hidC)   # s: [pt, batch * skip, hidCNN]
            _, s = self.GRUskip(s)   # s: [1, batch * skip, hidSkip]
            s = s.view(batch_size, self.skip * self.hidS)   # s: [batch, skip * hidSkip]
            s = self.dropout(s)
            r = torch.cat((r,s),1)  # r: [batch, skip * hidSkip + hidRNN]
        
        res = self.linear1(r)  # res: [batch, n_val]
        
        # highway
        
        if (self.hw > 0):
            z = x[:, -self.hw:, :]  # z: [batch, hw, n_val]
            z = z.permute(0,2,1).contiguous().view(-1, self.hw)  # z: [batch*n_val, hw]
            z = self.highway(z)  # z: [batch*n_val, 1]
            z = z.view(-1,self.m) # z: [batch, n_val]
            res = res + z  # res: [batch, n_val]
            
        if (self.output):
            res = self.output(res)
        return res
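
The shape comments in forward() can be checked with plain arithmetic. The sketch below is not part of the authors' code: `lstnet_shapes` and `skip_time_indices` are hypothetical helpers, and `n_val=8` is an arbitrary value chosen for illustration. The second helper lists which convolution time steps end up in each sequence fed to GRUskip after the slice, view, and permute calls.

```python
def lstnet_shapes(batch, window, n_val, hidCNN, hidRNN, hidSkip, kernel, skip):
    """Return the tensor shapes noted in the comments of forward()."""
    conv_len = window - kernel + 1   # time steps left after the convolution
    pt = (window - kernel) // skip   # steps per skip sequence (self.pt)
    return {
        "conv_out": (batch, hidCNN, conv_len),
        "gru_in": (conv_len, batch, hidCNN),
        "skip_in": (pt, batch * skip, hidCNN),
        "skip_out": (batch, skip * hidSkip),
        "linear_in": (batch, hidRNN + skip * hidSkip),
        "output": (batch, n_val),
    }


def skip_time_indices(window, kernel, skip):
    """For each of the `skip` phases, list the conv time steps given to GRUskip."""
    conv_len = window - kernel + 1
    pt = (window - kernel) // skip
    start = conv_len - pt * skip     # the slice c[:, :, -pt*skip:] begins here
    return [[start + i * skip + j for i in range(pt)] for j in range(skip)]


# Defaults from the parameter table; n_val=8 is a made-up number of variables.
shapes = lstnet_shapes(batch=32, window=24 * 7, n_val=8, hidCNN=100,
                       hidRNN=100, hidSkip=5, kernel=6, skip=24)
indices = skip_time_indices(window=24 * 7, kernel=6, skip=24)

for name, shape in shapes.items():
    print(name, shape)
print("first skip sequence:", indices[0])
```

Each skip sequence steps through the convolution output with a stride of `skip`, so with hourly data and skip = 24 a sequence would contain the same hour on consecutive days.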

The code uses a GRU as the RNN unit:
$$
\begin{array}{ll}
r = \sigma(W_{ir} x + b_{ir} + W_{hr} h + b_{hr}) \\
z = \sigma(W_{iz} x + b_{iz} + W_{hz} h + b_{hz}) \\
n = \tanh(W_{in} x + b_{in} + r * (W_{hn} h + b_{hn})) \\
h' = (1 - z) * n + z * h
\end{array}
$$
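
To make the update rule concrete, here is a minimal pure-Python sketch of a single GRU step that follows the four equations above. It is an illustration, not how nn.GRU is implemented: `gru_step`, `affine`, and `zero_params` are hypothetical names, and the parameters are stored in a plain dict keyed by the symbols in the equations.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, x, b):
    """Compute W x + b for a list-of-rows matrix W and list vectors x, b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def gru_step(x, h, p):
    """One GRU update; p maps names such as 'W_ir' or 'b_hn' to parameters."""
    r = [sigmoid(a + c) for a, c in zip(affine(p["W_ir"], x, p["b_ir"]),
                                        affine(p["W_hr"], h, p["b_hr"]))]
    z = [sigmoid(a + c) for a, c in zip(affine(p["W_iz"], x, p["b_iz"]),
                                        affine(p["W_hz"], h, p["b_hz"]))]
    from_x = affine(p["W_in"], x, p["b_in"])
    from_h = affine(p["W_hn"], h, p["b_hn"])
    # The reset gate r scales only the hidden-state contribution.
    n = [math.tanh(a + ri * c) for a, ri, c in zip(from_x, r, from_h)]
    # h' = (1 - z) * n + z * h, element by element.
    return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def zero_params(input_size, hidden_size):
    """All-zero parameters, handy for checking the equations by hand."""
    p = {}
    for gate in ("r", "z", "n"):
        p["W_i" + gate] = [[0.0] * input_size for _ in range(hidden_size)]
        p["W_h" + gate] = [[0.0] * hidden_size for _ in range(hidden_size)]
        p["b_i" + gate] = [0.0] * hidden_size
        p["b_h" + gate] = [0.0] * hidden_size
    return p


# With all-zero parameters, r = z = 0.5 and n = 0, so h' = 0.5 * h.
h_next = gru_step(x=[3.0], h=[2.0, -4.0], p=zero_params(1, 2))
print(h_next)
```

The all-zero case is easy to verify by hand: both gates evaluate to sigmoid(0) = 0.5 and the candidate state is tanh(0) = 0, so the step halves the previous hidden state.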
Inputs: input, h_0

  • input of shape (seq_len, batch, input_size): tensor containing the features of the input sequence. The input can also be a packed variable length sequence.
  • h_0 of shape (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch. Defaults to zero if not provided. If the RNN is bidirectional, num_directions should be 2, else it should be 1.

Outputs: output, h_n

  • output of shape (seq_len, batch, num_directions * hidden_size): tensor containing the output features h_t from the last layer of the GRU, for each t. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence. For the unpacked case, the directions can be separated using output.view(seq_len, batch, num_directions, hidden_size), with forward and backward being direction 0 and 1 respectively.

Similarly, the directions can be separated in the packed case.

  • h_n of shape (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for t = seq_len

Like output, the layers can be separated using h_n.view(num_layers, num_directions, batch, hidden_size).

In the code, only the final hidden state h_n of the GRU outputs is used.
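
The shapes quoted from the GRU documentation can be worked out for the settings used in the model. The helper below, `gru_output_shapes`, is a hypothetical name introduced for this sketch; it only reproduces the shape formulas and does not call PyTorch. The values seq_len=163, batch=32, and hidden_size=100 assume the defaults from the parameter table.

```python
def gru_output_shapes(seq_len, batch, hidden_size, num_layers=1, bidirectional=False):
    """Shapes of (output, h_n) as described in the GRU documentation excerpt."""
    num_directions = 2 if bidirectional else 1
    output = (seq_len, batch, num_directions * hidden_size)
    h_n = (num_layers * num_directions, batch, hidden_size)
    return output, h_n


# GRU1 in the model: one layer, one direction, window - kernel + 1 = 163 steps.
output_shape, h_n_shape = gru_output_shapes(seq_len=163, batch=32, hidden_size=100)

# `_, r = self.GRU1(r)` discards `output` and keeps h_n; squeezing the leading
# dimension of size 1 then leaves [batch, hidRNN].
squeezed_shape = h_n_shape[1:] if h_n_shape[0] == 1 else h_n_shape
print(output_shape, h_n_shape, squeezed_shape)
```

For a single-layer, unidirectional GRU, h_n holds the hidden state at t = seq_len, which is why the model can ignore the per-step output tensor.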
