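Today I came across some material on time series forecasting that was, of all things, paywalled! In the spirit of open source first, I dug up some related code myself and tried writing a few general-purpose templates.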
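What the model expects
Input: Input = (input_size, hidden_size), where input_size = time_step. Because there is only a single variable, hidden_size = 1 (here hidden_size is the number of features per time step, not the number of LSTM units);
Output: output_size, the number of time steps to predict;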
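1 Data preprocessing
The data is usually laid out along time steps, and each step may carry many features.
Take the data below as an example (it comes from an iFLYTEK competition; message me privately if you want it): target is the value to predict, and new_dt is the timestamp.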
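The competition data itself is not reproduced here, so the following is only a minimal sketch of getting such a table ready for windowing. It uses a toy DataFrame that borrows the two column names target and new_dt; the values and dates are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the competition data (values are invented for illustration).
df = pd.DataFrame({
    "new_dt": pd.to_datetime(["2024-01-03", "2024-01-01", "2024-01-02", "2024-01-04"]),
    "target": [13.0, 11.0, 12.0, 14.0],
})

# Sort chronologically so consecutive rows are consecutive time steps.
df = df.sort_values("new_dt").reset_index(drop=True)

# Keep the single variable as a column vector of shape (n_rows, 1).
series = df[["target"]].to_numpy(dtype="float32")
print(series.shape)  # (4, 1)
```

Selecting with double brackets (df[["target"]]) keeps the trailing feature axis, which is what later gives X its last dimension of 1.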
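Processing module: this module produces X = [batch_size, time_step, 1] and y = [batch_size, output_size], where batch_size is the number of samples. Note that with an input series of shape (N, 1), ys comes out as [batch_size, output_size, 1] and has to be reshaped to [batch_size, output_size]; with a 1-D series of shape (N,), Xs comes out as [batch_size, time_step] and needs a trailing axis added instead.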
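import numpy as np

def create_dataset(X, n_steps_in, n_steps_out):
    # n_steps_in: number of input time steps (window length)
    # n_steps_out: number of output time steps (forecast horizon)
    print(f"Input data shape before processing: {X.shape}")
    Xs, ys = [], []
    # Slide a window over the series; each position yields one sample.
    for i in range(len(X) - n_steps_in - n_steps_out + 1):
        # The n_steps_in values starting at i form the input window.
        Xs.append(X[i:(i + n_steps_in)])
        # The n_steps_out values right after the window form the target.
        ys.append(X[(i + n_steps_in):(i + n_steps_in + n_steps_out)])
    Xs = np.array(Xs)
    ys = np.array(ys)
    print(f"Xs shape after processing: {Xs.shape}")
    print(f"ys shape after processing: {ys.shape}")
    return Xs, ys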
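To see concretely what the windowing produces, here is a small worked example on a made-up series of ten values. The helper make_windows is a hypothetical, compact restatement of create_dataset (same slicing, no debug prints) so that the snippet runs on its own; it is a sketch, not a replacement for the function above.

```python
import numpy as np

def make_windows(series, n_steps_in, n_steps_out):
    # Same sliding-window idea as create_dataset, written compactly.
    n_samples = len(series) - n_steps_in - n_steps_out + 1
    X = np.array([series[i:i + n_steps_in] for i in range(n_samples)])
    y = np.array([series[i + n_steps_in:i + n_steps_in + n_steps_out]
                  for i in range(n_samples)])
    return X, y

# Made-up single-variable series 0..9 with shape (10, 1).
series = np.arange(10, dtype="float32").reshape(-1, 1)

X, y = make_windows(series, n_steps_in=3, n_steps_out=2)
# Drop the trailing feature axis of the targets: (6, 2, 1) -> (6, 2).
y = y.reshape(len(y), -1)

print(X.shape, y.shape)    # (6, 3, 1) (6, 2)
print(X[0].ravel(), y[0])  # [0. 1. 2.] [3. 4.]
```

A series of length 10 with 3 input steps and 2 output steps gives 10 - 3 - 2 + 1 = 6 samples, which matches the range used in the loop.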
2 LSTM model
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from tqdm import tqdm
from sklearn.preprocessing import LabelEncoder
#import h3
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Flatten, Reshape, LSTM, Dropout, Dense, Bidirectional, BatchNormalization, Input, LayerNormalization, GRU, Conv1D, Concatenate, MaxPooling1D, MultiHeadAttention, GlobalAveragePooling1D, Activation, SpatialDropout1D, Lambda
from tensorflow.keras.losses import MeanSquaredError, Huber
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
import tensorflow as tf
from tenso