Multi-Step LSTM Network
In this section, we will use the persistence example as a starting point and look at the changes needed to fit an LSTM to the training data and make multi-step forecasts for the test dataset.
Prepare Data
The data must be prepared before we can use it to train an LSTM.
Specifically, two additional changes are required:
- Stationary. The data shows an increasing trend that must be removed by differencing.
- Scale. The scale of the data must be reduced to values between -1 and 1 to match the output range of the tanh activation function used by the LSTM units.
We can introduce a function to make the data stationary called difference(). This will transform the series of values into a series of differences, a simpler representation to work with.
from pandas import Series

# create a differenced series
def difference(dataset, interval=1):
	diff = list()
	for i in range(interval, len(dataset)):
		value = dataset[i] - dataset[i - interval]
		diff.append(value)
	return Series(diff)
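As a quick sanity check, differencing with interval=1 replaces each value with the change from its predecessor. The series below is made up for illustration, not data from this tutorial:

```python
from pandas import Series

# create a differenced series (same function as above)
def difference(dataset, interval=1):
    diff = list()
    for i in range(interval, len(dataset)):
        value = dataset[i] - dataset[i - interval]
        diff.append(value)
    return Series(diff)

# a small made-up series with an increasing trend
data = [10, 12, 15, 19, 24]
diffed = difference(data, 1)
print(diffed.tolist())  # [2, 3, 4, 5]
```

The original series can be recovered by adding each difference back onto the prior observation, which is how forecasts are later inverted back to the original scale.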
We can use the MinMaxScaler from the sklearn library to scale the data.
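A minimal sketch of the scaling step on made-up values: with feature_range=(-1, 1), MinMaxScaler maps the observed minimum to -1 and the maximum to 1, and the fitted scaler can later undo the mapping with inverse_transform:

```python
from numpy import array
from sklearn.preprocessing import MinMaxScaler

# made-up differenced values, shaped as a single-column 2D array
values = array([2.0, 3.0, 4.0, 5.0]).reshape(-1, 1)

scaler = MinMaxScaler(feature_range=(-1, 1))
scaled = scaler.fit_transform(values)
# minimum (2.0) maps to -1, maximum (5.0) maps to 1,
# intermediate values are spaced linearly in between

# the same fitted scaler inverts the transform later,
# e.g. to return forecasts to the original scale
restored = scaler.inverse_transform(scaled)
```

Keeping a reference to the fitted scaler is important: the same object must be used to invert the forecasts, which is why prepare_data() below returns it.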
Putting this together, we can update the prepare_data() function to first difference the data and rescale it, then transform it into a supervised learning problem and split it into train and test sets, as we did before with the persistence example.
The function now returns a scaler in addition to the train and test datasets.
from sklearn.preprocessing import MinMaxScaler

# transform series into train and test sets for supervised learning
def prepare_data(series, n_test, n_lag, n_seq):
	# extract raw values
	raw_values = series.values
	# transform data to be stationary
	diff_series = difference(raw_values, 1)
	diff_values = diff_series.values
	diff_values = diff_values.reshape(len(diff_values), 1)
	# rescale values to -1, 1
	scaler = MinMaxScaler(feature_range=(-1, 1))
	scaled_values = scaler.fit_transform(diff_values)
	scaled_values = scaled_values.reshape(len(scaled_values), 1)
	# transform into supervised learning problem X, y
	supervised = series_to_supervised(scaled_values, n_lag, n_seq)
	supervised_values = supervised.values
	# split into train and test sets
	train, test = supervised_values[0:-n_test], supervised_values[-n_test:]
	return scaler, train, test
We can call this function as follows:
# prepare data
scaler, train, test = prepare_data(series, n_test, n_lag, n_seq)
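To see the shapes this produces, the pipeline can be exercised end to end on a short synthetic series. series_to_supervised() is defined earlier in the tutorial; a minimal version matching its call signature is repeated here (an assumption about that earlier code) so the sketch runs on its own:

```python
from pandas import DataFrame, Series, concat
from sklearn.preprocessing import MinMaxScaler

def difference(dataset, interval=1):
    # change from the prior observation, as defined above
    return Series([dataset[i] - dataset[i - interval]
                   for i in range(interval, len(dataset))])

# minimal sketch of series_to_supervised from earlier in the tutorial:
# n_in lag columns plus n_out forecast columns via shifted copies
def series_to_supervised(data, n_in=1, n_out=1, dropnan=True):
    df = DataFrame(data)
    cols = [df.shift(i) for i in range(n_in, 0, -1)]
    cols += [df.shift(-i) for i in range(0, n_out)]
    agg = concat(cols, axis=1)
    if dropnan:
        agg.dropna(inplace=True)
    return agg

def prepare_data(series, n_test, n_lag, n_seq):
    raw_values = series.values
    # difference, then rescale to [-1, 1]
    diff_values = difference(raw_values, 1).values.reshape(-1, 1)
    scaler = MinMaxScaler(feature_range=(-1, 1))
    scaled_values = scaler.fit_transform(diff_values)
    # frame as supervised learning and split
    supervised_values = series_to_supervised(scaled_values, n_lag, n_seq).values
    train, test = supervised_values[0:-n_test], supervised_values[-n_test:]
    return scaler, train, test

# made-up series of 13 observations with an increasing trend
series = Series([float(i * i) for i in range(13)])
n_lag, n_seq, n_test = 1, 3, 4
scaler, train, test = prepare_data(series, n_test, n_lag, n_seq)
print(train.shape, test.shape)  # (5, 4) (4, 4)
```

With n_lag=1 and n_seq=3, each row holds one lag observation followed by three values to forecast: 13 raw values give 12 differences, framing drops 3 incomplete rows, and the last n_test=4 of the remaining 9 rows become the test set.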