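The definitive explanation of [samples, time_steps, features] for LSTM in Keras

This article explains how LSTM neural networks work and how they are configured, in particular how reshaping the dimensions of the input data affects network performance, and discusses state management for LSTM networks in Keras.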

I am going through the following blog post on LSTM neural networks: http://machinelearningmastery.com/understanding-stateful-lstm-recurrent-neural-networks-python-keras/

The author reshapes the input vector X as [samples, time steps, features] for different configurations of LSTMs.

The author writes:

Indeed, the sequences of letters are time steps of one feature rather than one time step of separate features. We have given more context to the network, but not more sequence as it expected

What does this mean?

=========================================

I found this just below the [samples, time_steps, features] passage you are asking about:

X = numpy.reshape(dataX, (len(dataX), seq_length, 1))

Samples - This is len(dataX), the number of data points you have.

Time steps - This is the number of time steps over which the recurrent neural network is run. If you want your network to have a memory of 60 characters, this number should be 60.

Features - This is the number of features in every time step. If you are processing pictures, this is the number of pixels. In this case you seem to have 1 feature per time step.
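To make the three axes concrete, here is a minimal sketch of how such an array could be built. It assumes a toy dataset (the alphabet encoded as integers and cut into overlapping windows of 3 letters); the names alphabet and char_to_int are made up for this illustration and are not taken from the blog post.

```python
import numpy as np

# Hypothetical toy data: the alphabet encoded as integers 0..25.
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
char_to_int = {c: i for i, c in enumerate(alphabet)}

seq_length = 3  # number of time steps per sample

# Slide a window of seq_length letters over the alphabet.
dataX = []
for i in range(len(alphabet) - seq_length):
    window = alphabet[i:i + seq_length]
    dataX.append([char_to_int[c] for c in window])

# Reshape to [samples, time_steps, features] with one feature per step.
X = np.reshape(dataX, (len(dataX), seq_length, 1))

print(X.shape)        # (23, 3, 1)
print(X[0].ravel())   # [0 1 2]
```

Each of the 23 samples is a sequence of 3 time steps, and every time step carries a single number.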

 

ASK:

Can you explain the difference between X = numpy.reshape(dataX, (len(dataX), 3, 1)) and X = numpy.reshape(dataX, (len(dataX), 1, 3))? How does this affect the LSTM?

ANSWER:

(len(dataX), 3, 1) runs the LSTM for 3 iterations, each taking an input vector of shape (1,). (len(dataX), 1, 3) runs the LSTM for 1 iteration with an input vector of shape (3,). In that case the recurrent connections are quite useless, since there can't be any feedback from previous iterations.
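The difference can be seen with a toy recurrent loop. The sketch below is a plain tanh RNN written in NumPy, not Keras's LSTM; run_rnn, hidden_size and the random weights are assumptions made only to count how many recurrent iterations each shape produces.

```python
import numpy as np

def run_rnn(X, hidden_size=4, seed=0):
    """Toy tanh RNN (not an LSTM): returns final hidden state and step count."""
    rng = np.random.default_rng(seed)
    samples, time_steps, features = X.shape
    W_in = rng.normal(size=(features, hidden_size))      # input weights
    W_rec = rng.normal(size=(hidden_size, hidden_size))  # recurrent weights
    h = np.zeros((samples, hidden_size))                 # initial state
    steps = 0
    for t in range(time_steps):
        # The state from step t-1 feeds back into step t.
        h = np.tanh(X[:, t, :] @ W_in + h @ W_rec)
        steps += 1
    return h, steps

dataX = np.arange(12, dtype=float).reshape(4, 3)  # 4 samples, 3 values each

X_seq = np.reshape(dataX, (len(dataX), 3, 1))   # 3 time steps, 1 feature
X_flat = np.reshape(dataX, (len(dataX), 1, 3))  # 1 time step, 3 features

_, steps_seq = run_rnn(X_seq)
_, steps_flat = run_rnn(X_flat)

print(X_seq[0, 0].shape, steps_seq)    # (1,) 3
print(X_flat[0, 0].shape, steps_flat)  # (3,) 1
```

With the (len(dataX), 1, 3) layout the loop body runs once from a zero state, so the recurrent weights never contribute to the result.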

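TimeSteps really just means unfolding (unrolling) the network over time; it is the same thing as NUM_STEPS in TensorFlow.

Features is simply the dimensionality of the input: one dimension per feature.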

 

 

LSTM networks are stateful: they pass state along from step to step, so in principle they should be able to learn the whole alphabet sequence. By default, however, the Keras implementation resets the network state after each training batch.
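The effect of resetting versus carrying state can be illustrated without Keras. The sketch below uses a toy scalar tanh RNN as a stand-in for an LSTM cell and assumes each batch holds a single sequence; rnn_step, run_batches and the weights are hypothetical. In Keras this choice is exposed through the stateful argument of the LSTM layer.

```python
import numpy as np

def rnn_step(x, h, w_in=0.5, w_rec=0.9):
    """One step of a toy scalar tanh RNN (stand-in for an LSTM cell)."""
    return np.tanh(w_in * x + w_rec * h)

def run_batches(batches, carry_state):
    """Return the final state after each batch (one sequence per batch)."""
    h = 0.0
    finals = []
    for batch in batches:
        if not carry_state:
            h = 0.0  # mimic the default: state is reset for every batch
        for x in batch:
            h = rnn_step(x, h)
        finals.append(float(h))
    return finals

batches = [[1.0, 1.0], [0.0, 0.0]]  # the second batch has all-zero input

reset = run_batches(batches, carry_state=False)
carried = run_batches(batches, carry_state=True)

print(reset[1])        # 0.0: nothing is remembered from the first batch
print(carried[1] > 0)  # True: the first batch still influences the state
```

When the state is reset, the second batch cannot see anything that came before it; when it is carried over, earlier batches remain part of the context.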

 
