TensorFlow 13: LSTM Exercise

This post walks through building a long short-term memory (LSTM) network in TensorFlow for the MNIST handwritten-digit dataset. It covers the full pipeline from data preprocessing to network construction and training, with particular emphasis on the data transformations and the network parameters.


https://yq.aliyun.com/articles/202939

MNIST: a BATCH_SIZE x 784 array

CNN: BATCH_SIZE x 28 x 28 --> BATCH_SIZE x 28 x 28 x 1 array

LSTM: a list of 28 (NUM_STEPS) arrays, each of shape BATCH_SIZE x 28
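A quick numpy sketch of these shape conversions (the batch size of 128 here is only illustrative):

import numpy as np

BATCH_SIZE, NUM_STEPS, NUM_INPUT = 128, 28, 28
batch = np.zeros((BATCH_SIZE, 784))               # stand-in for one MNIST batch
cnn_batch = batch.reshape(BATCH_SIZE, 28, 28, 1)  # CNN input format
rows = batch.reshape(BATCH_SIZE, NUM_STEPS, NUM_INPUT)
lstm_list = [rows[:, t, :] for t in range(NUM_STEPS)]  # 28 arrays of (BATCH_SIZE, 28)
print(cnn_batch.shape, len(lstm_list), lstm_list[0].shape)  # (128, 28, 28, 1) 28 (128, 28)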

First, let's try out the data transformations:

# coding=utf-8
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = '2'  # only show warnings and errors

import tensorflow as tf
from tensorflow.contrib import rnn
import numpy as np

sess = tf.Session()
# a toy (3, 2, 2) array: 3 "samples", 2 "time steps", 2 "features"
x = np.array([[[111,112],[121,122]],[[211,212],[221,222]],[[311,312],[321,322]]])
x1 = np.transpose(x, [1,0,2])  # swap the sample and step axes -> (2, 3, 2)
x2 = np.stack(x, 1)            # same result as the transpose -> (2, 3, 2)
x2_u = tf.unstack(x, 2, 1)     # unpack axis 1 -> list of 2 tensors, each (3, 2)
x3 = np.reshape(x1, [-1, 2])   # flatten the first two axes -> (6, 2)
x4 = np.split(x3, 2, 0)        # split along axis 0 -> list of 2 arrays, each (3, 2)
print(x.shape)
print(x)
print(x1.shape)
print(x1)
print(x2.shape)
print(x2)
print('x2_u is list')
print(sess.run(x2_u))
print(x3.shape)
print(x3)
print('x4 is list')
print(x4)
Output:

(3, 2, 2)
[[[111 112]
  [121 122]]

 [[211 212]
  [221 222]]

 [[311 312]
  [321 322]]]
(2, 3, 2)
[[[111 112]
  [211 212]
  [311 312]]

 [[121 122]
  [221 222]
  [321 322]]]
(2, 3, 2)
[[[111 112]
  [211 212]
  [311 312]]

 [[121 122]
  [221 222]
  [321 322]]]
x2_u is list
[array([[111, 112],
       [211, 212],
       [311, 312]]), array([[121, 122],
       [221, 222],
       [321, 322]])]
(6, 2)
[[111 112]
 [211 212]
 [311 312]
 [121 122]
 [221 222]
 [321 322]]
x4 is list
[array([[111, 112],
       [211, 212],
       [311, 312]]), array([[121, 122],
       [221, 222],
       [321, 322]])]

Next, let's try the ANN-LSTM. For each time step the network structure is 28 x 128 x 10 (NUM_INPUT x NUM_HIDDEN x NUM_CLASSES), and the input is one row of pixels per step. The key steps:

Define an LSTM cell:

lstm_layer=rnn.BasicLSTMCell(NUM_HIDDEN,forget_bias=1)

Build the network:

outputs,_=rnn.static_rnn(lstm_layer,x_input_step,dtype="float32")

Note that the input x_input_step is a list of NUM_STEPS arrays, each of shape (BATCH_SIZE, NUM_INPUT).

The output outputs is a list of length NUM_STEPS, where each element corresponds to one input step;

the other return value is states, the final state of the cell.
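A minimal sketch to verify this input/output structure (assuming TF 1.x with tensorflow.contrib; it only builds the graph and prints static shapes):

import tensorflow as tf
from tensorflow.contrib import rnn

TIME_STEPS, NUM_INPUT, NUM_HIDDEN = 28, 28, 128
x = tf.placeholder("float32", [None, TIME_STEPS, NUM_INPUT])
x_steps = tf.unstack(x, TIME_STEPS, 1)             # list of 28 tensors, each (?, 28)
cell = rnn.BasicLSTMCell(NUM_HIDDEN, forget_bias=1)
outputs, states = rnn.static_rnn(cell, x_steps, dtype="float32")
print(len(x_steps), x_steps[0].shape)              # 28 (?, 28)
print(len(outputs), outputs[0].shape)              # 28 (?, 128)
print(states.h.shape)                              # final hidden state: (?, 128)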

### data: (50000, 784), (10000, 784), (10000, 784)
import pickle
import gzip
import numpy as np
import tensorflow as tf
from tensorflow.contrib import rnn

def load_data():
    f = gzip.open('../data/mnist.pkl.gz', 'rb')
    training_data, validation_data, test_data = pickle.load(f, encoding='bytes')
    f.close()
    return (training_data, validation_data, test_data)

def vectorized_result(j):
    # one-hot encode a digit label, e.g. 2 -> [0,0,1,0,0,0,0,0,0,0]
    e = np.zeros(10)
    e[j] = 1.0
    return e

training_data, validation_data, test_data = load_data()
trainData_in = training_data[0][:50000]
trainData_out = [vectorized_result(j) for j in training_data[1][:50000]]
validData_in = validation_data[0]
validData_out = [vectorized_result(j) for j in validation_data[1]]
testData_in = test_data[0][:100]  # keep only the first 100 test images
testData_out = [vectorized_result(j) for j in test_data[1][:100]]
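# quick sanity check of the resulting shapes (per the mnist.pkl.gz splits above):
# np.shape(trainData_in)   -> (50000, 784)
# np.shape(trainData_out)  -> (50000, 10)
# np.shape(testData_in)    -> (100, 784)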

#define constants
#unrolled through 28 time steps, one per image row:
TIME_STEPS=28
#hidden LSTM units
NUM_HIDDEN=128
#each row has 28 pixels, the input size per time step:
NUM_INPUT=28
#learning rate for Adam
LEARNING_RATE=0.001
#MNIST is classified into 10 classes (digits 0-9).
NUM_CLASSES=10
#size of batch
BATCH_SIZE=128

TRAINING_EPOCHS=30

#weights and biases mapping the last LSTM output to the class scores
out_weights=tf.Variable(tf.random_normal([NUM_HIDDEN,NUM_CLASSES]))
out_bias=tf.Variable(tf.random_normal([NUM_CLASSES]))
#defining placeholders
#input image placeholder:
x_input=tf.placeholder("float",[None,TIME_STEPS,NUM_INPUT])
#input label placeholder:
y_desired=tf.placeholder("float",[None,NUM_CLASSES])
#unstack the input along its second axis, turning one
#[BATCH_SIZE,TIME_STEPS,NUM_INPUT] tensor into a list of
#TIME_STEPS tensors of shape [BATCH_SIZE,NUM_INPUT]:
x_input_step=tf.unstack(x_input ,TIME_STEPS,1)

#defining the network:
lstm_layer=rnn.BasicLSTMCell(NUM_HIDDEN,forget_bias=1)
outputs,_=rnn.static_rnn(lstm_layer,x_input_step,dtype="float32")
#converting the last output of shape [batch_size,num_hidden] to [batch_size,num_classes] by multiplying with out_weights
z_prediction=tf.matmul(outputs[-1],out_weights)+out_bias
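#note: only the final time step's output, outputs[-1], feeds the classifier;
#the outputs at earlier time steps are discarded.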

#loss_function:
loss=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=z_prediction,labels=y_desired))
#optimization
opt=tf.train.AdamOptimizer(learning_rate=LEARNING_RATE).minimize(loss)
#model evaluation
correct_prediction=tf.equal(tf.argmax(z_prediction,1),tf.argmax(y_desired,1))
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))

#initialize variables:
init=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    num_batches=int(len(trainData_in)/BATCH_SIZE)
    for epoch in range(TRAINING_EPOCHS):
        for i in range(num_batches):
            batch_x=trainData_in[i*BATCH_SIZE:(i+1)*BATCH_SIZE]
            batch_x=batch_x.reshape((BATCH_SIZE,TIME_STEPS,NUM_INPUT))
            batch_y=trainData_out[i*BATCH_SIZE:(i+1)*BATCH_SIZE]
            sess.run(opt, feed_dict={x_input: batch_x, y_desired: batch_y})
            if i % 10 == 0:
                acc=sess.run(accuracy,feed_dict={x_input:batch_x,y_desired:batch_y})
                los=sess.run(loss,feed_dict={x_input:batch_x,y_desired:batch_y})
                print('epoch:%4d, batch:%4d' % (epoch, i))
                print("Accuracy ",acc)
                print("Loss ",los)
                print("__________________")

 
