[TensorFlow High-Level API Usage] Dataset

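This post explains how to use TensorFlow's Dataset API with NumPy arrays, covering dataset loading, preprocessing, and batching for model training. Through a concrete code example, it shows how to use feed_dict correctly during training and how to deal with issues such as data type conversion.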


https://tensorflow.google.cn/guide/datasets#consuming_numpy_arrays

https://blog.youkuaiyun.com/weixin_42806169/article/details/89764602

https://blog.youkuaiyun.com/qq_16234613/article/details/81703228

Summary of the previous post (https://blog.youkuaiyun.com/YQMind/article/details/82901442)

https://www.jianshu.com/p/f580f4fc2ba0

---------------------

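Everything a Dataset returns is a tensor.

Passing such a tensor directly to feed_dict raises an error along the lines of:

The value of a feed cannot be a tf.Tensor object

Running data, label = sess.run([data_tensor, label_tensor]) first returns ndarrays, and those can then be passed to feed_dict. Note that the fetches must be wrapped in a list; the second positional argument of sess.run is feed_dict.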
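The fetch-then-feed pattern described above can be sketched as follows. This is a minimal illustration rather than the original author's code: it assumes a TensorFlow build that provides `tf.compat.v1` (late 1.x or 2.x), and the names `features`, `labels`, and `row_sum` are made up for the example.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()  # graph mode, as in the rest of this post

# Hypothetical toy data standing in for real training arrays
features = np.arange(12, dtype=np.float32).reshape(6, 2)
labels = np.arange(6, dtype=np.int64)

# A Dataset built from NumPy arrays; get_next() yields tf.Tensor objects
dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(3)
data_tensor, label_tensor = tf1.data.make_one_shot_iterator(dataset).get_next()

x = tf1.placeholder(tf.float32, [None, 2])
row_sum = tf.reduce_sum(x, axis=1)

with tf1.Session() as sess:
    # feed_dict={x: data_tensor} would be rejected because data_tensor is a
    # tf.Tensor, so evaluate the tensors to ndarrays first ...
    data, label = sess.run([data_tensor, label_tensor])
    # ... and feed the ndarray instead
    result = sess.run(row_sum, feed_dict={x: data})

print(type(data), data.shape, label, result)
```

If the model can consume `data_tensor` directly (by building the graph on it instead of on a placeholder), the extra `sess.run` round trip is not needed; the placeholder route is mainly useful when the same graph must also be fed with other arrays, such as a test set.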


import tensorflow as tf
import scipy.io as sio
import numpy as np
 
 
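def get_Batch(data, label, batch_size):
    # Queue-based input pipeline: slice the arrays into single examples
    # (one shuffled pass over the data), then group them into batches
    print(data.shape, label.shape)
    input_queue = tf.train.slice_input_producer([data, label], num_epochs=1, shuffle=True, capacity=32)
    # allow_smaller_final_batch=False discards a final incomplete batch
    x_batch, y_batch = tf.train.batch(input_queue, batch_size=batch_size, num_threads=1, capacity=32, allow_smaller_final_batch=False)
    return x_batch, y_batch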
 
 
data = sio.loadmat('data.mat')
train_x = data['train_x']
train_y = data['train_y']
test_x = data['test_x']
test_y = data['test_y']
 
x = tf.placeholder(tf.float32, [None, 10])
y = tf.placeholder(tf.float32, [None, 2])
 
w = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
b = tf.Variable(tf.truncated_normal([2], stddev=0.1))
pred = tf.nn.softmax(tf.matmul(x, w) + b)
 
loss = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), axis=1))
optimizer = tf.train.AdamOptimizer(2e-5).minimize(loss)
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(pred, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32), name='evaluation')
 
x_batch, y_batch = get_Batch(train_x, train_y, 1000)
# Training
with tf.Session() as sess:
    # Initialize variables; num_epochs keeps its epoch counter in a local
    # variable, so local variables must be initialized as well
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    # Create the coordinator
    coord = tf.train.Coordinator()
    # Start the queue runner threads that fill the input queues
    threads = tf.train.start_queue_runners(sess, coord)
    epoch = 0  # incremented once per batch
    try:
        while not coord.should_stop():
            # Fetch one batch of batch_size samples and labels as ndarrays
            data, label = sess.run([x_batch, y_batch])
            sess.run(optimizer, feed_dict={x: data, y: label})
            train_accuracy = accuracy.eval({x: data, y: label})
            test_accuracy = accuracy.eval({x: test_x, y: test_y})
            print("Epoch %d, Training accuracy %g, Testing accuracy %g" % (epoch, train_accuracy, test_accuracy))
            epoch = epoch + 1
    except tf.errors.OutOfRangeError:  # raised once the num_epochs passes over the data are used up
        print("---Train end---")
    finally:
        # Ask all threads to stop via the coordinator
        coord.request_stop()
        print('---Program end---')
    coord.join(threads)  # wait for the started threads to finish
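The listing above batches the data with the older queue runners (`tf.train.slice_input_producer` and `tf.train.batch`). Roughly the same pipeline can be written with the Dataset API, following the placeholder-plus-initializable-iterator approach for NumPy arrays described in the TensorFlow guide linked at the top. The sketch below is an assumption-laden illustration, not the original author's code: `get_batch_dataset`, `data_ph`, `label_ph`, and the synthetic arrays that stand in for `data.mat` are all hypothetical, and it relies on `tf.compat.v1` being available.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()


def get_batch_dataset(data_ph, label_ph, batch_size, buffer_size=1000):
    """Hypothetical tf.data counterpart of get_Batch: one shuffled pass, full batches only."""
    dataset = tf.data.Dataset.from_tensor_slices((data_ph, label_ph))
    # drop_remainder=True plays the role of allow_smaller_final_batch=False
    dataset = dataset.shuffle(buffer_size).batch(batch_size, drop_remainder=True)
    iterator = tf1.data.make_initializable_iterator(dataset)
    return iterator.initializer, iterator.get_next()


# Synthetic stand-ins for train_x / train_y (the real ones come from data.mat)
rng = np.random.RandomState(0)
train_x = rng.rand(25, 10).astype(np.float32)
train_y = np.eye(2, dtype=np.float32)[rng.randint(0, 2, size=25)]

# Placeholders keep the arrays out of the graph definition
data_ph = tf1.placeholder(tf.float32, [None, 10])
label_ph = tf1.placeholder(tf.float32, [None, 2])
init_op, (x_batch, y_batch) = get_batch_dataset(data_ph, label_ph, batch_size=10)

batch_shapes = []
with tf1.Session() as sess:
    # Feeding the arrays here starts one pass over the data
    sess.run(init_op, feed_dict={data_ph: train_x, label_ph: train_y})
    try:
        while True:
            # ndarrays again, ready to be passed to a model's feed_dict
            data, label = sess.run([x_batch, y_batch])
            batch_shapes.append((data.shape, label.shape))
    except tf.errors.OutOfRangeError:
        pass  # the single pass over the data is finished

print(batch_shapes)
```

With this version there is no coordinator or queue-runner thread to manage, and running `init_op` again starts another pass over the data.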

--------------------- 
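Author: 非典型废言
Source: CSDN
Original post: https://blog.youkuaiyun.com/sinat_35821976/article/details/82668555
Copyright notice: this is an original article by the blogger; please include a link to the post when reposting.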