An example of reading CSV files with TensorFlow's shuffle_batch

This post walks through a simple example of reading files in TensorFlow and using shuffle_batch to draw randomized batches from a dataset. The key steps are defining the record format, creating a filename queue, and setting the batching parameters.


The code below is a minimal demonstration of reading files with shuffling in TensorFlow.

Code

#coding=utf-8

import tensorflow as tf
import numpy as np

def readMyFileFormat(fileNameQueue):
    reader = tf.TextLineReader()
    key, value = reader.read(fileNameQueue)

    # One default value per CSV column; this also fixes each column's dtype (int32 here).
    record_defaults = [[1], [1], [1]]
    col1, col2, col3 = tf.decode_csv(value, record_defaults=record_defaults)
    features = tf.stack([col1, col2])  # tf.pack was renamed to tf.stack in TF 1.0
    label = col3
    return features, label

def inputPipeLine(fileNames=["file0.csv", "file1.csv"], batchSize=4, numEpochs=None):
    fileNameQueue = tf.train.string_input_producer(fileNames, num_epochs=numEpochs)
    example, label = readMyFileFormat(fileNameQueue)
    # min_after_dequeue controls how well the examples are mixed: larger means
    # better shuffling but more memory and a slower start. The TF docs recommend
    # capacity = min_after_dequeue + (num_threads + a small margin) * batch_size.
    min_after_dequeue = 8
    capacity = min_after_dequeue + 3 * batchSize
    exampleBatch, labelBatch = tf.train.shuffle_batch(
        [example, label], batch_size=batchSize, num_threads=3,
        capacity=capacity, min_after_dequeue=min_after_dequeue)
    return exampleBatch, labelBatch

featureBatch, labelBatch = inputPipeLine(["file0.csv", "file1.csv"], batchSize=4)
with tf.Session() as sess:
    # Start populating the filename queue.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    # Retrieve batches until the coordinator signals a stop:
    try:
        while not coord.should_stop():
            example, label = sess.run([featureBatch, labelBatch])
            print(example)
    except tf.errors.OutOfRangeError:
        print('Done reading')
    finally:
        coord.request_stop()

    coord.join(threads)
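Note that with numEpochs=None the input producer cycles over the files forever, so OutOfRangeError is never raised and the loop above runs indefinitely. If you want a finite number of passes, keep in mind that string_input_producer implements num_epochs with a local counter variable, which must be initialized before the queue runners start. A minimal sketch of a bounded run, reusing the same inputPipeLine as above:

featureBatch, labelBatch = inputPipeLine(["file0.csv", "file1.csv"], batchSize=4, numEpochs=2)
with tf.Session() as sess:
    # num_epochs creates a local variable; it must be initialized explicitly.
    sess.run(tf.local_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    try:
        while not coord.should_stop():
            example, label = sess.run([featureBatch, labelBatch])
            print(example)
    except tf.errors.OutOfRangeError:
        print('Done reading')  # raised after two full passes over the files
    finally:
        coord.request_stop()
    coord.join(threads)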

Contents of file0.csv

9,1,1
10,2,3
11,3,1
12,4,2

Contents of file1.csv

1,1,7
2,2,8
3,3,5
4,4,9
5,5,5
6,6,1
7,7,2
8,8,4
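For readers on current TensorFlow: the queue-runner API used above was deprecated in favor of tf.data and does not exist in TF 2.x's eager mode. Below is a rough equivalent of the same pipeline written with tf.data; this is just a sketch assuming TensorFlow 2.x, and the helper name parse_line is illustrative rather than part of the original example.

import tensorflow as tf

def parse_line(line):
    # Same record_defaults as above: three int32 columns per CSV line.
    col1, col2, col3 = tf.io.decode_csv(line, record_defaults=[[1], [1], [1]])
    return tf.stack([col1, col2]), col3

dataset = (tf.data.TextLineDataset(["file0.csv", "file1.csv"])
           .map(parse_line)
           .shuffle(buffer_size=8)  # plays the role of min_after_dequeue
           .batch(4))

for features, labels in dataset:
    print(features.numpy())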