TensorFlow Image Data Preprocessing

License: CC 4.0 BY-SA


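This article covers image preprocessing with TensorFlow, including resizing, cropping and padding, and color adjustment, and uses concrete code examples to show how these steps can improve the accuracy of image recognition models.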

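Properties such as brightness and contrast strongly affect how an image looks: the same object can appear very different under different brightness and contrast. In many image recognition problems, however, these factors should not change the final prediction. Preprocessing the images reduces the model's sensitivity to such irrelevant factors, and in most image recognition problems it improves the model's accuracy.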

TensorFlow image processing functions

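TensorFlow provides a unified format for storing data: TFRecord. The code in this article uses the TensorFlow 1.x API (for example tf.Session and tf.gfile).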

%matplotlib inline
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

# Read the raw JPEG bytes. Binary mode ("rb") is required under Python 3.
image_raw_data = tf.gfile.FastGFile("C:\\tensorflow-tutorial-master\\Deep_Learning_with_TensorFlow\\datasets\\cat.jpg","rb").read()

with tf.Session() as sess:
    # Decode the JPEG file into the 3-D matrix (height, width, channels) of the image
    img_data = tf.image.decode_jpeg(image_raw_data)

    print(img_data.eval())

[[[163 161 140]
  [163 161 138]
  [163 161 140]
  ..., 
  [106 139  48]
  [101 137  47]
  [104 140  52]]

 [[164 162 141]
  [164 162 139]
  [163 161 138]
  ..., 
  [107 138  45]
  [103 138  46]
  [108 138  50]]

 [[167 162 142]
  [167 162 140]
  [164 162 139]
  ..., 
  [105 136  42]
  [103 137  43]
  [108 139  46]]

 ..., 
 [[207 200 181]
  [207 200 181]
  [206 199 180]
  ..., 
  [109  84  54]
  [109  83  56]
  [107  82  52]]

 [[207 200 181]
  [206 199 180]
  [206 199 180]
  ..., 
  [108  83  52]
  [106  81  51]
  [106  81  50]]

 [[207 200 181]
  [206 199 180]
  [206 199 180]
  ..., 
  [109  85  51]
  [107  82  51]
  [106  81  50]]]

# Display the image
with tf.Session() as sess:
    plt.imshow(img_data.eval())
    plt.show()

[Figure: the original cat image]

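Images obtained from the web usually vary in size, but the number of input nodes of a neural network is fixed. The images therefore have to be brought to a common size before their pixels are fed to the network. This is the job of image resizing.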

with tf.Session() as sess:
    # Arguments: the original image, the target size, and method (the resizing algorithm)
    resized = tf.image.resize_images(img_data,[300,300],method=0)

    # The resized result is stored as float32; convert it to uint8 so the image displays correctly
    print("Digital type: ",resized.dtype)
    cat = np.asarray(resized.eval(),dtype='uint8')

    plt.imshow(cat)
    plt.show()

Digital type:  <dtype: 'float32'>

[Figure: the image resized to 300x300]

method value         Resizing algorithm
======================================
0                    Bilinear interpolation
1                    Nearest-neighbor interpolation
2                    Bicubic interpolation
3                    Area interpolation

The different algorithms produce slightly different results, but the differences are small.
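
To make the simplest of these methods concrete, here is a minimal NumPy sketch of nearest-neighbor resizing. The function name `resize_nearest` and the integer index mapping are assumptions made for illustration; TensorFlow's own kernel may sample at slightly different positions.

```python
import numpy as np

def resize_nearest(image, new_h, new_w):
    """Resize an (H, W, C) array by copying a nearby source pixel (illustrative only)."""
    h, w = image.shape[:2]
    # Map every target row/column index back to a source row/column index.
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    # Broadcasting rows (as a column vector) against cols selects a new_h x new_w grid.
    return image[rows[:, None], cols]

image = np.arange(16).reshape(4, 4, 1)
small = resize_nearest(image, 2, 2)
print(small[:, :, 0])
```

Interpolating methods such as bilinear or bicubic blend several neighboring pixels instead of copying one, which is why their output is floating point.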

# Crop and pad the image
"""
resize_image_with_crop_or_pad:
    Arguments: the original image, followed by the target height and width.
    Behavior: if the original image is larger than the target, the function crops the central region;
              if the target is larger than the original image, the function pads the borders with zeros.
"""
with tf.Session() as sess:
    cropped = tf.image.resize_image_with_crop_or_pad(img_data,1000,1000)
    padded = tf.image.resize_image_with_crop_or_pad(img_data,3000,3000)
    plt.imshow(cropped.eval())
    plt.show()
    plt.imshow(padded.eval())
    plt.show()

[Figure: the cropped image and the padded image]
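
The crop-or-pad behavior described above can be sketched in plain NumPy. This is an illustration under assumptions, not TensorFlow's implementation: the helper name `crop_or_pad` is hypothetical, and the way an odd number of leftover pixels is split between the two sides may differ from the library.

```python
import numpy as np

def crop_or_pad(image, target_h, target_w):
    """Center-crop dimensions that are too large, zero-pad dimensions that are too small."""
    h, w = image.shape[:2]
    # Crop step: keep the central region along any axis that exceeds the target.
    top = max((h - target_h) // 2, 0)
    left = max((w - target_w) // 2, 0)
    cropped = image[top:top + min(h, target_h), left:left + min(w, target_w)]
    # Pad step: place the (possibly cropped) image in the middle of a zero canvas.
    ch, cw = cropped.shape[:2]
    pad_top = (target_h - ch) // 2
    pad_left = (target_w - cw) // 2
    out = np.zeros((target_h, target_w) + image.shape[2:], dtype=image.dtype)
    out[pad_top:pad_top + ch, pad_left:pad_left + cw] = cropped
    return out

image = np.ones((4, 4, 3), dtype=np.uint8)
smaller = crop_or_pad(image, 2, 2)   # central 2x2 region
larger = crop_or_pad(image, 6, 6)    # original surrounded by a zero border
```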

# Crop the image by a ratio
"""
central_crop:
    Arguments: the original image and a fraction in (0, 1]
"""
with tf.Session() as sess:
    central_cropped = tf.image.central_crop(img_data,0.5)
    plt.imshow(central_cropped.eval())
    plt.show()

[Figure: the central crop of the image]
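
As a rough illustration of cropping by a ratio, the following NumPy sketch keeps the central part of an array. The function here is a stand-in written for this example; the exact rounding of the crop offsets is an assumption and may not match TensorFlow for every image size.

```python
import numpy as np

def central_crop(image, fraction):
    """Keep the central `fraction` of the height and width of an (H, W, C) array."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    h, w = image.shape[:2]
    # Remove the same number of rows (columns) from both sides.
    top = int((h - h * fraction) / 2)
    left = int((w - w * fraction) / 2)
    return image[top:h - top, left:w - left]

image = np.zeros((100, 60, 3), dtype=np.uint8)
half = central_crop(image, 0.5)
print(half.shape)  # (50, 30, 3)
```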

with tf.Session() as sess:
    # Flip vertically (up-down)
    flipped1 = tf.image.flip_up_down(img_data)
    # Flip horizontally (left-right)
    flipped2 = tf.image.flip_left_right(img_data)
    # Flip vertically with 50% probability
    flipped3 = tf.image.random_flip_up_down(img_data)
    # Flip horizontally with 50% probability
    flipped4 = tf.image.random_flip_left_right(img_data)
    # Flip along the diagonal (transpose height and width)
    transposed = tf.image.transpose_image(img_data)

    plt.imshow(flipped1.eval())
    plt.show()
    plt.imshow(flipped2.eval())
    plt.show()
    plt.imshow(flipped3.eval())
    plt.show()
    plt.imshow(flipped4.eval())
    plt.show()

[Figure: the flipped images]
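
The flips above are pure index operations, which a small NumPy sketch makes visible. The array and variable names are made up for this illustration, and the 0.5 threshold mirrors the coin-flip behavior of the random variants.

```python
import numpy as np

image = np.arange(12).reshape(2, 3, 2)  # height 2, width 3, 2 channels

flipped_up_down = image[::-1, :, :]          # reverse the row order
flipped_left_right = image[:, ::-1, :]       # reverse the column order
transposed = np.transpose(image, (1, 0, 2))  # swap height and width

# Random flip: apply the horizontal flip with probability 0.5.
rng = np.random.default_rng(0)
maybe_flipped = image[:, ::-1, :] if rng.random() < 0.5 else image
```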

with tf.Session() as sess:
    # Decrease the brightness by 0.5
    adjusted1 = tf.image.adjust_brightness(img_data,-0.5)

    # Increase the brightness by 0.5
    adjusted2 = tf.image.adjust_brightness(img_data,0.5)

    # Randomly adjust the brightness by a delta in [-max_delta, max_delta)
    adjusted3 = tf.image.random_brightness(img_data,max_delta=0.5)

    # Scale the contrast by a factor of -0.5 (the second argument is a multiplier, not an offset)
    adjusted4 = tf.image.adjust_contrast(img_data,-0.5)

    # Scale the contrast by a factor of 0.5
    adjusted5 = tf.image.adjust_contrast(img_data,0.5)

    # Randomly scale the contrast by a factor in [lower, upper]
    adjusted6 = tf.image.random_contrast(img_data,lower=0.5,upper=1.5)

    plt.imshow(adjusted1.eval())
    plt.show()
    plt.imshow(adjusted2.eval())
    plt.show()
    plt.imshow(adjusted3.eval())
    plt.show()
    plt.imshow(adjusted4.eval())
    plt.show()
    plt.imshow(adjusted5.eval())
    plt.show()
    plt.imshow(adjusted6.eval())
    plt.show()

[Figure: the images after brightness and contrast adjustment]
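
The difference between the two adjustments is easier to see numerically. The sketch below assumes a float image with values in [0, 1]; the function bodies are simplified stand-ins written for this example. Brightness adds a constant, while contrast scales each pixel's distance from the per-channel mean.

```python
import numpy as np

def adjust_brightness(image, delta):
    """Add a constant offset to every pixel."""
    return image + delta

def adjust_contrast(image, factor):
    """Scale each pixel's deviation from its channel mean by `factor`."""
    mean = image.mean(axis=(0, 1), keepdims=True)
    return (image - mean) * factor + mean

image = np.array([[[0.2], [0.8]]])  # a 1x2 image with one channel
brighter = np.clip(adjust_brightness(image, 0.3), 0.0, 1.0)  # 0.8 + 0.3 saturates at 1.0
flatter = adjust_contrast(image, 0.5)                         # values move toward the mean 0.5
```

A factor between 0 and 1 pulls values toward the mean, a factor above 1 pushes them apart, and a negative factor mirrors them around the mean.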

# Adjust hue, saturation, and standardize
with tf.Session() as sess:
    adjusted1 = tf.image.adjust_hue(img_data,0.1)
    adjusted2 = tf.image.adjust_hue(img_data,0.3)
    adjusted3 = tf.image.adjust_hue(img_data,0.6)
    adjusted4 = tf.image.adjust_hue(img_data,0.9)

    # Randomly adjust the hue by a delta in [-max_delta, max_delta]; max_delta must be in [0, 0.5]
    adjusted5 = tf.image.random_hue(img_data,0.5)

    # Multiply the saturation by a factor of -5
    adjusted6 = tf.image.adjust_saturation(img_data,-5)
    # Multiply the saturation by a factor of 5
    adjusted7 = tf.image.adjust_saturation(img_data,5)

    # Randomly scale the saturation by a factor in [lower, upper]
    adjusted8 = tf.image.random_saturation(img_data, lower=0.5, upper=1.5)

    # Rescale the values of the image's 3-D matrix to zero mean and unit variance
    adjusted9 = tf.image.per_image_standardization(img_data)

    plt.imshow(adjusted1.eval())
    plt.show()
    plt.imshow(adjusted2.eval())
    plt.show()
    plt.imshow(adjusted3.eval())
    plt.show()
    plt.imshow(adjusted4.eval())
    plt.show()
    plt.imshow(adjusted5.eval())
    plt.show()
    plt.imshow(adjusted6.eval())
    plt.show()
    plt.imshow(adjusted7.eval())
    plt.show()
    plt.imshow(adjusted8.eval())
    plt.show()
    plt.imshow(adjusted9.eval())
    plt.show()

[Figure: the images after hue, saturation, and standardization adjustments]
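
The standardization step can be sketched in NumPy as follows. This is a minimal illustration that assumes the whole image shares one mean and one standard deviation; the lower bound on the standard deviation guards against division by zero for uniform images.

```python
import numpy as np

def per_image_standardization(image):
    """Shift and scale an image to zero mean and (approximately) unit variance."""
    image = image.astype(np.float32)
    mean = image.mean()
    # Bound the standard deviation from below so uniform images do not divide by zero.
    adjusted_std = max(image.std(), 1.0 / np.sqrt(image.size))
    return (image - mean) / adjusted_std

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8, 3)).astype(np.uint8)
standardized = per_image_standardization(image)
uniform = per_image_standardization(np.full((4, 4, 3), 7, dtype=np.uint8))
```

The result is a float array containing negative values, so it is meant as network input rather than as something to display directly.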

# Draw bounding boxes
with tf.Session() as sess:
    # Shrink the image so the boxes are easier to see
    img_data = tf.image.resize_images(img_data,[180,267],method=1)
    """
    Convert the image matrix to a floating-point type. tf.image.draw_bounding_boxes expects a batch,
    i.e. a 4-D tensor holding several images, so the decoded image needs an extra leading dimension.
    """
    batched = tf.expand_dims(tf.image.convert_image_dtype(img_data,tf.float32),0)
    """
    A bounding box is given by four numbers: [Ymin,Xmin,Ymax,Xmax].
    The values are relative to the image size. In a 180*267 image, [0.35,0.47,0.5,0.56]
    spans roughly (63,125) to (90,150), written as (y, x).
    """
    boxes = tf.constant([[[0.05,0.05,0.9,0.7],[0.35,0.47,0.5,0.56]]])
    result = tf.image.draw_bounding_boxes(batched,boxes)

    plt.imshow(result[0].eval())
    plt.show()

[Figure: the image with the bounding boxes drawn]
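
The conversion from relative box coordinates to pixels mentioned in the comment above is simple arithmetic. The helper name `box_to_pixels` is hypothetical, and rounding to the nearest pixel is an assumption made for this example.

```python
def box_to_pixels(box, height, width):
    """Convert [y_min, x_min, y_max, x_max] in relative units to pixel coordinates."""
    y_min, x_min, y_max, x_max = box
    return (int(round(y_min * height)), int(round(x_min * width)),
            int(round(y_max * height)), int(round(x_max * width)))

pixels = box_to_pixels([0.35, 0.47, 0.5, 0.56], 180, 267)
print(pixels)  # (63, 125, 90, 150)
```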

# Random crop guided by bounding boxes
with tf.Session() as sess:
    boxes = tf.constant([[[0.005,0.005,0.9,0.7],[0.35,0.47,0.5,0.56]]])
    # Supplying bounding boxes tells the random cropping algorithm which parts of the image are "informative"
    begin,size,bbox_for_draw = tf.image.sample_distorted_bounding_box(tf.shape(img_data),bounding_boxes=boxes)

    # Visualize the randomly sampled crop region by drawing it as a bounding box
    batched = tf.expand_dims(tf.image.convert_image_dtype(img_data,tf.float32),0)
    image_with_box = tf.image.draw_bounding_boxes(batched,bbox_for_draw)

    # Crop the sampled region
    distorted_image = tf.slice(img_data, begin, size)
    plt.imshow(distorted_image.eval())
    plt.show()

[Figure: the randomly cropped image]
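
The idea of a random crop that still keeps the annotated object can be sketched in NumPy. This is a deliberately simplified stand-in, not the algorithm TensorFlow uses: it samples a window that fully contains one pixel-space box, whereas the library function also takes constraints such as minimum object coverage, aspect ratio, and area into account. The helper name and the (begin, size) return layout are assumptions chosen to resemble the begin/size pair used above.

```python
import numpy as np

def sample_crop_containing_box(height, width, box, rng):
    """Sample a crop window (begin, size) that fully contains `box` (pixel coordinates)."""
    y_min, x_min, y_max, x_max = box
    # The top-left corner may lie anywhere above and to the left of the box.
    top = int(rng.integers(0, y_min + 1))
    left = int(rng.integers(0, x_min + 1))
    # The bottom-right corner may lie anywhere below and to the right of the box.
    bottom = int(rng.integers(y_max, height + 1))
    right = int(rng.integers(x_max, width + 1))
    return (top, left), (bottom - top, right - left)

rng = np.random.default_rng(0)
image = np.zeros((180, 267, 3), dtype=np.uint8)
(top, left), (crop_h, crop_w) = sample_crop_containing_box(180, 267, (63, 125, 90, 150), rng)
crop = image[top:top + crop_h, left:left + crop_w]
```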

Complete image preprocessing example

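Real image recognition problems usually combine several of these processing methods. The following program performs the whole preprocessing pipeline: random cropping of an image region, resizing, flipping, and color adjustment.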

%matplotlib inline
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# Randomly adjust the colors of an image. The order in which brightness, contrast,
# saturation, and hue are adjusted affects the final result, so several orderings are defined.
def distort_color(image,color_ordering=0):
    if color_ordering == 0:
        image = tf.image.random_brightness(image,max_delta=32./255.)
        image = tf.image.random_saturation(image,lower=0.5,upper=1.5)
        image = tf.image.random_hue(image,max_delta=0.2)
        image = tf.image.random_contrast(image,lower=0.5,upper=1.5)
    elif color_ordering == 1:
        image = tf.image.random_saturation(image,lower=0.5,upper=1.5)
        image = tf.image.random_brightness(image,max_delta=32./255.)
        image = tf.image.random_contrast(image,lower=0.5,upper=1.5)
        image = tf.image.random_hue(image,max_delta=0.2)
    # Other orderings can be defined in the same way; they are not listed here
    return tf.clip_by_value(image,0.0,1.0)

def preprocess_for_train(image,image_shape,bbox):
    # If no bounding box is given, treat the whole image as the region of interest
    if bbox is None:
        bbox = tf.constant([0.0,0.0,1.0,1.0],dtype=tf.float32,shape=[1,1,4])

    # Convert the image tensor to float32
    if image.dtype != tf.float32:
        image = tf.image.convert_image_dtype(image,dtype=tf.float32)

    # Take a random crop to reduce the effect of object size on the recognition algorithm
    bbox_begin,bbox_size,_ = tf.image.sample_distorted_bounding_box(tf.shape(image),bounding_boxes=bbox)
    distorted_image = tf.slice(image,bbox_begin,bbox_size)

    # Resize the random crop to the input size of the network, using a randomly chosen resizing algorithm
    distorted_image = tf.image.resize_images(distorted_image,image_shape,method=np.random.randint(4))
    # Randomly flip the image left-right
    distorted_image = tf.image.random_flip_left_right(distorted_image)
    # Adjust the colors using a randomly chosen ordering
    distorted_image = distort_color(distorted_image,np.random.randint(2))

    return distorted_image

# Read the image (binary mode is required under Python 3)
image_raw_data = tf.gfile.FastGFile("C:\\tensorflow-tutorial-master\\Deep_Learning_with_TensorFlow\\datasets\\cat.jpg","rb").read()
with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    boxes = tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])
    # Generate and display 9 randomly preprocessed versions of the image
    for i in range(9):
        result = preprocess_for_train(img_data,[299,299],boxes)
        plt.figure(1)
        plt.imshow(result.eval())
        plt.show()

[Figure: the randomly preprocessed training images]