torchvision.transforms applies transformations to images.
1 Structure
from torchvision import transforms
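Commonly used classes:
- ToTensor: converts a PIL Image or a numpy.ndarray image into a tensor.
  Why the tensor type is needed:
  a tensor carries the attributes that neural network training relies on, such as _backward_hooks and _grad, so images are converted to tensors first and used for training afterwards.
- ToPILImage: converts a tensor or numpy.ndarray image into a PIL Image.
- Normalize: normalizes a tensor image channel by channel with a given mean and standard deviation.
- Resize: resizes the image to a given size.
- CenterCrop: crops the image at its center.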
2 Usage
2.1 ToTensor
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
# Absolute path: E:\360MoveData\Users\BLACK\Desktop\学习资料\learn_pytorch\dataset\train\ants\0013035.jpg
# Relative path: dataset/train/ants/0013035.jpg
img_path = "dataset/train/ants/0013035.jpg"
img = Image.open(img_path)
# print(img)
tensor_trans = transforms.ToTensor()
tensor_img = tensor_trans(img)
# print(tensor_img)
writer = SummaryWriter("logs")
writer.add_image("Tensor_img", tensor_img)
writer.close()
2.2 Normalize
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
writer = SummaryWriter("logs")
img = Image.open("dataset/train/ants/0013035.jpg")
# print(img)
# ToTensor
trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(img)
writer.add_image("ToTensor", img_tensor)
# Normalize
# The arguments are the per-channel mean and standard deviation; the image has three RGB channels, so three values are passed for each
trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
img_norm = trans_norm(img_tensor)
writer.add_image("Normalize", img_norm)
writer.close()
Formula: output[channel] = (input[channel] - mean[channel]) / std[channel]
Example: with mean = 0.5 and std = 0.5, (input - 0.5) / 0.5 = 2 * input - 1
input range: [0, 1]
result range: [-1, 1]
2.3 Resize
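Resizes the input to the given size.
Input: PIL Image (recent torchvision versions also accept a tensor)
The size argument can be:
- a sequence (h, w): the output has exactly this height and width
- an integer: the smaller edge is matched to this number and the aspect ratio is kept (proportional scaling)
Return value: PIL Image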
# Resize (continues the script from 2.2 and reuses img, trans_totensor and writer)
# print(img.size)
trans_resize = transforms.Resize((512, 512))
# img (PIL) -> resize -> img_resize (PIL)
img_resize = trans_resize(img)
# img_resize (PIL) -> ToTensor -> img_resize (tensor)
img_resize = trans_totensor(img_resize)
writer.add_image("Resize", img_resize, 0)
# print(img_resize)
2.4 Compose
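- Chains several transforms together.
- The argument is a list of transform objects. Order matters: the output of each transform is the input of the next, so the types must match.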
Example:
>>> transforms.Compose([
>>>     transforms.CenterCrop(10),
>>>     transforms.PILToTensor(),
>>>     transforms.ConvertImageDtype(torch.float),
>>> ])
# Compose
trans_resize_2 = transforms.Resize(512)
trans_compose = transforms.Compose([trans_resize_2, trans_totensor])
img_resize_2 = trans_compose(img)
writer.add_image("Resize", img_resize_2, 1)
2.5 RandomCrop
Crops the image at a random location.
Input: PIL Image
# RandomCrop
trans_random = transforms.RandomCrop(512)
# trans_random = transforms.RandomCrop((200, 300))
trans_compose_2 = transforms.Compose([trans_random, trans_totensor])
# Produce 10 random crops
for i in range(10):
    img_crop = trans_compose_2(img)
    writer.add_image("RandomCrop", img_crop, i)
writer.close()