These colored pixel blocks are a color-overflow artifact. In short: in deep-learning image pipelines, the ToTensor step of data augmentation divides by 255.0 to normalize pixels to float values in [0, 1], while display and storage require multiplying by 255.0 and converting back to uint8 (0-255). By that point the intermediate values have drifted through a series of operations, so the float-to-uint8 conversion can easily produce values outside [0, 255]. Before converting to uint8, the tensor therefore has to be clipped (truncated) to that range!
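A minimal sketch of what goes wrong (the exact wrapped values are platform-dependent; the point is that out-of-range floats do not saturate on their own):

import numpy as np

x = np.array([-0.02, 0.5, 1.02], dtype=np.float32)      # values that drifted slightly outside [0, 1]
bad = (x * 255.0).astype(np.uint8)                       # out-of-range values wrap around instead of saturating -> colored speckles
good = (x * 255.0).clip(0, 255).astype(np.uint8)         # clip first -> [0, 127, 255], no artifacts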
import torch
from torchvision import transforms


def unnormalize_and_convert(frames, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]):
    """
    Unnormalizes the given frames and converts them to a uint8 numpy array for video saving.

    Parameters:
    - frames (torch.Tensor): The frames to unnormalize, shape should be (C, T, H, W).
    - mean (list or torch.Tensor): The per-channel mean used when the frames were normalized.
    - std (list or torch.Tensor): The per-channel std used when the frames were normalized.

    Returns:
    - video (numpy.ndarray): Unnormalized frames as a uint8 numpy array, shape (T, H, W, C).
    """
    # Invert x_norm = (x - mean) / std by applying Normalize with mean' = -mean/std and std' = 1/std.
    # The defaults correspond to the common transforms.Normalize(mean=0.5, std=0.5) used in training;
    # pass the forward-normalization stats here, not the already-inverted values.
    unnormalize_mean = [-m / s for m, s in zip(mean, std)]
    unnormalize_std = [1 / s for s in std]
    # Define the unnormalize transform
    unnormalize = transforms.Normalize(mean=unnormalize_mean, std=unnormalize_std)
    # Permute frames to (T, C, H, W); clone and detach to avoid modifying the original tensor
    frame_list = [unnormalize(frame) for frame in frames.permute(1, 0, 2, 3).clone().detach().cpu().float()]
    # Stack and permute back to (C, T, H, W)
    frames_unnorm = torch.stack(frame_list, dim=0).permute(1, 0, 2, 3)
    # Convert to numpy, permute to (T, H, W, C)
    video = frames_unnorm.permute(1, 2, 3, 0).numpy()
    # Scale [0, 1] float to [0, 255], clip to avoid overflow, then cast to uint8
    video = (video * 255.0).clip(0, 255).astype('uint8')
    return video
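A hypothetical usage, assuming `frames` comes from a model trained with transforms.Normalize(mean=[0.5]*3, std=[0.5]*3) and that imageio (with its ffmpeg plugin) is available for saving:

import imageio

# frames: (C, T, H, W) tensor, normalized to roughly [-1, 1] during training
video = unnormalize_and_convert(frames, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
imageio.mimsave("sample.mp4", list(video), fps=8)  # writes the (T, H, W, C) uint8 frames as a video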