Using tf.data.Dataset.map() raises "The Conv2D op currently only supports the NHWC tensor format on the CPU"

While preprocessing data with TensorFlow's `tf.data.Dataset.map`, I ran into an error involving the image channel layout. When blurring images and concatenating them with the sharp originals along the channel axis, the `Conv2D` op fails because the CPU does not support the NCHW format. The code reads and decodes images with `tf.image.decode_jpeg`, then blurs them and concatenates the channels. The problem may be that when the batch dimension is added, the system mistakes it for the channel dimension. Possible solutions include making sure the data layout is NHWC, or running the whole preprocessing pipeline on the GPU.
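To make the first suggestion concrete, here is a minimal sketch (assuming TensorFlow 1.x, matching the code below; `blur_nhwc` is a hypothetical helper, not from the original post) that blurs all three channels at once with `tf.nn.depthwise_conv2d` and pins the layout with an explicit `data_format='NHWC'`:

import numpy as np
import tensorflow as tf

blur = 10  # blur factor, as in the original code
# One uniform blur kernel per channel: shape (height, width, in_channels, channel_multiplier)
kernel = tf.constant(np.full((blur, blur, 3, 1), 1.0 / blur ** 2), dtype=tf.float32)

def blur_nhwc(images):
    # images: (N, H, W, 3) float32 tensor in NHWC layout
    return tf.nn.depthwise_conv2d(images, kernel, strides=[1, 1, 1, 1],
                                  padding='SAME', data_format='NHWC')

This also removes the per-channel split/concat loop from the original data_map; it is a sketch of the "make sure the layout is NHWC" idea, not a confirmed fix.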

I've recently been using tf.data.Dataset.map(data_map). I want data_map to blur each loaded image and concatenate the blurred image with the sharp one along the channel axis, producing a new dataset. But at runtime I get the following error:

tensorflow.python.framework.errors_impl.UnimplementedError: 2 root error(s) found.
  (0) Unimplemented: 2 root error(s) found.
  (0) Unimplemented: {{function_node __inference_Dataset_map_Training_data.data_map_101}} The Conv2D op currently only supports the NHWC tensor format on the CPU. The op was given the format: NCHW
	 [[{{node Conv2D_2}}]]
	 [[concat/_16]]
  (1) Unimplemented: {{function_node __inference_Dataset_map_Training_data.data_map_101}} The Conv2D op currently only supports the NHWC tensor format on the CPU. The op was given the format: NCHW
	 [[{{node Conv2D_2}}]]
0 successful operations.
0 derived errors ignored.
	 [[training_data/IteratorGetNext_1]]
	 [[training_data/IteratorGetNext_1/_1]]
  (1) Unimplemented: 2 root error(s) found.
  (0) Unimplemented: {{function_node __inference_Dataset_map_Training_data.data_map_101}} The Conv2D op currently only supports the NHWC tensor format on the CPU. The op was given the format: NCHW
	 [[{{node Conv2D_2}}]]
	 [[concat/_16]]
  (1) Unimplemented: {{function_node __inference_Dataset_map_Training_data.data_map_101}} The Conv2D op currently only supports the NHWC tensor format on the CPU. The op was given the format: NCHW
	 [[{{node Conv2D_2}}]]
0 successful operations.
0 derived errors ignored.
	 [[training_data/IteratorGetNext_1]]
0 successful operations.
0 derived errors ignored.

Dataset.map(data_map) seems to run on the CPU: if I put data_map in a separate .py file and run it on the GPU it works perfectly, but as soon as Dataset.map() calls data_map the program throws the error above.
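If the CPU placement of map functions is indeed the issue, one workaround (a sketch I have not verified against the original setup) is to restrict Dataset.map() to reading and decoding only, and apply the blur to the batch after the iterator, where it can be pinned to the GPU:

training_batch = training_iterator.get_next()  # (N, 64, 64, 3) uint8, with a decode-only data_map

with tf.device('/gpu:0'):
    sharp = tf.cast(training_batch, tf.float32)
    blurred = blur_nhwc(sharp)  # the hypothetical blur_nhwc helper sketched above
    training_pairs = tf.concat([blurred, sharp], axis=3)  # (N, 64, 64, 6)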

The error says my input tensor is in NCHW format, while the CPU only supports NHWC. So I traced back to rgb = tf.image.decode_jpeg(tf.read_file(img_path), channels=3) and found that rgb is a tensor of shape (?,?,3). I then used tf.expand_dims(data, 0) to add a batch dimension (otherwise the convolution can't run), giving (1,?,?,3). Did it treat the dimension I added as the channel dimension?
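For reference, (1, ?, ?, 3) is already the NHWC layout: axis 0 is the batch (N) and axis 3 is the channels (C), and tf.nn.conv2d defaults to data_format='NHWC'. A quick sanity check (a hypothetical snippet, not from the original code) is to print the static shape right after expanding:

rgb = tf.image.decode_jpeg(tf.read_file(img_path), channels=3)
data = tf.expand_dims(tf.cast(rgb, tf.float32), 0)
print(data.shape)  # prints (1, ?, ?, 3): N=1, H=?, W=?, C=3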

My code is as follows:

# training_data.py            input images have shape (64, 64, 3)

import numpy as np
import tensorflow as tf

dataset = tf.data.Dataset.from_tensor_slices(data_path_list)
dataset = dataset.map(self.data_map, num_parallel_calls=8)

training_dataset = dataset.batch(n_batch).prefetch(64)
training_iterator = training_dataset.make_one_shot_iterator()
training_batch = training_iterator.get_next()

def data_map(self, img_path):
    rgb = tf.image.decode_jpeg(tf.read_file(img_path), channels=3)  # read and decode the image

    # blur the image
    data = tf.cast(rgb, tf.float32)
    data = tf.expand_dims(data, 0)  # add a batch dimension

    blur = 10  # blur factor: controls how strong the blur is
    filter_ = np.array([[1 / (blur ** 2)] * (blur ** 2)]).reshape(blur, blur, 1, 1)  # uniform blur kernel

    temp = []
    data_list = tf.split(data, 3, 3)  # split into single-channel images along the channel axis
    for channel in data_list:  # blur each channel separately
        rgb_blur = tf.nn.conv2d(channel, filter_, [1, 1, 1, 1], 'SAME')
        temp.append(rgb_blur)
    rgb_blur = tf.concat(temp, 3)  # re-join the blurred single-channel images along the channel axis
    rgb_blur = tf.squeeze(rgb_blur, [0])  # drop the batch dimension

    # concatenate the sharp and blurred images along the channel axis
    # (cast so both inputs are float32; tf.concat requires matching dtypes)
    rgb = tf.concat([rgb_blur, tf.cast(rgb, tf.float32)], 2)

    return rgb

Does anyone know where my problem is and how to fix it? Thanks a million!

 
