RuntimeError: Exception thrown from dataset pipeline. Refer to 'Dataset Pipeline Error Message'.

Project scenario:

While creating a custom training dataset with MindSpore, enabling batching on the dataset raised the following error.

RuntimeError: Exception thrown from dataset pipeline. Refer to 'Dataset Pipeline Error Message'.

Bug details:
[ERROR] MD(,220c,?):2024-7-25 15:43:40 [mindspore\ccsrc\minddata\dataset\util\task_manager.cc:227] mindspore::dataset::TaskManager::InterruptMaster] MindSpore dataset is terminated with err msg: Exception thrown from dataset pipeline. Refer to 'Dataset Pipeline Error Message'. Inconsistent batch shapes, batch operation expects same shape for each data row, but got inconsistent shape in column 0, expected shape for this column is:<3,231,349>, got shape:<3,314,362>

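Explanation: The MindSpore dataset was terminated with an error message saying that an exception was thrown from the dataset pipeline and pointing to the 'Dataset Pipeline Error Message' for details.

Root cause: The batch shapes are inconsistent. The batch operation expects every data row to have the same shape, but column 0 contains rows of different shapes: the expected shape for this column is <3,231,349>, while the shape actually received is <3,314,362>.

In short, the images in the custom dataset have different sizes. Because they were never resized or cropped to a common size, they cannot be combined into a batch.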
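To make the check concrete, here is a minimal sketch in plain Python of the comparison the error message describes: every row's shape is compared against the first row's shape. The helper name `find_shape_mismatches` is hypothetical and is not part of MindSpore; the two shapes are the ones reported in the log above.

```python
def find_shape_mismatches(shapes):
    """Return (row_index, shape) pairs whose shape differs from the first row.

    Hypothetical helper for illustration; it only mirrors the rule stated in
    the error message (all rows in a batch must share one shape).
    """
    if not shapes:
        return []
    expected = tuple(shapes[0])
    return [
        (index, tuple(shape))
        for index, shape in enumerate(shapes)
        if tuple(shape) != expected
    ]


# Shapes taken from the error log: the first is the "expected" shape,
# the second is the shape that broke the batch.
row_shapes = [(3, 231, 349), (3, 314, 362)]
mismatches = find_shape_mismatches(row_shapes)
print(mismatches)
```

In a real pipeline, the shapes could presumably be collected by iterating over the dataset before `batch` is applied and recording the shape of each image, which helps locate the offending samples.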


Solution:

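  1. Resize the images in the dataset to one uniform size by adding a resize operation to transform.Compose:
    mindspore.dataset.vision.Resize resizes the input image to the given size, using one of the interpolation modes in mindspore.dataset.vision.Inter (Inter.LINEAR by default).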

  2. For an object detection dataset, add this operation to transform.Compose instead:
    vision.ResizeWithBBox(100)(func_data, func_bboxes)
    mindspore.dataset.vision.ResizeWithBBox resizes the input image to the given size and adjusts the bounding boxes accordingly.
    Note: for object detection data, where the labels include bounding boxes, the labels must be adjusted together with the image.

