[PyTorch] thread: [16,0,0] Assertion `t >= 0 && t < n_classes` failed

Error message: C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Loss.cu:250: block: [0,0,0], thread: [16,0,0] Assertion `t >= 0 && t < n_classes` failed

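Possible cause: a label (target) value falls outside the range of classes covered by the prediction `pred`. For example, if `pred` holds 16 class scores per sample, the valid labels are 0-15; a label of 19 then makes the cross-entropy loss computation fail with this assertion.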
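The failure can be reproduced without a GPU. The following is a minimal sketch, assuming a 16-class problem with made-up tensors (`logits`, `good`, `bad` are illustrative names, not from the original script). On the CPU, PyTorch reports an out-of-range target as an ordinary Python exception instead of a device-side assert, which is usually easier to read.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# 4 samples, 16 class scores each, so valid labels are 0..15.
logits = torch.randn(4, 16)
good = torch.tensor([0, 3, 15, 7])   # all labels in range
bad = torch.tensor([0, 3, 19, 7])    # 19 is out of range

loss = criterion(logits, good)       # works
print("loss with valid labels:", loss.item())

try:
    criterion(logits, bad)
    raised = False
except (IndexError, RuntimeError) as exc:
    # On CPU the message names the offending target value.
    raised = True
    print(type(exc).__name__, exc)
```

If the CPU run raises here, the same labels are very likely what trips the assertion on the GPU.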

C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [0,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [1,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [2,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [3,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [4,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [6,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [7,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [8,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [9,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [10,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [11,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [13,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [14,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [15,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [16,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [17,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [18,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [19,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [20,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [21,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [22,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [23,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [24,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [25,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [26,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [28,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [30,0,0] Assertion `t >= 0 && t < n_classes` failed.
C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [31,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
  File "C:\Users\86139\Desktop\wt_flower\train.py", line 79, in <module>
    loss = criterion(outputs, labels)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\pytorch\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\pytorch\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\pytorch\Lib\site-packages\torch\nn\modules\loss.py", line 1297, in forward
    return F.cross_entropy(
           ^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\pytorch\Lib\site-packages\torch\nn\functional.py", line 3494, in cross_entropy
    return torch._C._nn.cross_entropy_loss(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
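The traceback above only reports `device-side assert triggered`, without saying which label values were bad. One way to get a readable message is to validate the labels on the CPU just before `criterion(outputs, labels)` is called. The helper below is a sketch under assumptions: `describe_bad_targets` is a hypothetical name, `outputs` is assumed to have shape `(batch, n_classes)`, and `ignore_index=-100` mirrors the default of `nn.CrossEntropyLoss`, which exempts that one value from the range check.

```python
import torch

def describe_bad_targets(outputs, labels, ignore_index=-100):
    """Return a readable description of out-of-range labels, or None if all are valid.

    Hypothetical helper: `outputs` is assumed to be (batch, n_classes) logits,
    `labels` a tensor of integer class indices.
    """
    n_classes = outputs.shape[1]
    labels_cpu = labels.detach().cpu()
    # A label is invalid if it is outside [0, n_classes - 1] and is not ignore_index.
    invalid = ((labels_cpu < 0) | (labels_cpu >= n_classes)) & (labels_cpu != ignore_index)
    if not invalid.any():
        return None
    bad_values = sorted(set(labels_cpu[invalid].tolist()))
    return f"labels outside [0, {n_classes - 1}]: {bad_values}"

# Usage with made-up tensors standing in for a real batch:
outputs = torch.zeros(3, 5)              # 5 classes
labels = torch.tensor([0, 5, -1])        # 5 and -1 are invalid
print(describe_bad_targets(outputs, labels))
```

Raising an error when the helper returns a message stops training before the CUDA kernel runs, so the offending values are visible in plain Python.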
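The error you are seeing:

```
Assertion `t >= 0 && t < n_classes` failed.
```

is a device-side assertion raised when PyTorch's `nn.CrossEntropyLoss` runs on the GPU. It means that **some label values are outside the valid range**.

---

## 🔍 Cause of the error

- `t >= 0`: a label must not be negative;
- `t < n_classes`: a label must be strictly less than the number of classes;
- `n_classes` is the number of classes produced by the model's output layer (for example `FlowerNet(num_classes=5)`);
- with 5 classes numbered 0 to 4, every label **must** be an integer from 0 to 4;
- a label of `-1`, `5`, or `100` triggers this error.

---

## ✅ Solutions

### ✅ 1. Check the range of the label values

Print the minimum and maximum label in the training loop:

```python
for images, labels in train_dataloader:
    print(f"Labels min: {labels.min().item()}, max: {labels.max().item()}")
```

Example output:

```
Labels min: -1, max: 5
```

For a 5-class model, this output means the labels contain invalid values.

---

### ✅ 2. Check that the dataset directory structure is correct

`ImageFolder` assigns class labels (0 to n_classes-1) automatically from the sorted subdirectory names, so make sure that:

- `datasets/train/` and `datasets/valid/` contain the correct number of subdirectories;
- the subdirectory names are the class names you intend;
- there are no hidden folders or stray directories such as `.ipynb_checkpoints`;
- every image sits in the correct class directory.

---

### ✅ 3. Check whether the label mapping was changed by hand

`ImageFolder` builds `class_to_idx` itself and does not accept it as a constructor argument. If you remap labels yourself (for example with a mapping such as `{'daisy': 1, 'dandelion': 2, ...}` applied through `target_transform` or a custom dataset), inspect the mapping that is actually in use:

```python
from torchvision.datasets import ImageFolder

# ImageFolder derives class_to_idx from the subdirectory names.
train_dataset = ImageFolder('datasets/train', transform=train_transform)
print(train_dataset.class_to_idx)  # the mapping actually in use
```

Make sure every value lies in `[0, num_classes - 1]` and that no index is skipped. A mapping that starts at 1 instead of 0 pushes the largest label to `num_classes`, which is out of range.

---

### ✅ 4. Check `num_classes` in the configuration file

Make sure `num-classes` in `configs/config.toml` is set correctly:

```toml
num-classes = 5
```

If the dataset really has 6 classes but this is set to 5, the labels go out of range.

---

### ✅ 5. Check whether the model matches an old checkpoint instead of the current dataset

If you resume from an earlier training run but the number of classes has changed since then, the model and the labels can disagree.

```python
model = FlowerNet(num_classes=5, pretrained=False)
```

If the earlier run had 5 classes and the dataset now has 6, a model still built with `num_classes=5` to match the old weights outputs only 5 scores, so label 5 is out of range.

---

## ✅ Recommended debugging code

Add the following check to the training loop to validate the labels of every batch:

```python
# model, criterion, optimizer, device, configs and train_dataloader
# come from the surrounding training script.
for epoch in range(num_epochs):
    model.train()
    for batch, (images, labels) in enumerate(train_dataloader, start=1):
        # Debug: check the label range on the CPU, before the loss kernel runs
        if labels.min() < 0 or labels.max() >= configs['num-classes']:
            print(f"Invalid labels found in batch {batch}: "
                  f"min={labels.min().item()}, max={labels.max().item()}")
            continue  # skip the bad batch instead of crashing

        images = images.to(device)
        labels = labels.to(device)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
```

---

## ✅ Summary

The error you hit is:

> ``Assertion `t >= 0 && t < n_classes` failed.``

It means that your label values are **not within [0, num_classes - 1]**.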
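To illustrate how a stray directory can shift folder-derived labels, here is a small standard-library sketch. It only mimics the rule of assigning indices to sorted subdirectory names; it does not call torchvision, and `folder_class_to_idx` and the class name `rose` are hypothetical names chosen for the example.

```python
import os
import tempfile

def folder_class_to_idx(root):
    """Mimic the sorted-subdirectory rule: one class index per subdirectory name."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return {name: idx for idx, name in enumerate(classes)}

with tempfile.TemporaryDirectory() as root:
    # Three intended classes.
    for name in ["daisy", "dandelion", "rose"]:
        os.mkdir(os.path.join(root, name))
    clean = folder_class_to_idx(root)

    # A stray notebook checkpoint directory appears next to the class folders.
    os.mkdir(os.path.join(root, ".ipynb_checkpoints"))
    polluted = folder_class_to_idx(root)

print(clean)
print(polluted)
```

In this sketch the extra directory sorts before the real class names, so every real class index moves up by one and the largest becomes 3. With a model built for 3 classes, a label of 3 is exactly the kind of value that fails `t < n_classes`.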