autoanchor: Analyzing anchors... anchors/target = 4.27, Best Possible Recall (BPR) = 0.9935
Image sizes 640 train, 640 val
Using 1 dataloader workers
Logging results to runs\train\test42
Starting training for 3 epochs...
Epoch gpu_mem box obj cls labels img_size
0/2 1.86G nan nan nan 113 640: 100%|██████████| 16/16 [00:23<00:00, 1.44s/it]
C:\Users\monst\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\optim\lr_scheduler.py:129: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
warnings.warn("Detected call of `lr_scheduler.step()` before `optimizer.step()`. "
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|█| 8/8 [00:03<00:00, 2.45
all 128 0 0 0 0 0
Epoch gpu_mem box obj cls labels img_size
1/2 2.45G nan nan nan 128 640: 100%|██████████| 16/16 [00:17<00:00, 1.08s/it]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|█| 8/8 [00:03<00:00, 2.48
all 128 0 0 0 0 0
Epoch gpu_mem box obj cls labels img_size
2/2 2.45G nan nan nan 221 640: 100%|██████████| 16/16 [00:17<00:00, 1.09s/it]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|█| 8/8 [00:03<00:00, 2.39
all 128 0 0 0 0 0
Found a solution:
https://github.com/ultralytics/yolov5/issues/4839
https://docs.nvidia.com/deeplearning/cudnn/release-notes/rel_8.html#rel-822
CUDA
https://developer.nvidia.cn/cuda-10.2-download-archive?target_os=Windows&target_arch=x86_64
cuDNN
https://developer.nvidia.com/rdp/cudnn-download
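Installing CUDA 10.2 alone did not help. Other people were on PyTorch 1.9.1 while I was on 1.8.1, so I upgraded PyTorch through Anaconda, which automatically moved it to 1.10.1.
After the upgrade, however, CUDA could no longer be found: torch.version.cuda showed None, meaning the torch build and CUDA no longer matched.
I then noticed in Anaconda that the cudatoolkit package was still at 11.1 and had not been downgraded, so I downgraded cudatoolkit from 11.1 to 10.2. That still did not work.
In the end I simply deleted the original environment and created a fresh one, this time with CUDA 10.2 and PyTorch 1.10.1.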
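The version mismatch described above can be confirmed from Python before retraining. The following is a minimal sketch, not part of the original notes: the helper name describe_torch_env is hypothetical, and it only calls standard PyTorch attributes. A value of None for torch.version.cuda usually indicates that the installed build has no CUDA support, which is an assumption about what happened here rather than something the log proves.

```python
import importlib


def describe_torch_env():
    """Return a dict describing the installed PyTorch build (hypothetical helper)."""
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError):
        # PyTorch is missing or its native libraries failed to load.
        return {"torch_installed": False}

    cuda_available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version the wheel/conda package was built against;
        # None typically means a build without CUDA support.
        "built_for_cuda": torch.version.cuda,
        "cuda_available": cuda_available,
        # cuDNN version as an integer, or None if cuDNN is unavailable.
        "cudnn_version": torch.backends.cudnn.version(),
        "gpu_name": torch.cuda.get_device_name(0) if cuda_available else None,
    }


if __name__ == "__main__":
    for key, value in describe_torch_env().items():
        print(f"{key}: {value}")
```

If built_for_cuda prints None or cuda_available prints False, the environment still needs fixing before training is worth retrying.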
PyTorch
https://pytorch.org/
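conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=10.2 -c pytorch
Note: there must be no spaces around the == signs. If the copied command contains them (six in total), remove them before running it.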
Installing PyTorch with Anaconda
https://blog.youkuaiyun.com/qq_45297730/article/details/121652951
All I can say is that reinstalling fixes everything.
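After a reinstall, a quick GPU smoke test can show whether basic computations still produce NaN before a full training run is started. This is a rough sketch under stated assumptions: the function name gpu_amp_smoke_test is hypothetical, and it assumes the NaN losses come from mixed-precision GPU computation, which the log above does not confirm. A passing result does not guarantee that training will be free of NaN.

```python
def gpu_amp_smoke_test():
    """Run one small convolution under autocast on the GPU (hypothetical helper).

    Returns True if every output value is finite, False if NaN or Inf appears,
    and None if PyTorch or a CUDA device is unavailable.
    """
    try:
        import torch
    except (ImportError, OSError):
        return None
    if not torch.cuda.is_available():
        return None

    torch.manual_seed(0)
    conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1).cuda()
    x = torch.rand(2, 3, 64, 64, device="cuda")

    # Mixed precision is the assumed trigger, so run the forward pass under autocast.
    with torch.no_grad(), torch.cuda.amp.autocast():
        y = conv(x)

    return bool(torch.isfinite(y).all())


if __name__ == "__main__":
    print("finite outputs:", gpu_amp_smoke_test())
```

A result of False would suggest the CUDA/cuDNN/PyTorch combination is still producing invalid values; None means the check could not run at all.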
More discussion of NaN losses:
https://github.com/ultralytics/yolov5/issues/4084
https://github.com/ultralytics/yolov5/issues/1625
https://github.com/ultralytics/yolov5/issues/1749