Out of Memory
When running inference with a model trained in PyTorch, the following error appears:
RuntimeError: CUDA out of memory. Tried to allocate 416.00 MiB (GPU 0; 2.00 GiB total capacity; 1.32 GiB already allocated; 0 bytes free; 1.34 GiB reserved in total by PyTorch)
The error message says that GPU 0 has 2.00 GiB of total capacity, 1.32 GiB is already allocated, 0 bytes are free, and PyTorch has reserved 1.34 GiB in total.
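When logging such failures, the figures in that message can be pulled out programmatically. A small sketch (`parse_cuda_oom` is a hypothetical helper, not part of PyTorch; the regex matches the message format shown above):

```python
import re

# Matches the memory figures in a PyTorch "CUDA out of memory" message.
OOM_PATTERN = re.compile(
    r"Tried to allocate (?P<requested>[\d.]+ \w+) "
    r"\(GPU (?P<gpu>\d+); (?P<total>[\d.]+ \w+) total capacity; "
    r"(?P<allocated>[\d.]+ \w+) already allocated; "
    r"(?P<free>[\d.]+ \w+) free; "
    r"(?P<reserved>[\d.]+ \w+) reserved"
)

def parse_cuda_oom(message):
    """Return the memory figures from an OOM message as a dict, or None."""
    m = OOM_PATTERN.search(message)
    return m.groupdict() if m else None
```

Feeding it the message above yields the requested, total, allocated, free, and reserved amounts as strings, which is handy for aggregating OOM events across runs.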
Check GPU usage with the following command:
> nvidia-smi
Wed Jul 13 15:20:18 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 512.95 Driver Version: 512.95 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... WDDM | 00000000:01:00.0 Off | N/A |
| N/A 39C P0 N/A / N/A | 0MiB / 2048MiB | 2% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
No running process is using the GPU. (Note that on Windows in WDDM mode, nvidia-smi often cannot report per-process GPU memory anyway.)
Next, query the memory usage through PyTorch's own memory API:
> print(torch.cuda.memory.memory_summary())
|===========================================================================|
| PyTorch CUDA memory summary, device ID 0 |
|---------------------------------------------------------------------------|
| CUDA OOMs: 0 | cudaMalloc retries: 0 |
|===========================================================================|
| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |
|---------------------------------------------------------------------------|
| Allocated memory | 0 B | 0 B | 0 B | 0 B |
| from large pool | 0 B | 0 B | 0 B | 0 B |
| from small pool | 0 B | 0 B | 0 B | 0 B |
|---------------------------------------------------------------------------|
| Active memory | 0 B | 0 B | 0 B | 0 B |
| from large pool | 0 B | 0 B | 0 B | 0 B |
| from small pool | 0 B | 0 B | 0 B | 0 B |
|---------------------------------------------------------------------------|
| GPU reserved memory | 0 B | 0 B | 0 B | 0 B |
| from large pool | 0 B | 0 B | 0 B | 0 B |
| from small pool | 0 B | 0 B | 0 B | 0 B |
|---------------------------------------------------------------------------|
| Non-releasable memory | 0 B | 0 B | 0 B | 0 B |
| from large pool | 0 B | 0 B | 0 B | 0 B |
| from small pool | 0 B | 0 B | 0 B | 0 B |
|---------------------------------------------------------------------------|
| Allocations | 0 | 0 | 0 | 0 |
| from large pool | 0 | 0 | 0 | 0 |
| from small pool | 0 | 0 | 0 | 0 |
|---------------------------------------------------------------------------|
| Active allocs | 0 | 0 | 0 | 0 |
| from large pool | 0 | 0 | 0 | 0 |
| from small pool | 0 | 0 | 0 | 0 |
|---------------------------------------------------------------------------|
| GPU reserved segments | 0 | 0 | 0 | 0 |
| from large pool | 0 | 0 | 0 | 0 |
| from small pool | 0 | 0 | 0 | 0 |
|---------------------------------------------------------------------------|
| Non-releasable allocs | 0 | 0 | 0 | 0 |
| from large pool | 0 | 0 | 0 | 0 |
| from small pool | 0 | 0 | 0 | 0 |
|===========================================================================|
According to the summary, no memory is in use.
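The individual counters behind that summary can also be read directly. A minimal sketch using the documented `torch.cuda` accessors (the `gpu_memory_report` helper name is my own):

```python
import torch

def gpu_memory_report(device=0):
    """Return (allocated_bytes, reserved_bytes) for a GPU, or None without CUDA."""
    if not torch.cuda.is_available():
        return None
    return (torch.cuda.memory_allocated(device),
            torch.cuda.memory_reserved(device))

print(gpu_memory_report())
```

`memory_allocated` counts tensors currently held by PyTorch, while `memory_reserved` counts the larger pool the caching allocator has claimed from the driver; the gap between the two is memory PyTorch keeps for reuse.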
Running the code again still fails. Could the inference code itself need this much memory? It should not: this is pure inference code. After some searching, the cause turned out to be that PyTorch tracks gradients during the forward pass by default, even at inference time, and gradient tracking keeps intermediate activations alive, which consumes a large amount of memory. The fix is to wrap the forward pass in torch.no_grad():
with torch.no_grad():
    outputs = model(samples)  # main inference call
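For context, a self-contained CPU sketch showing that no autograd state is attached under torch.no_grad() (the Linear layer and random input are placeholders for the real model and samples):

```python
import torch

# A tiny stand-in model; any nn.Module behaves the same way.
model = torch.nn.Linear(4, 2)
model.eval()                  # also disable dropout / batchnorm updates

x = torch.randn(1, 4)         # dummy input standing in for `samples`

with torch.no_grad():         # autograd builds no graph inside this block,
    out = model(x)            # so intermediate activations are freed at once

# Without tracking, the output carries no autograd history.
print(out.requires_grad, out.grad_fn)
```

Calling model.eval() alone does not stop gradient tracking; it only changes layer behavior, so both calls are typically used together for inference.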
In short: during PyTorch model inference we hit a CUDA out-of-memory error reporting 2 GiB total capacity on GPU 0, 1.32 GiB allocated, 0 bytes free, and 1.34 GiB reserved by PyTorch. `nvidia-smi` showed no process on the GPU, and PyTorch's memory summary showed no usage either. The culprit was gradient tracking during inference; wrapping the forward pass in `torch.no_grad()` prevents the tracking and brings memory usage down.