Problem description
Pull the Python 3 image for tensorflow-gpu==1.13.1:
docker pull tensorflow/tensorflow:1.13.1-gpu-py3
Create and start a container from the pulled image:
docker run -it -v /xxx/xxx/xxx:/xxx/xxx --name tf-113 xxxxxxxxxxxx /bin/bash
Inside the container, pip confirms that tensorflow-gpu 1.13.1 is already installed:
pip list
However, importing tensorflow fails with:
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
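A quick way to confirm the diagnosis before changing anything in the image (a minimal sketch, not part of the original post): try to load the exact library the error complains about. In a container started without --gpus, the load is expected to fail.

import ctypes

try:
    # driver library, normally injected into the container by the NVIDIA container runtime
    ctypes.CDLL("libcuda.so.1")
    print("libcuda.so.1 is loadable: the NVIDIA driver is visible in this container")
except OSError as err:
    print("libcuda.so.1 is missing:", err)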
Problem analysis
The libcuda error means the container cannot see the NVIDIA driver. My first thought was that a matching cudatoolkit and cudnn had to be reinstalled, but the real cause is that the container was started without GPU access: libcuda.so.1 belongs to the host driver and is only mounted into the container by the NVIDIA container runtime, which is not engaged when docker run is invoked without a --gpus option.
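The split between "toolkit baked into the image" and "driver mounted from the host" can be checked directly. This is a rough sketch under the assumption that the image ships CUDA 10.0 and cuDNN 7 (the versions TensorFlow 1.13 was built against), so the library sonames below are assumptions:

import ctypes

# libcudart / libcudnn are part of the image; libcuda only appears with --gpus
for lib in ("libcudart.so.10.0", "libcudnn.so.7", "libcuda.so.1"):
    try:
        ctypes.CDLL(lib)
        print(lib, "-> loadable")
    except OSError:
        print(lib, "-> not found")

In a container started without --gpus, the first two should load while libcuda.so.1 does not, which points at the missing driver mount rather than a broken CUDA installation.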
Solution
Re-run the container with "--gpus all" added to the command:
docker run -it --gpus all -v /xxx/xxx/xxx:/xxx --name tf-113 xxxxxxxxxxxx /bin/bash
As shown below:
root@b8d6213d8bf7:/# python
Python 3.5.2 (default, Nov 12 2018, 13:43:14)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.__version__
'1.13.1'
>>> tf.test.is_gpu_available()
2022-10-26 03:10:24.960562: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2022-10-26 03:10:25.268171: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x55769f0 executing computations on platform CUDA. Devices:
2022-10-26 03:10:25.268237: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): NVIDIA RTX XXXXX, Compute Capability 8.6
2022-10-26 03:10:25.268255: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (1): NVIDIA RTX XXXXX, Compute Capability 8.6
2022-10-26 03:10:25.291696: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3400000000 Hz
2022-10-26 03:10:25.298380: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x58085e0 executing computations on platform Host. Devices:
2022-10-26 03:10:25.298429: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): <undefined>, <undefined>
2022-10-26 03:10:25.298664: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: NVIDIA RTX XXXXX major: 8 minor: 6 memoryClockRate(GHz): 1.695
pciBusID: 0000:73:00.0
totalMemory: 23.68GiB freeMemory: 23.06GiB
2022-10-26 03:10:25.298733: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties:
name: NVIDIA RTX XXXXX major: 8 minor: 6 memoryClockRate(GHz): 1.695
pciBusID: 0000:a6:00.0
totalMemory: 23.69GiB freeMemory: 771.88MiB
2022-10-26 03:10:25.299380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1
2022-10-26 03:10:25.301950: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-10-26 03:10:25.301983: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 1
2022-10-26 03:10:25.301998: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N Y
2022-10-26 03:10:25.302009: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1: Y N
2022-10-26 03:10:25.302189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:0 with 22432 MB memory) -> physical GPU (device: 0, name: NVIDIA RTX XXXXX, pci bus id: 0000:73:00.0, compute capability: 8.6)
2022-10-26 03:10:25.302951: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:1 with 546 MB memory) -> physical GPU (device: 1, name: NVIDIA RTX XXXXX, pci bus id: 0000:a6:00.0, compute capability: 8.6)
True
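As an optional follow-up (a sketch, not part of the original post): the log above shows GPU 1 has only ~772 MiB free, so it can be convenient to expose only GPU 0 to TensorFlow. CUDA_VISIBLE_DEVICES must be set before TensorFlow initializes CUDA, i.e. before the first import:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"   # make only physical GPU 0 visible to this process

import tensorflow as tf
print(tf.test.is_gpu_available())          # True when GPU 0 is usable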
