tensorflow-gpu installation and troubleshooting (ImportError: libcudnn.so.7, module 'tensorflow.python.training.checkpointable'

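This post documents how I resolved a compatibility problem between the GPU build of TensorFlow and CUDA 9.0: uninstalling the old CUDA, installing CUDA 9.0, dealing with the missing libcudnn.so.7, and upgrading TensorFlow to 1.9.0, after which GPU-accelerated training finally worked.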

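I recently needed to train with TensorFlow on a GPU and found that CUDA 9.0 was required, because the prebuilt tensorflow-gpu binaries target CUDA 9.0 by default. To avoid spending time rebuilding TensorFlow from source, I uninstalled my existing CUDA 8.0 and then installed CUDA 9.0.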

The installation steps were as follows:

# instructions from https://developer.nvidia.com/cuda-downloads (linux -> x86_64 -> Ubuntu -> 16.04 -> deb)
CUDA_REPO_PKG="cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb"
wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/${CUDA_REPO_PKG}
sudo dpkg -i ${CUDA_REPO_PKG}
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda-9-0
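
Before moving on to TensorFlow, it can help to confirm which toolkit version actually landed. The following is a minimal sketch, assuming the toolkit writes a version.txt file under its install root (as CUDA 9.0 does under /usr/local/cuda); the function name cuda_version is hypothetical and not part of any NVIDIA tool.

```shell
# Hypothetical helper: print the CUDA toolkit version recorded under an install root.
# Assumes <root>/version.txt contains a line like "CUDA Version 9.0.176".
cuda_version() {
    local root="${1:-/usr/local/cuda}"
    local file="$root/version.txt"
    if [ ! -r "$file" ]; then
        echo "no version.txt under $root" >&2
        return 1
    fi
    # Keep only the dotted version number from the first matching line.
    sed -n 's/^CUDA Version \([0-9][0-9.]*\).*$/\1/p' "$file" | head -n 1
}
```

Calling cuda_version with no argument inspects /usr/local/cuda; passing /usr/local/cuda-9.0 inspects that specific install. Running nvcc --version is another way to check, provided the toolkit's bin directory is on PATH.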

However, once the installation finished, tensorflow-gpu still did not work. It reported this error:

ImportError: libcudnn.so.7: cannot open shared object file: No such file or directory

Looking under /usr/local/cuda, I found that this file was indeed not there.
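
Rather than browsing directories by hand, the search for the library can be scripted. The sketch below is only illustrative: find_shared_lib is a hypothetical helper name, and the directories to search are whatever the caller passes in.

```shell
# Hypothetical helper: print the first directory (from the arguments after
# the library name) that contains the given shared library.
find_shared_lib() {
    local name="$1"
    shift
    local dir
    for dir in "$@"; do
        if [ -e "$dir/$name" ]; then
            echo "$dir"
            return 0
        fi
    done
    # Not found in any of the listed directories.
    return 1
}

# Example use (not run here):
#   find_shared_lib libcudnn.so.7 /usr/local/cuda/lib64 /usr/lib/x86_64-linux-gnu
```

To see what the dynamic linker itself knows about, ldconfig -p | grep libcudnn is a useful complement.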

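After some searching, I learned that libcudnn can be installed as a separate package. Note that the package named below is the CUDA 9.0 cuBLAS performance update; cuDNN itself is normally distributed in separate libcudnn7 packages. The commands I ran were: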

CUDA_PATCH1="cuda-repo-ubuntu1604-9-0-local-cublas-performance-update_1.0-1_amd64-deb"
wget https://developer.nvidia.com/compute/cuda/9.0/Prod/patches/1/${CUDA_PATCH1}
sudo dpkg -i ${CUDA_PATCH1}
sudo apt-get update

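After the installation, the file was still missing from /usr/local/cuda.

It turned out that the libraries had been installed under /usr/lib/x86_64-linux-gnu instead.

So I copied them over, running this from within that directory: sudo cp libcud* /usr/local/cuda/lib64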
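
The copy step can also be written so that symlinks such as libcudnn.so.7 stay symlinks instead of becoming duplicate files. This is a sketch under stated assumptions, not the exact command used above: copy_cudnn_libs is a hypothetical helper, and it matches libcudnn* only, which is narrower than the libcud* glob.

```shell
# Hypothetical helper: copy cuDNN libraries from one directory to another,
# keeping symlinks as symlinks (-P) rather than copying their targets.
# Matches libcudnn* only, which is narrower than the libcud* glob in the text.
copy_cudnn_libs() {
    local src="$1"
    local dst="$2"
    # Fail early if nothing matches, rather than passing a literal glob to cp.
    if ! ls "$src"/libcudnn* >/dev/null 2>&1; then
        echo "no libcudnn* files in $src" >&2
        return 1
    fi
    cp -P "$src"/libcudnn* "$dst"/
}

# Example use (not run here):
#   copy_cudnn_libs /usr/lib/x86_64-linux-gnu /usr/local/cuda/lib64
```

Writing into /usr/local/cuda/lib64 needs root, so in practice the function would have to be run from a root shell, or the cp line run directly with sudo.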

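Importing tensorflow again produced a new error:

[Screenshot of the error not preserved; per the title, it involved module 'tensorflow.python.training.checkpointable'.]

According to what I found on the official site, this is a problem with tensorflow-gpu 1.8.0, so I upgraded to 1.9.0 and tried again. This time everything worked.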
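
The upgrade decision can be sketched as a small version comparison. Everything here is an assumption for illustration: version_lt is a hypothetical helper, it relies on sort -V from GNU coreutils, and the installed version is read from pip show because importing tensorflow may itself fail at this stage.

```shell
# Hypothetical helper: succeed if version $1 is strictly older than version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example use (not run here):
#   installed=$(pip show tensorflow-gpu | sed -n 's/^Version: //p')
#   if version_lt "$installed" 1.9.0; then
#       pip install --upgrade tensorflow-gpu==1.9.0
#   fi
```

Pinning the version with ==1.9.0 keeps pip from pulling a newer release that might expect a different CUDA version.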
