Problems with Python 3.8 + TensorFlow 2.4.0 + CUDA 11.0

Version matching

🔗 Build from source | TensorFlow
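
The version check can be sketched as a small lookup. The values below are the tested GPU build configuration for TensorFlow 2.4.0 (Python 3.6 to 3.8, CUDA 11.0, cuDNN 8.0); the helper names `parse_version` and `check_versions` are hypothetical and exist only for this illustration.

```python
# Tested GPU configuration for TensorFlow 2.4.0. The structure and the
# helper names are assumptions made for this sketch.
TF_240_REQUIREMENTS = {
    "python": [(3, 6), (3, 7), (3, 8)],
    "cuda": (11, 0),
    "cudnn": (8, 0),
}


def parse_version(text):
    """Turn a 'major.minor[.patch]' string into a (major, minor) tuple."""
    parts = text.strip().split(".")
    return int(parts[0]), int(parts[1])


def check_versions(python, cuda, cudnn, req=TF_240_REQUIREMENTS):
    """Return a list of mismatch messages; an empty list means all match."""
    problems = []
    if parse_version(python) not in req["python"]:
        problems.append("Python " + python + " is outside 3.6 to 3.8")
    if parse_version(cuda) != req["cuda"]:
        problems.append("CUDA " + cuda + " found, 11.0 expected")
    if parse_version(cudnn) != req["cudnn"]:
        problems.append("cuDNN " + cudnn + " found, 8.0 expected")
    return problems


# The situation described in this post: CUDA 11.4 installed, 11.0 required.
print(check_versions("3.8", "11.4", "8.0"))
```

An empty list means the three versions line up; any message names the component to change.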

Error: Could not load dynamic library 'cupti64_110.dll'; dlerror: cupti64_110.dll not found

This happens because the CUDA version on my machine used to be 10 and is now 11.4, so the cudatoolkit version that matches TensorFlow has to be installed.

Solution: in the corresponding Anaconda environment, run conda install:

conda install cudatoolkit=11.0

My environment here is named tf.

To switch between conda environments, use conda activate tf or conda activate base. With the right environment active, the library loads successfully.

Downgrading CUDA

My machine originally had CUDA 11.4, which does not match TensorFlow 2.4.0, so it has to be downgraded to 11.0.

NVIDIA CUDA Toolkit 11.0 Downloads

Error: cudnn64_8.dll not found

Copy the files from the cuDNN bin directory into the corresponding …\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin folder.
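
After copying, a quick check can confirm that the DLL is actually in the folder. This is a minimal sketch: `find_missing_dlls` is a hypothetical helper, and the demo uses a temporary folder as a stand-in for the real CUDA bin directory, whose full path depends on your installation.

```python
import tempfile
from pathlib import Path

# DLL name taken from the error message in this post.
REQUIRED_DLLS = ["cudnn64_8.dll"]


def find_missing_dlls(bin_dir, required=REQUIRED_DLLS):
    """Return the names in `required` that are not found in `bin_dir`."""
    bin_path = Path(bin_dir)
    if not bin_path.is_dir():
        return list(required)  # nothing can be present in a missing folder
    # Compare case-insensitively, as Windows file names are.
    present = {entry.name.lower() for entry in bin_path.iterdir()}
    return [name for name in required if name.lower() not in present]


# Demo on a throwaway folder standing in for the real CUDA bin directory.
with tempfile.TemporaryDirectory() as fake_bin:
    print(find_missing_dlls(fake_bin))          # DLL not copied yet
    (Path(fake_bin) / "cudnn64_8.dll").touch()  # simulate copying the file
    print(find_missing_dlls(fake_bin))          # nothing missing now
```

To use it for real, pass the full path of your own v11.0\bin folder instead of the temporary one.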

NVIDIA cuDNN Archive

Choose the release that matches your CUDA version; mine is 11.0.

The versions must match.

With limited GPU memory, enable on-demand memory growth

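import tensorflow as tf

# List the GPUs that TensorFlow can see
gpus = tf.config.experimental.list_physical_devices(device_type='GPU')

# Allocate GPU memory on demand instead of reserving it all at startup
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)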
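
The same behaviour can also be requested through the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable, which is handy when you cannot edit the training script. A minimal sketch, assuming the variable is set before TensorFlow initializes the GPU; the helper name `request_memory_growth` is hypothetical.

```python
import os


def request_memory_growth(env=os.environ):
    """Ask TensorFlow to grow GPU memory on demand via an environment variable.

    Call this before TensorFlow touches the GPU; the simplest way is to call
    it before the first `import tensorflow`. A value the user has already
    set is left untouched.
    """
    env.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")
    return env["TF_FORCE_GPU_ALLOW_GROWTH"]


print(request_memory_growth())
# import tensorflow as tf  # import TensorFlow only after the variable is set
```

Using `setdefault` keeps an explicit user choice (for example `false`) in place rather than overriding it.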

Out of GPU memory

This happens because the dataset is too large and has too many classes; my 1650 here has only 4 GB of GPU memory.

Solutions:

  • Use less data and fewer classes to recognize.
  • Switch to a graphics card with more memory.
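
One common reading of "less data" is fewer samples per training step. As a rough illustration, the input batch alone occupies batch × height × width × channels × bytes per element. The sketch below only does that arithmetic; the image size and batch sizes are hypothetical, and real usage is usually dominated by weights and activations, so treat the numbers as a lower bound.

```python
def batch_bytes(batch_size, height, width, channels, bytes_per_element=4):
    """Bytes needed to hold one input batch (float32 by default)."""
    return batch_size * height * width * channels * bytes_per_element


def to_mib(num_bytes):
    """Convert bytes to mebibytes."""
    return num_bytes / (1024 ** 2)


# Hypothetical 224x224 RGB images at two batch sizes.
for batch_size in (64, 16):
    size = to_mib(batch_bytes(batch_size, 224, 224, 3))
    print("batch", batch_size, "->", round(size, 2), "MiB for the inputs alone")
```

Cutting the batch size to a quarter cuts this term to a quarter as well, which is often enough to fit a 4 GB card.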

Building TensorFlow from source:
1. Python 3.5, TensorFlow 1.12;
2. Supports CUDA 10.0, cuDNN 7.3.1, TensorRT-5.0.2.6-cuda10.0-cudnn7.3;
3. Supports MKL, no MPI.

Software and hardware environment: Ubuntu 16.04, GeForce GTX 1080

Configuration:

hp@dla:~/work/ts_compile/tensorflow$ ./configure
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.19.1 installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3

Found possible Python library paths:
  /usr/local/lib/python3.5/dist-packages
  /usr/lib/python3/dist-packages
Please input the desired Python library path to use. Default is [/usr/local/lib/python3.5/dist-packages]

Do you wish to build TensorFlow with XLA JIT support? [Y/n]:
XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]:
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with ROCm support? [y/N]:
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 10.0]:

Please specify the location where CUDA 10.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda-10.0

Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]: 7.3.1

Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-10.0]:

Do you wish to build TensorFlow with TensorRT support? [y/N]: y
TensorRT support will be enabled for TensorFlow.

Please specify the location where TensorRT is installed. [Default is /usr/lib/x86_64-linux-gnu]:/home/hp/bin/TensorRT-5.0.2.6-cuda10.0-cudnn7.3/targets/x86_64-linux-gnu

Please specify the locally installed NCCL version you want to use. [Default is to use https://github.com/nvidia/nccl]:

Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1,6.1,6.1]:

Do you want to use clang as CUDA compiler? [y/N]:
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:

Do you wish to build TensorFlow with MPI support? [y/N]:
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:
Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=" to your build command. See .bazelrc for more details.
  --config=mkl             # Build with MKL support.
  --config=monolithic      # Config for mostly static monolithic build.
  --config=gdr             # Build with GDR support.
  --config=verbs           # Build with libverbs support.
  --config=ngraph          # Build with Intel nGraph support.
  --config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
Preconfigured Bazel build configs to DISABLE default on features:
  --config=noaws           # Disable AWS S3 filesystem support.
  --config=nogcp           # Disable GCP support.
  --config=nohdfs          # Disable HDFS support.
  --config=noignite        # Disable Apacha Ignite support.
  --config=nokafka         # Disable Apache Kafka support.
  --config=nonccl          # Disable NVIDIA NCCL support.
Configuration finished

Build:

hp@dla:~/work/ts_compile/tensorflow$ bazel build --config=opt --config=mkl --verbose_failures //tensorflow/tools/pip_package:build_pip_package

Uninstall the existing TensorFlow:

hp@dla:~/temp$ sudo pip3 uninstall tensorflow

Install the wheel you built:

hp@dla:~/temp$ sudo pip3 install tensorflow-1.12.0-cp35-cp35m-linux_x86_64.whl

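### TensorFlow 2.4.0 installation guide

For users who want to install TensorFlow 2.4.0, the official documentation provides detailed instructions. Getting the environment configured correctly is essential, especially when a specific version is required.

#### Python version requirements

TensorFlow 2.4.0 supports Python 3.6–3.8[^1]. When preparing the environment, choose a Python version within that range to match this TensorFlow release.

#### CUDA and cuDNN configuration

CUDA 11.0, the version mentioned here, is on the supported list for TensorFlow 2.4.0. For GPU support to work, a matching cuDNN library has to be set up in addition to a compatible CUDA installation. If `import tensorflow as tf` runs and the log shows no `Could not load dynamic library` messages, the cuDNN in the current environment is adequate; otherwise, adjust it manually to the appropriate version[^3].

```bash
pip install tensorflow==2.4.0
```

This command pins the exact TensorFlow version to install, namely 2.4.0. However, dependency conflicts, particularly around the `typing-extensions` package, may have to be resolved before the installation can continue.

### Documentation overview

The official TensorFlow website provides detailed tutorials and an API reference to help developers get started quickly and understand the framework in depth. Dedicated chapters cover how to build models and tune performance for different application areas, such as image recognition and natural language processing. It also includes many example code snippets for learners to practice with.

### Release notes highlights

TensorFlow 2.4.0 introduced several notable improvements:

- **Enhanced Keras integration**: the high-level API workflow is simpler, which makes building complex neural network architectures more intuitive.
- **Stronger distributed training**: new features allow more flexible and efficient coordination of computation across multiple devices.
- **Graph optimizer upgrade**: internal changes improve overall computational efficiency and reduce unnecessary resource consumption.

Note that as new features are added, some older functions may be marked as deprecated or may change behavior, so read the relevant change log carefully and assess the impact before migrating an existing project to the new version.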