Setting up a TensorFlow GPU development environment on Windows 11 [RTX 3060], part 3: using Jupyter Notebook locally under WSL2

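This article describes how to set up a CUDA and TensorFlow 2.2.0 GPU development environment on Windows 11 through WSL2 and Docker. It covers the local CUDA installation and the Conda + Jupyter Notebook configuration, and shows NVIDIA tool usage and version compatibility. It pays particular attention to the problem of loading Linux-generated model files and a temporary workaround using WSL2.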

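Following on from the previous post: we found that CUDA can in fact be used in WSL2. With the Docker approach, however, loading resources was a bit sluggish.

When training models on Windows, we often run into encoding errors when loading ckpt or other model files that were generated on Linux.

A quick stopgap is to load them in WSL2 instead.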


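Basic environment setup

Local CUDA environment setup

The previous post largely covered setting up the CUDA driver and related pieces for use in WSL2. The main reference:

https://docs.nvidia.com/cuda/wsl-user-guide/index.html

The key step is installing the package, following the guide's section "Using the WSL-Ubuntu Package":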

wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.4.0/local_installers/cuda-repo-wsl-ubuntu-11-4-local_11.4.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-11-4-local_11.4.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-wsl-ubuntu-11-4-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
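
After the packages install, it may be worth confirming that the tools are visible from inside WSL2. The following is a minimal sketch, not part of the original steps: it only checks whether `nvidia-smi` and `nvcc` can be found on `PATH`. The function name `find_cuda_tools` is hypothetical. Note that `nvcc` often lives under `/usr/local/cuda/bin`, which may need to be added to `PATH` by hand, so a "not found" result does not necessarily mean the install failed.

```python
import shutil


def find_cuda_tools(tools=("nvidia-smi", "nvcc")):
    """Map each tool name to its resolved path, or None if it is not on PATH.

    The function name and the default tool list are illustrative assumptions.
    """
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in find_cuda_tools().items():
        print(f"{tool}: {path or 'not found on PATH'}")
```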

conda + Jupyter Notebook + tensorflow-gpu environment setup

conda create -n nlp_gputf2 python=3.8 -y
conda activate nlp_gputf2
conda install ipykernel
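# bert4keras does not support newer TensorFlow versions
conda install tensorflow-gpu==2.2.0
pip install pandas
pip install matplotlib
# the PyPI package is named scikit-learn; "sklearn" is only the import name
pip install scikit-learn
pip install bert4keras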
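
To confirm what actually ended up in the environment, a small check like the one below can be run inside the `nlp_gputf2` kernel. This is a sketch under assumptions: the helper name `installed_versions` is hypothetical, and the distribution names in `PACKAGES` are guesses at how the packages register themselves. In particular, a conda install of `tensorflow-gpu` may register the distribution as `tensorflow`, so both names are listed.

```python
from importlib import metadata

# Distribution names are assumptions; adjust them to match your environment.
PACKAGES = ("tensorflow", "tensorflow-gpu", "pandas", "matplotlib",
            "scikit-learn", "bert4keras")


def installed_versions(packages=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

`importlib.metadata` is in the standard library from Python 3.8, which matches the interpreter version chosen for this environment.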

NVIDIA commands

nvidia-smi
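
If the same information is needed from a notebook cell rather than a terminal, `nvidia-smi` can be queried in CSV form and parsed. The sketch below is illustrative only: the helper names `parse_gpu_csv` and `query_gpus` are hypothetical, and the choice of fields (`name`, `driver_version`, `memory.total`) is an assumption about what is useful here. It returns an empty list when `nvidia-smi` is missing or fails.

```python
import subprocess

# Fields passed to nvidia-smi --query-gpu; this selection is an assumption.
QUERY_FIELDS = ("name", "driver_version", "memory.total")


def parse_gpu_csv(text, fields=QUERY_FIELDS):
    """Parse 'csv,noheader' output from nvidia-smi into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        values = [value.strip() for value in line.split(",")]
        if len(values) == len(fields):  # skip lines that do not match the query
            rows.append(dict(zip(fields, values)))
    return rows


def query_gpus():
    """Run nvidia-smi and return one dict per GPU, or [] if it is unavailable."""
    cmd = [
        "nvidia-smi",
        f"--query-gpu={','.join(QUERY_FIELDS)}",
        "--format=csv,noheader",
    ]
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return parse_gpu_csv(done.stdout)


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu)
```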

Test results

import tensorflow as tf
version = tf.__version__
gpu_ok = tf.test.is_gpu_available()
WARNING:tensorflow:From /tmp/ipykernel_5239/425579737.py:3: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.


2022-02-04 23:33:16.130812: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
2022-02-04 23:33:16.385570: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2304000000 Hz
2022-02-04 23:33:16.476626: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55d95c1d0a30 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-02-04 23:33:16.477265: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2022-02-04 23:33:16.526553: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2022-02-04 23:33:17.468088: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-02-04 23:33:17.468231: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3060 Laptop GPU computeCapability: 8.6
coreClock: 1.702GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
2022-02-04 23:33:17.477580: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2022-02-04 23:33:17.543341: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2022-02-04 23:33:17.591854: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2022-02-04 23:33:17.604233: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2022-02-04 23:33:17.699987: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2022-02-04 23:33:17.719334: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2022-02-04 23:33:17.898389: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2022-02-04 23:33:17.899437: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-02-04 23:33:17.900126: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-02-04 23:33:17.900167: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2022-02-04 23:33:17.900984: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2022-02-04 23:33:18.858296: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-02-04 23:33:18.858385: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0 
2022-02-04 23:33:18.858434: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N 
2022-02-04 23:33:18.860154: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-02-04 23:33:18.860241: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1330] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2022-02-04 23:33:18.860856: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-02-04 23:33:18.861517: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-02-04 23:33:18.861655: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/device:GPU:0 with 4846 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3060 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6)
2022-02-04 23:33:18.899505: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55d95ba19b20 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2022-02-04 23:33:18.899580: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce RTX 3060 Laptop GPU, Compute Capability 8.6
tf.config.list_physical_devices('GPU')
2022-02-04 23:33:23.729200: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-02-04 23:33:23.729447: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3060 Laptop GPU computeCapability: 8.6
coreClock: 1.702GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
2022-02-04 23:33:23.729685: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2022-02-04 23:33:23.729717: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2022-02-04 23:33:23.729732: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2022-02-04 23:33:23.729744: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2022-02-04 23:33:23.729756: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2022-02-04 23:33:23.729766: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2022-02-04 23:33:23.729780: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2022-02-04 23:33:23.731680: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-02-04 23:33:23.733292: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-02-04 23:33:23.733402: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0





[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
print("tf version:",version,"\nuse GPU",gpu_ok)
tf version: 2.2.0 
use GPU True
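
The deprecation warning in the log above suggests `tf.config.list_physical_devices('GPU')` in place of `tf.test.is_gpu_available()`. A minimal sketch of the same check written around that call might look like this. The function name `gpu_report` is hypothetical, and the fallback for a missing TensorFlow install is an added convenience, not something from the original session.

```python
import importlib


def gpu_report():
    """Return the TensorFlow version and the names of the visible GPUs.

    If TensorFlow cannot be imported, the version is None and the list is empty.
    """
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        return {"tensorflow": None, "gpus": []}
    gpus = tf.config.list_physical_devices("GPU")
    return {"tensorflow": tf.__version__, "gpus": [gpu.name for gpu in gpus]}


if __name__ == "__main__":
    report = gpu_report()
    print("tf version:", report["tensorflow"])
    print("use GPU", bool(report["gpus"]))
```

Unlike `is_gpu_available()`, this only lists the devices TensorFlow can see, so treat a non-empty list as "a GPU is visible" rather than proof that kernels run on it.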

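Version compatibility

You might think that is the end of it, but it is not. GPU support in TensorFlow and Keras often depends on the specific minor version, and each version corresponds to a different CUDA version.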
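
One way to see which CUDA libraries a given TensorFlow build actually loads is to read the `dso_loader` lines in its startup log. The sketch below does that with a regular expression. The helper name `loaded_cuda_libs` is hypothetical, and the sample lines are copied from the test log earlier in this post. In that session TensorFlow 2.2.0 opened `libcudart.so.10.1` and `libcudnn.so.7`, even though the CUDA 11.4 toolkit had been installed in WSL2. The likely explanation, stated here as an assumption, is that the conda package ships its own CUDA runtime.

```python
import re

# Matches e.g. "Successfully opened dynamic library libcudart.so.10.1"
LIB_PATTERN = re.compile(
    r"Successfully opened dynamic library (lib\w+)\.so\.([\d.]+)"
)

# Three lines taken from the TensorFlow log shown earlier in this post.
SAMPLE_LOG = """\
2022-02-04 23:33:17.477580: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2022-02-04 23:33:17.543341: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2022-02-04 23:33:17.898389: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
"""


def loaded_cuda_libs(log_text):
    """Return {library name: version suffix} for every library the log reports."""
    return {name: version for name, version in LIB_PATTERN.findall(log_text)}


if __name__ == "__main__":
    print(loaded_cuda_libs(SAMPLE_LOG))
    # {'libcudart': '10.1', 'libcublas': '10', 'libcudnn': '7'}
```

Comparing these versions with the compatibility tables linked below is a quick way to spot a mismatch.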

Download links for earlier CUDA versions:
https://developer.nvidia.com/cuda-toolkit-archive

TensorFlow version compatibility

https://tensorflow.google.cn/install/source_windows

PyTorch version compatibility

https://pytorch.org/get-started/previous-versions/


References

NVIDIA has, somewhat surprisingly, added yet another Docker developer document:

  • https://docs.nvidia.com/ai-enterprise/deployment-guide/dg-docker.html#enabling-the-docker-repository-and-installing-the-nvidia-container-toolkit

nvidia-docker 2.0 feels like just one more layer added on top.
