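This post mainly covers installing the GPU versions of the deep learning frameworks.
Prerequisites:
The CPU version of PyTorch installs without any problems (I have not tried Caffe).
The GPU version requires the NVIDIA graphics driver, CUDA, and cuDNN to be installed beforehand.
The versions must match: CUDA with the NVIDIA driver, cuDNN with CUDA, and the deep learning framework (torch, tensorflow, etc.) with CUDA. A mismatch in any of these can cause problems.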
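Since everything hinges on matching versions, it can help to record what is actually on the machine before picking a framework build. The following is only a rough sketch: the helper names parse_nvcc_release and run_if_present are my own, and it assumes nvidia-smi and nvcc are on the PATH once the driver and CUDA are installed (it prints "not found" otherwise).

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return the CUDA release string (e.g. '10.0') from `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def run_if_present(command):
    """Run a command and return its output, or None if it is missing or fails."""
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    # Driver version as reported by nvidia-smi (assumes the driver is installed).
    driver = run_if_present(["nvidia-smi", "--query-gpu=driver_version",
                             "--format=csv,noheader"])
    # Toolkit version as reported by nvcc (assumes CUDA's bin directory is on PATH).
    nvcc = run_if_present(["nvcc", "-V"])
    print("driver version:", driver.strip() if driver else "not found")
    print("CUDA toolkit  :", parse_nvcc_release(nvcc) if nvcc else "not found")
```

The two numbers it prints are the ones to compare against the compatibility tables published by NVIDIA and by the framework you plan to install.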
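【1】Install the graphics driver
I covered this in another post: https://blog.youkuaiyun.com/Cindy_lxy/article/details/89438123
【2】Install CUDA
Download: https://developer.nvidia.com/cuda-toolkit-archive
Download the runfile (local) installer.
Then run it directly: sh cuda.run
Note: do not select the Graphics Driver option during installation; selecting it reinstalls the graphics driver.
Test: nvcc -V
If the command is not found, the environment variables need to be configured. (Note: there is no need to run apt-get install nvidia-cuda-toolkit.)
Configuration: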
gedit ~/.bashrc
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/lib64:/usr/local/cuda/extras/CUPTI/lib64
Then reboot the computer (or run source ~/.bashrc in the current terminal). After that, nvcc -V should print the version.
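If nvcc is still not found after this, a quick way to see whether the variables actually took effect is to look at them from a fresh process. Here is a small sketch; missing_dirs is a helper name I made up, and the directories are simply the ones from the export lines above.

```python
import os


def missing_dirs(path_value, required_dirs):
    """Return the required directories that are absent from a path-style variable."""
    present = {os.path.normpath(p) for p in path_value.split(os.pathsep) if p}
    return [d for d in required_dirs if os.path.normpath(d) not in present]


if __name__ == "__main__":
    # Directories taken from the export lines in ~/.bashrc above.
    print("missing from PATH:",
          missing_dirs(os.environ.get("PATH", ""), ["/usr/local/cuda/bin"]))
    print("missing from LD_LIBRARY_PATH:",
          missing_dirs(os.environ.get("LD_LIBRARY_PATH", ""),
                       ["/usr/local/cuda/lib64",
                        "/usr/local/cuda/extras/CUPTI/lib64"]))
```

An empty list means the directory is already present; anything listed was not picked up by the current shell.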
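【3】Install cuDNN
Download: https://developer.nvidia.com/rdp/cudnn-download
Download "cuDNN Library for Linux".
Extract the archive (it unpacks into a cuda/ directory), then run the following commands (with sudo if /usr/local/cuda is not writable):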
cp cuda/include/cudnn.h /usr/local/cuda/include
cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
Test:
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
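The grep above prints the raw macro lines. If you want the version as a single value (for example to compare it against what a framework expects), the same header can be parsed in a few lines of Python. This is a sketch that assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain integers in cudnn.h; parse_cudnn_version is my own name.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from the text of cudnn.h, or None if a macro is missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+" + name + r"\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    # Same path as in the cp commands above.
    path = "/usr/local/cuda/include/cudnn.h"
    try:
        with open(path) as header:
            print("cuDNN version:", parse_cudnn_version(header.read()))
    except OSError:
        print("could not read", path)
```

A result of None means the macros were not found in that file, which is worth checking before blaming the framework.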
I. PyTorch
Simple and straightforward.
https://pytorch.org/get-started/locally/
Select your configuration on that page, copy the generated command, and run it.
Test the GPU:
import torch
torch.cuda.is_available()
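When is_available() returns False, a little more detail usually tells you which piece is mismatched. The sketch below collects it in one place; torch_gpu_report is a name I chose, and the function simply reports that torch is absent if it is not installed.

```python
import importlib.util


def torch_gpu_report():
    """Collect basic GPU facts from PyTorch; returns {'installed': False} if torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch
    available = torch.cuda.is_available()
    count = torch.cuda.device_count() if available else 0
    return {
        "installed": True,
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": available,
        "devices": [torch.cuda.get_device_name(i) for i in range(count)],
    }


if __name__ == "__main__":
    for key, value in torch_gpu_report().items():
        print(key, "=", value)
```

If built_for_cuda is None, the CPU-only package was installed; if it shows a CUDA version but cuda_available is False, the driver is the more likely culprit.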
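II. Caffe2
Caffe2 is now integrated into PyTorch.
First install the dependencies that Caffe2 needs:
https://caffe2.ai/docs/getting-started.html?platform=ubuntu&configuration=compile
On that page, follow the build-from-source instructions and install the dependencies listed there.
Then install PyTorch, and that is all.
Test: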
# To check if Caffe2 build was successful
python -c 'from caffe2.python import core' 2>/dev/null && echo "Success" || echo "Failure"
# To check if Caffe2 GPU build was successful
# This must print a number > 0 in order to use Detectron
python -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
2. Build Caffe
In the Caffe root directory:
mkdir build
cd build
cmake ..
make
Check the installation by training on the MNIST dataset:
cd ~/caffe/
./data/mnist/get_mnist.sh
./examples/mnist/create_mnist.sh
./examples/mnist/train_lenet.sh
III. TensorFlow
pip install tensorflow-gpu==1.11.0
1. Note: with NumPy 1.17.0, import tensorflow prints messages like the following (they are FutureWarnings raised by NumPy):
(tensorflow) lxy@mat:~/python/tensorflow/bin$ python
Python 3.5.2 (default, Nov 12 2018, 13:43:14)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
/home/lxy/python/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/lxy/python/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
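Downgrading NumPy to 1.16.0 fixes this.
2. If you get an error like this: ImportError: libcusolver.so.8.0: cannot open shared object file: No such file or directory
It is usually caused by a mismatch between the CUDA and TensorFlow versions. The .8.0 suffix means CUDA 8 is required. Fix: install the matching CUDA version, or look up which CUDA version your TensorFlow version requires.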
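Reading the required version off the error message can also be done mechanically, which is handy when the message is buried in a long log. A minimal sketch, assuming the message names the library in the usual libNAME.so.VERSION form; missing_cuda_library is a name I made up.

```python
import re


def missing_cuda_library(error_message):
    """Return (library, version) named in a 'cannot open shared object file' error, or None."""
    match = re.search(r"lib(\w+)\.so\.(\d+(?:\.\d+)*)", error_message)
    return (match.group(1), match.group(2)) if match else None


if __name__ == "__main__":
    message = ("ImportError: libcusolver.so.8.0: cannot open shared object file: "
               "No such file or directory")
    print(missing_cuda_library(message))
```

For the message above it returns the library name cusolver and the version string 8.0, that is, the TensorFlow build was compiled against CUDA 8.0.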
Test whether the GPU is available:
import tensorflow as tf
tf.test.is_gpu_available()
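To see which devices TensorFlow actually picked up, rather than just True or False, listing them can be more informative. This is a sketch under the assumption that the tensorflow.python.client.device_lib module is importable in your TensorFlow version (it is an internal module, so treat it as a debugging aid); tensorflow_devices is my own helper name.

```python
import importlib.util


def tensorflow_devices():
    """Return the device names TensorFlow can see, or None if tensorflow is not installed."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    from tensorflow.python.client import device_lib
    return [device.name for device in device_lib.list_local_devices()]


if __name__ == "__main__":
    devices = tensorflow_devices()
    if devices is None:
        print("tensorflow is not installed")
    else:
        for name in devices:
            print(name)
```

If only a CPU device is listed, TensorFlow did not find a usable GPU, and the CUDA and cuDNN versions from the steps above are the first thing to recheck.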