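Hardware: an ordinary desktop PC; Ubuntu 16.04; one GTX 1080 GPU with 8 GB of memory
Note: (for any questions, contact QQ: 1657580398. 2018.12.29 11:42)
#**#: The environment is set up from scratch on a bare Ubuntu 16.04 install:
#**#: Anaconda3 ships with its own Python; download the release that matches the Python version you need. Anaconda is a powerful environment management tool
#**#: Remember that the GPU driver version must be greater than or equal to the driver version embedded in the CUDA installer's file name (for example, 384.81 in cuda_9.0.176_384.81_linux.run)
Install everything in the order given!!!
1. Install Anaconda3
Download Anaconda3 (bundles Python 3.5), as shown in the figure below
Choose a different release if you need another Python version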
https://repo.continuum.io/archive/
Create an anaconda3 folder to hold the installer and enter it (the directory that contains Anaconda3-4.2.0-Linux-x86_64.sh)
mkdir anaconda3
cd anaconda3
Run the downloaded Anaconda3 installer:
bash Anaconda3-4.2.0-Linux-x86_64.sh
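During installation, keep answering yes or pressing Enter
The installer asks whether to add Anaconda's Python to PATH in ~/.bashrc; keep this in mind, since it changes which python your shell uses
Reload the environment variables; after this, the default python and pip are the Python 3.5 versions bundled with Anaconda3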
source ~/.bashrc
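2. Install the GPU driver:
Download the driver: https://www.nvidia.cn/Download/index.aspx?lang=cn
CUDA 9.0 only installs correctly with NVIDIA driver 384 or later; the driver version must be greater than or equal to the driver version in the CUDA installer's file name
As shown in the figure below, look up the driver for your GPU on the NVIDIA website. The latest for the 1080 Ti is 390.67; I used the older 384.90 driver for the 1080 Ti, and 384.x drivers generally work fine on the 1080 Ti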
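The version rule above can be checked mechanically. The following is a minimal sketch, assuming GNU sort (for -V) and GNU sed (for -E) as shipped with Ubuntu; the helper names version_ge and cuda_run_min_driver are hypothetical and not part of any NVIDIA tool.

```shell
# Succeed (exit 0) if version $1 is greater than or equal to version $2.
# sort -V orders version strings; if the smaller of the two is $2, then $1 >= $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the driver version embedded in a CUDA installer file name,
# e.g. cuda_9.0.176_384.81_linux.run -> 384.81
cuda_run_min_driver() {
    basename "$1" | sed -E 's/^cuda_[0-9.]+_([0-9.]+)_linux\.run$/\1/'
}

# Example (not run here): compare the installed driver with the installer's requirement.
#   installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   version_ge "$installed" "$(cuda_run_min_driver cuda_9.0.176_384.81_linux.run)" && echo OK
```

With the versions mentioned in this post, version_ge 384.90 384.81 succeeds, so the 384.90 driver satisfies the CUDA 9.0 installer.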
<1> Check whether a driver is already installed:
nvidia-smi
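If the command is not found or prints an error, the NVIDIA driver is not installed or is broken
If the driver is working, you will see output like the figure below (the driver is already installed, so you can skip the driver installation)
<2> Open a terminal and remove the old driver first: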
sudo apt-get purge 'nvidia*'
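<3> Disable the built-in nouveau driver
Use the command below to check whether nouveau is already disabled. If it prints nothing, nouveau is already disabled; otherwise continue with <3.1>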
lsmod | grep nouveau
<3.1> Create a file (note: in vim, press i to enter insert mode before typing)
sudo vim /etc/modprobe.d/blacklist-nouveau.conf
and add the following content:
blacklist nouveau
options nouveau modeset=0
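Note: to save and quit vim, use either of the following:
1. Press Esc, then press Shift+Z twice (ZZ), or
2. Press Esc, then type :wq! and press Enter
Then rebuild the initramfs: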
sudo update-initramfs -u
Reboot, then confirm that nouveau is disabled:
lsmod | grep nouveau
No output means nouveau has been disabled successfully
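The vim steps in <3.1> and the lsmod check can also be done without an editor. This is a sketch under assumptions: the function names are hypothetical, and the file is written to a path you pass in, so copying it into /etc/modprobe.d/ still needs sudo.

```shell
# Write the two blacklist lines to the file given as $1.
# The real target is /etc/modprobe.d/blacklist-nouveau.conf (needs root).
write_nouveau_blacklist() {
    printf 'blacklist nouveau\noptions nouveau modeset=0\n' > "$1"
}

# Read lsmod-style output on stdin; succeed only if a module named
# exactly "nouveau" appears in the first column.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Example (not run here):
#   write_nouveau_blacklist /tmp/blacklist-nouveau.conf
#   sudo cp /tmp/blacklist-nouveau.conf /etc/modprobe.d/blacklist-nouveau.conf
#   lsmod | nouveau_loaded && echo "nouveau is still loaded"
```

Matching on the first column avoids false positives from other modules that merely list nouveau as a dependent.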
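<4> Stop the X Window service:
Press Ctrl+Alt+F1 to switch to a text console without the desktop. The commands below are for the lightdm display manager; yours may be gdm or kdm, in which case use Ctrl+Alt+F4
To find out which display manager you have, run: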
cat /etc/X11/default-display-manager
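Since the file printed above holds the full path of the display manager binary (for example /usr/sbin/lightdm), the service name can be derived from it. A small sketch; display_manager_name is a hypothetical helper, and it assumes the service is named after the binary.

```shell
# Print the display manager name (e.g. lightdm) from the file given as $1,
# defaulting to /etc/X11/default-display-manager.
display_manager_name() {
    basename "$(cat "${1:-/etc/X11/default-display-manager}")"
}

# Example (not run here):
#   sudo service "$(display_manager_name)" stop
```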
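Next, stop the display manager. I recommend photographing the next few steps with your phone and following the photo, because you may not be familiar with them. Remember where you saved NVIDIA-Linux-x86_64-384.90.run; I recommend /home/<username>/tmp (a tmp folder you create yourself for temporary files, which can be deleted after installation)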
sudo service lightdm stop
The desktop disappears and you land on a console login:
Login: your user name
Password: your password
Install:
cd /home/ljm/tmp # tmp is the folder where I put NVIDIA-Linux-x86_64-384.90.run
sudo sh NVIDIA-Linux-x86_64-384.90.run
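Follow these steps in the installer:
(1) Accept
(2) Continue installation
After that, accept the default yes for the remaining prompts
Restart the display manager (lightdm is just mine; yours may be gdm or kdm as mentioned above):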
sudo service lightdm start
Then press Ctrl+Alt+F7 to return to the desktop
Check that the installation succeeded:
nvidia-smi
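3. Install CUDA 9.0:
First download CUDA 9.0 from the official site; choose the 1.6 GB .run file. Once it has downloaded, you can start the installation.
For other versions, see: https://developer.nvidia.com/cuda-toolkit-archive
<1> Run the downloaded CUDA installer:
Go to the download directory and make the file executable: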
chmod +x ./cuda_9.0.176_384.81_linux.run
Run the installer:
sudo ./cuda_9.0.176_384.81_linux.run
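When the installer starts, hold Space to page through the license to the end (or press q to skip it), then type accept to accept the terms
1. Enter n to skip the NVIDIA graphics driver, which is already installed
2. Enter n to skip the OpenGL libraries (this prompt may not appear; if it does, do not install OpenGL on servers with two or more GPUs)
3. Enter y to install the CUDA 9.0 Toolkit
4. Press Enter to accept the default CUDA install path: /usr/local/cuda-9.0
5. Enter y to run the installation with sudo and type your password (some users may not see this prompt)
6. Enter y to create a symbolic link at /usr/local/cuda
7. Enter y to install the CUDA 9.0 Samples for testing later (optional)
Press Enter to accept the default CUDA 9.0 Samples path; that directory can be deleted after testing (this does not matter much)
<2> After installation, add CUDA to the environment variables. This step is very important!!!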
gedit ~/.bashrc
Append these two lines to the end of the file:
export PATH=/usr/local/cuda-9.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:$LD_LIBRARY_PATH
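<3> Run source ~/.bashrc to apply the environment variables, then reboot the computer
<4> Press Ctrl+Alt+T to open a new terminal and run nvcc -V. If it prints the CUDA compiler version information (release 9.0), the installation succeeded.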
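The two export lines in <2> prepend a directory to a colon-separated list. Sourcing ~/.bashrc repeatedly adds the same entry again, and an empty LD_LIBRARY_PATH leaves a stray trailing colon. The sketch below shows one way to avoid both; prepend_dir is a hypothetical helper, not something the CUDA installer provides.

```shell
# Print the colon-separated list $2 with directory $1 prepended.
# If $1 is already in the list, print the list unchanged.
prepend_dir() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;            # already present: unchanged
        *)        printf '%s\n' "$1${2:+:$2}" ;;   # prepend; no stray colon when $2 is empty
    esac
}

# Example for ~/.bashrc (not run here):
#   export PATH="$(prepend_dir /usr/local/cuda-9.0/bin "$PATH")"
#   export LD_LIBRARY_PATH="$(prepend_dir /usr/local/cuda-9.0/lib64 "$LD_LIBRARY_PATH")"
```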
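4. Install cuDNN 7.0.5:
<1> Register an account and download cuDNN 7.0.5 from the official site
Choose the Linux build, i.e. the fourth entry in the figure!
After downloading, extract the archive; this produces a cuda folder (double-check the file name first)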
tar -xzvf cudnn-9.0-linux-x64-v7.tar.gz
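It contains two folders, include and lib64; copy the files in them to the matching directories under /usr/local/cuda/. If you extracted the archive directly under /usr/local, there is nothing to move. Then just give the files read permission:
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*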
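The copy step described above is not shown as commands, so here is a sketch of it. It assumes the archive was extracted into a cuda folder as described; install_cudnn_files is a hypothetical helper name, and writing into /usr/local/cuda requires root.

```shell
# Copy the cuDNN header and libraries from the extracted folder $1
# into the CUDA root $2, then make them readable by all users.
install_cudnn_files() {
    cp "$1"/include/cudnn.h "$2"/include/ &&
    cp -P "$1"/lib64/libcudnn* "$2"/lib64/ &&   # -P keeps symlinks as symlinks
    chmod a+r "$2"/include/cudnn.h "$2"/lib64/libcudnn*
}

# Equivalent commands for the real target (not run here):
#   sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
#   sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64/
```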
Then update the symbolic links:
cd /usr/local/cuda/lib64/
sudo chmod +r libcudnn.so.7.0.5 # check the version number of your own .so file
sudo ln -sf libcudnn.so.7.0.5 libcudnn.so.7
sudo ln -sf libcudnn.so.7 libcudnn.so
sudo ldconfig
Check the cuDNN version to verify the installation:
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
cuDNN is now installed.
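The grep above prints the three raw #define lines. If a single version string is more convenient, the sketch below assembles it, assuming the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain integers (as cuDNN 7 does); cudnn_version is a hypothetical helper name.

```shell
# Print the cuDNN version as MAJOR.MINOR.PATCH from the header given as $1,
# defaulting to /usr/local/cuda/include/cudnn.h.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }' "${1:-/usr/local/cuda/include/cudnn.h}"
}
```

For the version installed in this post, cudnn_version should print 7.0.5.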
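5. Switch pip to a mirror in China (skip this if you are short on time, but it is worthwhile in the long run)
Why do this?
Answer: it fixes slow package downloads. By default pip fetches packages from servers outside China, which is extremely slow from within China, so we switch to a domestic mirror. I recommend the Tsinghua (TUNA) mirror
Temporary use: add the option -i https://pypi.tuna.tsinghua.edu.cn/simple when running pip
For example, this installs numpy from the Tsinghua mirror: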
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple numpy
Permanent change, once and for all (Linux and Windows differ):
Linux: edit ~/.pip/pip.conf (create it if it does not exist) and set index-url to the TUNA mirror, with the following content:
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
Windows: create a pip directory in your user directory, e.g. C:\Users\xx\pip, and create a file pip.ini in it with the following content:
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
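For the Linux case, the config file can also be created from the shell instead of an editor. A minimal sketch; write_pip_conf is a hypothetical helper, and it overwrites any existing file at the given path.

```shell
# Write a pip config that points index-url at the Tsinghua mirror.
# $1: config file path, e.g. ~/.pip/pip.conf
write_pip_conf() {
    mkdir -p "$(dirname "$1")"
    printf '[global]\nindex-url = https://pypi.tuna.tsinghua.edu.cn/simple\n' > "$1"
}

# Example (not run here):
#   write_pip_conf ~/.pip/pip.conf
```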
6. Install tensorflow-gpu
Install the GPU build of TensorFlow:
pip install tensorflow-gpu
Install a specific version of tensorflow-gpu, e.g. 1.8.0:
pip install tensorflow-gpu==1.8.0
If you do not want the GPU build, install the CPU-only build:
pip install tensorflow
7. Install keras
Install keras:
pip install keras
Install a specific version of keras, e.g. 2.2.0:
pip install keras==2.2.0
8. Install caffe (built with Ubuntu's system python2.7 or python3.5)
8.1 Install the required dependencies
sudo apt-get install build-essential
sudo apt-get install --no-install-recommends libboost-all-dev
sudo apt-get install libatlas-base-dev libgflags-dev libgoogle-glog-dev liblmdb-dev
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install libblas-dev liblapack-dev libatlas-base-dev gfortran python-numpy
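8.2 Install pip
# To build caffe with the system python2.7
sudo apt-get install python-pip
# To build caffe with the system python3.5
sudo apt-get install python3-pip
8.3 Download caffe and install the Python dependencies
(1) Install git and download the caffe source: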
sudo apt-get install git
git clone https://github.com/BVLC/caffe.git
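(2) Install the dependencies listed in the python folder under the caffe root, downloading from the Tsinghua mirror (some readers may not have switched pip to a domestic mirror)
cd caffe
for req in $(cat python/requirements.txt); do sudo pip install -i https://pypi.tuna.tsinghua.edu.cn/simple "$req"; done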
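8.4 Build caffe
(1) Copy Makefile.config.example to Makefile.config and edit it
cd ~/caffe
cp Makefile.config.example Makefile.config
gedit Makefile.config
(2) The complete Makefile.config after modification
The places to change are marked below with # ---***number***--- (markers 1 through 9)
Important: this uses the system Python. Building caffe against Anaconda's Python is not covered in detail here because the procedure is the same
***6*** builds with python2.7;
***7*** builds with python3.5;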
## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!
# cuDNN acceleration switch (uncomment to build with cuDNN).
# ---***1***---
USE_CUDNN := 1
# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1 # ---***2***---
# uncomment to disable IO dependencies and corresponding data layers
# ---***3***---
USE_OPENCV := 1
# USE_LEVELDB := 0
# USE_LMDB := 0
# This code is taken from https://github.com/sh1r0/caffe-android-lib
# USE_HDF5 := 0
# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
# You should not set this flag if you will be reading LMDBs with any
# possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1
# Uncomment if you're using OpenCV 3
# ---***4***---
OPENCV_VERSION := 3
# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++
# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr
# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 through *_61 lines for compatibility.
# For CUDA < 8.0, comment the *_60 and *_61 lines for compatibility.
# For CUDA >= 9.0, comment the *_20 and *_21 lines for compatibility. # delete *_20 and *_21------***5***------
CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_50,code=sm_50 \
-gencode arch=compute_52,code=sm_52 \
-gencode arch=compute_60,code=sm_60 \
-gencode arch=compute_61,code=sm_61 \
-gencode arch=compute_61,code=compute_61
# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := atlas
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
# BLAS_INCLUDE := /path/to/your/blas
# BLAS_LIB := /path/to/your/blas
# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib
# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
# MATLAB_DIR := /usr/local
# MATLAB_DIR := /Applications/MATLAB_R2012b.app
# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
# ---***6***--- -py27
PYTHON_INCLUDE := /usr/include/python2.7 \
		/usr/lib/python2.7/dist-packages/numpy/core/include \
		/usr/local/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
# ANACONDA_HOME := $(HOME)/anaconda
# PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
# $(ANACONDA_HOME)/include/python2.7 \
# $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include
# Uncomment to use Python 3 (default is Python 2)
# ---***7***--- -py35
#PYTHON_LIBRARIES := boost_python-py35 python3.5m
#PYTHON_INCLUDE := /usr/include/python3.5m \
#                 /usr/lib/python3.5/dist-packages/numpy/core/include
# We need to be able to find libpythonX.X.so or .dylib.
PYTHON_LIB := /usr/lib
# PYTHON_LIB := $(ANACONDA_HOME)/lib
# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib
# Uncomment to support layers written in Python (will link against Python libs)
# ---***8***---
WITH_PYTHON_LAYER := 1
# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial # ---***9***---
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial # ---***9***---
# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib
# NCCL acceleration switch (uncomment to build with NCCL)
# https://github.com/NVIDIA/nccl (last tested version: v1.2.3-1+cuda8.0)
# USE_NCCL := 1
# Uncomment to use `pkg-config` to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1
# N.B. both build and distribute dirs are cleared on `make clean`
BUILD_DIR := build
DISTRIBUTE_DIR := distribute
# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
# DEBUG := 1
# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0
# enable pretty build (comment to see full commands)
Q ?= @
(3) Modify the Makefile
Changes to the Makefile:
LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_hl hdf5
change to:
LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_serial_hl hdf5_serial
NVCCFLAGS +=-ccbin=$(CXX) -Xcompiler-fPIC $(COMMON_FLAGS)
change to:
NVCCFLAGS += -D_FORCE_INLINES -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)
(4) Start the build
make all -j8 # 8 parallel jobs; more jobs build faster, up to the number of CPU cores
make test -j8
make runtest -j8
make pytest -j8
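Rather than hard-coding 8, the job count can follow the number of CPU cores on the machine. A small sketch, assuming nproc from GNU coreutils is available (it is on Ubuntu); build_jobs is a hypothetical helper name.

```shell
# Print the number of available CPU cores, falling back to 1
# if nproc is missing.
build_jobs() {
    nproc 2>/dev/null || echo 1
}

# Example (not run here):
#   make all -j"$(build_jobs)"
```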
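(5) Add the caffe library to the environment variables:
# 1. Edit ~/.bashrc
gedit ~/.bashrc
# 2. Append the following line to the end of the file, replacing ljm with your own username
export PYTHONPATH=/home/ljm/caffe/python:$PYTHONPATH
# 3. Save the file and close the editor
# 4. Apply the environment variables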
source ~/.bashrc
(6) Test the installation:
ljm@ljm:~/caffe$ python
Python 2.7.12 (default, Nov 12 2018, 14:36:49)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import caffe
>>> exit(0)
END! !
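About uninstalling:
1. Uninstall CUDA and cuDNN
If you need to upgrade CUDA, I recommend uninstalling first and then reinstalling; otherwise you may run into problems.
(1) Uninstall CUDA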
sudo /usr/local/cuda-9.0/bin/uninstall_cuda_9.0.pl
(2) Uninstall cuDNN
Delete the existing cuDNN files:
sudo rm -rf /usr/local/cuda-9.0/lib64/libcudnn*
sudo rm -rf /usr/local/cuda-9.0/include/cudnn.h
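To install again, follow the installation steps above.
Acknowledgements
Many thanks to Wu Guang (武广) of Hefei University of Technology for co-writing this blog post!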