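I have tested the following method successfully on x86. On the Nvidia TX series, step 3 fails, but the shared library is still generated. If you are interested, try whether it works for you; in my tests it does.
Environment: Ubuntu 16.04 LTS, CUDA 8.0, cuDNN 6.0.10, TensorFlow 1.3, Python 2
1. Install dependencies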
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update && sudo apt-get -y dist-upgrade
sudo apt-get install oracle-java8-installer -y
sudo apt-get install zip unzip autoconf automake libtool curl zlib1g-dev maven -y
sudo apt-get install python-numpy python-pip python-dev python-wheel
//--------For Python 2.7--------
sudo apt-get install python-pip python-numpy swig python-dev
sudo pip install wheel
//--------For Python 3.3+--------
sudo apt-get install python3-pip python3-numpy swig python3-dev
sudo pip3 install wheel
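Before moving on, it can help to confirm that the tools from this step are actually on the PATH. The sketch below is only an illustration: the helper name `missing_tools` and the list of commands it checks are my own choices, not part of the original procedure.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every command name passed as an argument
# that cannot be found on the PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Commands installed in this step (list chosen for illustration).
missing="$(missing_tools java zip unzip autoconf automake curl swig)"
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "All checked tools are installed."
fi
```

Anything the helper prints would need to be installed before continuing with the Bazel build.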
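2. Install Bazel
A note on Bazel versions:
I tried several versions (0.6.1, 0.4.5, and 0.5.2) and settled on 0.5.2. Version 0.6.1 gave the most trouble during installation; it was long enough ago that I no longer remember the exact problems.
Bazel downloads: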
https://github.com/bazelbuild/bazel/releases
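Lately I have not been able to get past the firewall, and downloading it myself was extremely slow, so I asked a classmate on campus to download it. I uploaded my local copy to CSDN for anyone who needs it: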
http://download.youkuaiyun.com/download/ycdhqzhiai/10130322
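Installation: simple once the archive is extracted (do not extract it in place; mkdir a directory first and extract into that)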
./compile.sh
The build takes a while. When it finishes, copy the generated binary into bin:
cp output/bazel /usr/local/bin/
(Phew!!! Bazel alone can eat up a whole day...)
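The extraction advice and a quick version check could be sketched as follows. The archive name `bazel-0.5.2-dist.zip`, the directory name, and the helper `bazel_label` are assumptions made for illustration; adjust them to whatever file was actually downloaded.

```shell
#!/usr/bin/env bash
# Assumed archive and directory names; change them to match the download.
ZIP="bazel-0.5.2-dist.zip"
DEST="bazel-0.5.2-dist"

# Unpack into a dedicated directory, as advised above, so the files
# do not spill into the current directory.
if [ -f "$ZIP" ]; then
    mkdir -p "$DEST"
    unzip -q "$ZIP" -d "$DEST"
fi

# Hypothetical helper: pull the "Build label" value out of
# `bazel version` output read on stdin.
bazel_label() {
    sed -n 's/^Build label: *//p'
}

# After installation, report which Bazel is on the PATH.
if command -v bazel >/dev/null 2>&1; then
    echo "Installed Bazel: $(bazel version 2>/dev/null | bazel_label)"
fi
```

If the reported label is not the version you intended to build, an older binary earlier on the PATH is a likely explanation.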
3. Build TensorFlow
git clone --recursive https://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout v1.3.0
Edit the tensorflow/BUILD file and append the following at the end:
# Added build rule
cc_binary(
    name = "libtensorflow_all.so",
    linkshared = 1,
    linkopts = ["-Wl,--version-script=tensorflow/tf_version_script.lds"],  # Remove this line if you are using MacOS
    deps = [
        "//tensorflow/core:framework_internal",
        "//tensorflow/core:tensorflow",
        "//tensorflow/cc:cc_ops",
        "//tensorflow/cc:client_session",
        "//tensorflow/cc:scope",
        "//tensorflow/c:c_api",
    ],
)
./configure
Configuration choices:
You have bazel 0.5.2- installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python
Do you wish to build TensorFlow with MKL support? [y/N] N
No MKL support will be enabled for TensorFlow
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? [Y/n] Y
jemalloc enabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] N
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] N
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] N
No XLA support will be enabled for TensorFlow
Do you wish to build TensorFlow with VERBS support? [y/N] N
No VERBS support will be enabled for TensorFlow
Do you wish to build TensorFlow with OpenCL support? [y/N] N
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] N
No CUDA support will be enabled for TensorFlow
Do you wish to build TensorFlow with MPI support? [y/N] N
MPI support will not be enabled for TensorFlow
Configuration finished
Build the shared library:
bazel build tensorflow:libtensorflow_all.so
cp bazel-bin/tensorflow/libtensorflow_all.so /usr/local/lib/
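A quick look at the exported symbols can confirm that the copied library is usable for linking. This is a minimal sketch: the helper name `exported_count` is hypothetical, and `tensorflow::NewSession` is used only as a representative C++ API symbol to search for.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many defined dynamic symbols in shared
# library $1 contain the text $2 (after C++ demangling). Prints 0 when the
# file is missing or nothing matches.
exported_count() {
    if [ ! -f "$1" ]; then
        echo 0
        return 0
    fi
    nm -D -C --defined-only "$1" 2>/dev/null | grep -c -- "$2" || true
}

LIB=/usr/local/lib/libtensorflow_all.so
n="$(exported_count "$LIB" 'tensorflow::NewSession')"
if [ "$n" -gt 0 ]; then
    echo "$LIB exports $n symbol(s) matching tensorflow::NewSession"
else
    echo "$LIB is missing or exports no matching symbols"
fi
```

If a program later fails to start because the loader cannot find the library, running `sudo ldconfig` after the copy is the usual remedy on Ubuntu.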
4. Build Protobuf and Eigen
At this point the current directory should be the root of the TensorFlow source tree.
mkdir /tmp/proto
tensorflow/contrib/makefile/download_dependencies.sh
cd tensorflow/contrib/makefile/downloads/protobuf/
./autogen.sh
./configure --prefix=/tmp/proto/
make
make install
mkdir /tmp/eigen
cd ../eigen
mkdir build_dir
cd build_dir
cmake -DCMAKE_INSTALL_PREFIX=/tmp/eigen/ ../
make install
cd ../../../../../..
5. Add the include files
sudo mkdir -p /usr/local/include/google/tensorflow/tensorflow
cp -r bazel-genfiles/* /usr/local/include/google/tensorflow
cp -r tensorflow/cc /usr/local/include/google/tensorflow/tensorflow
cp -r tensorflow/core /usr/local/include/google/tensorflow/tensorflow
cp -r third_party /usr/local/include/google/tensorflow
cp -r /tmp/proto/include/* /usr/local/include/google/tensorflow
cp -r /tmp/eigen/include/eigen3/* /usr/local/include/google/tensorflow
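After the copies above, a few representative headers should be present under the include root. The sketch below checks for them; the helper name `missing_headers` and the particular headers chosen are my own picks for illustration, one from each source that was copied.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list which representative headers are absent
# under the include root given as $1.
missing_headers() {
    local root="$1" h
    for h in \
        tensorflow/core/public/session.h \
        tensorflow/cc/client/client_session.h \
        tensorflow/core/framework/graph.pb.h \
        google/protobuf/message.h \
        Eigen/Core
    do
        [ -e "$root/$h" ] || echo "$h"
    done
}

# Empty output means every checked header was found.
missing_headers /usr/local/include/google/tensorflow
```

A missing `graph.pb.h` would point at the bazel-genfiles copy, a missing `message.h` at the Protobuf install, and a missing `Eigen/Core` at the Eigen install.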
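At this point the TensorFlow shared library build is essentially done. The .so file is in /usr/local/lib and the headers are under /usr/local/include/google/tensorflow/. With the .so and the .h files in place, go ahead and test. (I have verified this myself: extracting image features through the TensorFlow C++ API gives the same results as extracting them in Python, apart from differences in numeric precision.)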
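For the test itself, a compile command along the following lines is one possibility. It is a sketch under assumptions: `main.cpp` is a hypothetical source file, the helper `tf_compile_cmd` only prints the command rather than running it, and the flags (C++11, a single include root, an rpath to /usr/local/lib) may need adjusting for your program.

```shell
#!/usr/bin/env bash
INC=/usr/local/include/google/tensorflow   # include root populated in step 5
LIBDIR=/usr/local/lib                      # where libtensorflow_all.so was copied

# Hypothetical helper: print a g++ command line that compiles source file $1
# against the headers and shared library installed above.
tf_compile_cmd() {
    echo "g++ -std=c++11 $1 -I$INC -L$LIBDIR -ltensorflow_all -Wl,-rpath,$LIBDIR -o ${1%.cpp}"
}

# main.cpp is a placeholder name for your own program.
tf_compile_cmd main.cpp
```

Printing the command first makes it easy to inspect or paste into a build script before actually running it.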
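========= divider ==========
Now for the problem mentioned at the start with using the .so on the TX series. When I build there, step 3 fails. Here are the error messages first.
Error on the TX1: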
/home/nvidia/tensorflow/tensorflow/BUILD:458:1: error loading package 'tensorflow/c': Encountered error while reading extension file 'protobuf.bzl': no such package '@protobuf//': java.io.IOException: Error downloading [https://github.com/google/protobuf/archive/0b059a3d8a8f8aa40dde7bea55edca4ec5dfea66.tar.gz, http://mirror.bazel.build/github.com/google/protobuf/archive/0b059a3d8a8f8aa40dde7bea55edca4ec5dfea66.tar.gz] to /home/nvidia/.cache/bazel/_bazel_nvidia/d2751a49dacf4cb14a513ec663770624/external/protobuf/0b059a3d8a8f8aa40dde7bea55edca4ec5dfea66.tar.gz: Checksum was e5fdeee6b28cf6c38d61243adff06628baa434a22b5ebb7432d2a7fbabbdb13d but wanted 6d43b9d223ce09e5d4ce8b0060cb8a7513577a35a64c7e3dad10f0703bf3ad93 and referenced by '//tensorflow:libtensorflow_all.so'.
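This error did not appear on the TX2, possibly because I had installed protobuf on the TX2 earlier; I do not know the actual cause.
Fix: run this on the command line:
sed -i '\@https://github.com/google/protobuf/archive/0b059a3d8a8f8aa40dde7bea55edca4ec5dfea66.tar.gz@d' tensorflow/workspace.bzl
Then resume the build.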
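To see what that sed command does: `\@...@d` uses `@` as the pattern delimiter, so the slashes in the URL need no escaping, and `d` deletes every matching line. Presumably this leaves only the mirror.bazel.build URL for Bazel to download from; that reading is my assumption. The sketch below applies the same kind of deletion to a throwaway file rather than to workspace.bzl, with `EXAMPLE` standing in for the real commit hash.

```shell
#!/usr/bin/env bash
# Build a small stand-in file; its contents are shortened placeholders,
# not the real workspace.bzl.
demo="$(mktemp)"
cat > "$demo" <<'EOF'
urls = [
    "https://github.com/google/protobuf/archive/EXAMPLE.tar.gz",
    "http://mirror.bazel.build/github.com/google/protobuf/archive/EXAMPLE.tar.gz",
],
EOF

# Delete the line with the GitHub URL. The pattern starts at "https://",
# so the mirror line, which also contains "github.com", is left alone.
sed -i '\@https://github.com/google/protobuf/archive/EXAMPLE.tar.gz@d' "$demo"

# Show what is left: only the mirror URL remains in the list.
cat "$demo"
```

Running `git diff tensorflow/workspace.bzl` after the real command is an easy way to confirm that only the intended line was removed.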
Error common to both the TX2 and the TX1:
ERROR: /home/nvidia/tensorflow/tensorflow/core/kernels/BUILD:2157:1: C++ compilation of rule '//tensorflow/core/kernels:matrix_solve_ls_op' failed: gcc failed: error executing command /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -B/usr/bin -B/usr/bin -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG … (remaining 106 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 4.
gcc: internal compiler error: Killed (program cc1plus)
I spent a long time on this error and still have not solved it. If anyone knows what the problem is, please leave a comment. Many thanks!!!
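One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: a compiler process reported as `Killed` has usually been terminated from outside, and on boards with limited RAM the kernel's out-of-memory killer is a common culprit. I have not verified this on TX hardware. The sketch below shows a hypothetical helper, `compiler_was_killed`, that spots this signature in a saved build log, then prints current memory and retries with Bazel's `--jobs=1` so only one compile runs at a time.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the build output on stdin shows a compiler
# process that was killed, as opposed to an ordinary compile error.
compiler_was_killed() {
    grep -q 'internal compiler error: Killed'
}

# Show how much RAM and swap are currently available.
if command -v free >/dev/null 2>&1; then
    free -m
fi

# Retry with a single parallel job to lower peak memory use. Runs only
# inside a TensorFlow source tree on a machine where Bazel is installed.
if command -v bazel >/dev/null 2>&1 && [ -f WORKSPACE ] && [ -f tensorflow/BUILD ]; then
    bazel build --jobs=1 tensorflow:libtensorflow_all.so 2>&1 | tee build.log
    if compiler_was_killed < build.log; then
        echo "Compiler was killed again; memory may still be the limit."
    fi
fi
```

If memory does turn out to be the cause, adding swap space on the board is another option people commonly try.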
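Even so, if you cd bazel-bin/tensorflow at this point, you can see that libtensorflow_all.so has already been generated; only the header files have not been.
Being a beginner who could not solve the problem, I fell back on the crudest approach:
Copy the header files (the whole google folder) from the x86_64 machine to /usr/local/include/ on the TX2.
Copy the generated .so file to /usr/local/lib.
That way both the .so and the .h files are in place!!!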
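Moving the header folder between machines could be done with a tar archive, as sketched below. The helper names `pack_headers` and `unpack_headers` and the archive path `/tmp/google-headers.tgz` are hypothetical; how the archive travels to the TX2 (scp, USB stick) is up to you.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: pack the "google" folder found under directory $1
# into archive $2, and unpack archive $1 beneath directory $2.
pack_headers()   { tar -C "$1" -czf "$2" google; }
unpack_headers() { mkdir -p "$2" && tar -C "$2" -xzf "$1"; }

# On the x86_64 machine: archive /usr/local/include/google if it exists.
if [ -d /usr/local/include/google ]; then
    pack_headers /usr/local/include /tmp/google-headers.tgz
fi

# On the TX2, after copying the archive over (needs root to write there):
#   unpack_headers /tmp/google-headers.tgz /usr/local/include
```

Using one archive keeps the directory layout intact, which matters because the includes are resolved relative to the google/tensorflow root.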
In later tests extracting image features, the program still compiled, and the printed feature values looked correct.