Six Ways to Get TensorFlow Running Quickly
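TensorFlow (http://tensorflow.org) is a computation engine for deep learning. It can of course be installed and run directly on the host, which also gives the best performance. However, machine learning work depends on a large amount of software, which brings package management and version compatibility problems, and running in a cluster is more complicated still. For these reasons a direct installation is not recommended. For reference: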
- Installing TensorFlow on Linux, https://www.tensorflow.org/install/install_linux
- Setting up a distributed TensorFlow cluster, https://my.oschina.net/u/2306127/blog/1815193
- Distributed TensorFlow (in Chinese), https://my.oschina.net/u/2306127/blog/1815208
- TensorBoard graph visualization, https://my.oschina.net/repine/blog/918576
- TensorFlow on Kubernetes: architecture in practice, https://my.oschina.net/jxcdwangtao/blog/1612667
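This article introduces several ways to run TensorFlow in an isolated environment: Anaconda, Docker, Jupyter, Kubernetes Pod, Kubeflow, and Spark DL on MLlib. These are easier to manage and to scale out to a cluster.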
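1. Anaconda
Running TensorFlow's Python code through Anaconda gives you a relatively isolated Python environment and avoids version conflicts with other Python work. VirtualEnv can do a similar job, but Anaconda ships with its own package manager, which makes it more convenient.
Download Anaconda (the script only fetches the installer; run the downloaded .sh file with bash to install it):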
echo ""
echo "====================================================================="
echo "Downloading Anaconda3 5.1.0 to ~/openthings..."
echo ""
mkdir -p ~/openthings && cd ~/openthings
wget -c https://repo.anaconda.com/archive/Anaconda3-5.1.0-Linux-x86_64.sh
echo "====================================================================="
echo "Finished. Latest version at https://repo.continuum.io/archive/"
echo "More practice at https://my.oschina.net/u/2306127/blog"
echo "====================================================================="
echo ""
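Download Miniconda (again, the script only fetches the installer; run the downloaded .sh file with bash to install it):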
echo ""
echo "====================================================================="
echo "Downloading Miniconda3 4.5.1 to ~/openthings..."
echo ""
mkdir -p ~/openthings && cd ~/openthings
wget -c https://repo.continuum.io/miniconda/Miniconda3-4.5.1-Linux-x86_64.sh
echo "====================================================================="
echo "Finished. Latest version at https://repo.continuum.io/miniconda/"
echo "More practice at https://my.oschina.net/u/2306127/blog"
echo "====================================================================="
echo ""
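Both download scripts above stop once the installer is on disk. As a minimal sketch of the remaining step, the installer can be run without prompts. The helper name `install_conda` and the default prefix `$HOME/miniconda3` are assumptions for illustration, not part of the original scripts.

```shell
# Sketch: run a downloaded Anaconda/Miniconda installer non-interactively.
# install_conda is a hypothetical helper; the default prefix is an assumption.
install_conda() {
  local installer="$1"
  local prefix="${2:-$HOME/miniconda3}"

  # Fail early if the download step did not produce the file.
  if [ ! -f "$installer" ]; then
    echo "installer not found: $installer" >&2
    return 1
  fi

  # -b: batch mode (no prompts, license assumed accepted)
  # -p: installation prefix
  bash "$installer" -b -p "$prefix"
}

# Example call (commented out so the sketch has no side effects):
# install_conda ~/openthings/Miniconda3-4.5.1-Linux-x86_64.sh "$HOME/miniconda3"
# export PATH="$HOME/miniconda3/bin:$PATH"
```

In batch mode the installer does not edit your shell startup files, so you would likely need to put the `bin` directory on `PATH` yourself, as in the commented example.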
Install TensorFlow in an Anaconda environment:
echo ""
echo "================================================================="
echo "Create conda env for Tensorflow, python 3.6.5 ..."
conda create --yes -n tensorflow pip python=3.6.5
source activate tensorflow
echo ""
echo ""
echo "================================================================="
echo "Install Tensorflow 1.8.0 with GPU support."
pip install --ignore-installed --upgrade \
https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.8.0-cp36-cp36m-linux_x86_64.whl
echo "================================================================="
echo "More practice at https://my.oschina.net/u/2306127/blog"
echo "More info: visit https://www.tensorflow.org/install/install_linux"
echo "================================================================="
echo ""
Then run:
source activate tensorflow
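This switches the active Python environment, after which TensorFlow code can be run with Python inside the Anaconda environment. You can also install components such as Jupyter Notebook and then write and run TensorFlow jobs in the browser.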
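After activating the environment it is worth confirming that TensorFlow really is importable from it. The sketch below does that with a small helper; `check_python_module` is a hypothetical name, and it assumes the `python` on `PATH` belongs to the activated environment.

```shell
# Sketch: check that a Python module can be imported in the current environment.
# check_python_module is a hypothetical helper, not part of conda or TensorFlow.
check_python_module() {
  local module="$1"
  local python_bin="${2:-python}"

  # Prints "<module> <version>" and returns 0 on success.
  # Returns non-zero if the import fails or the interpreter is missing.
  "$python_bin" -c "import importlib, sys
m = importlib.import_module(sys.argv[1])
print(sys.argv[1], getattr(m, '__version__', 'unknown'))" "$module" 2>/dev/null
}

# Example (commented out so the sketch has no side effects):
# source activate tensorflow
# check_python_module tensorflow || echo "TensorFlow is not importable here"
```

A non-zero exit status here usually means the wrong environment is active or the pip install step did not complete.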
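2. Docker
Running TensorFlow in a Docker container makes installation and deployment easier, keeps the host environment clean, and lets you try out several versions quickly.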
echo ""
echo "================================================================="
echo "Running Tensorflow in Docker with bash shell."
echo "More practice at https://my.oschina.net/u/2306127/blog"
echo "Please Visit https://www.tensorflow.org/install/install_linux "
echo "================================================================="
echo ""
nvidia-docker run -it tensorflow/tensorflow:latest-gpu bash
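To try several versions side by side, it helps to pin the image tag and mount a host directory so that your code survives the container. The following is a sketch only: `run_tf_container`, the `DRY_RUN` switch, and the `/workspace` mount point are assumptions introduced here, while the Docker flags themselves (`--rm`, `-v`, `-w`) are standard.

```shell
# Sketch: start a TensorFlow container with a pinned tag and a mounted work dir.
# run_tf_container, DRY_RUN and /workspace are hypothetical names for illustration.
run_tf_container() {
  local tag="${1:-1.8.0-gpu}"
  local workdir="${2:-$PWD}"

  # --rm removes the container on exit; -v mounts the host directory;
  # -w makes the mounted directory the working directory inside the container.
  set -- nvidia-docker run -it --rm \
    -v "$workdir:/workspace" -w /workspace \
    "tensorflow/tensorflow:$tag" bash

  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"   # only print the command
  else
    "$@"        # actually run it
  fi
}

# Example (commented out so the sketch has no side effects):
# DRY_RUN=1 run_tf_container 1.8.0-gpu "$HOME/projects/demo"
```

Printing the command first with `DRY_RUN=1` is a cheap way to check the mount path and tag before a large image is pulled.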
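3. Jupyter
Jupyter can also be run inside Docker. The command below starts the tensorflow/tensorflow image without a shell command, in which case the image (as of the TensorFlow 1.8 release used in this article) launches a Jupyter Notebook server on port 8888.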
echo ""
echo "================================================================="
echo "Running Tensorflow in Docker with jupyter notebook."
echo "More practice at https://my.oschina.net/u/2306127/blog"
echo "Please Visit https://www.tensorflow.org/install/install_linux "
echo "================================================================="
echo ""
nvidia-docker run -it -p 8888:8888 tensorflow/tensorflow:latest-gpu
Then open the address printed in the container output (http://localhost:8888, including its login token) in a browser and start working.
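If the container runs in the background, the login token has to be recovered from its logs. Here is a minimal sketch, assuming the notebook server prints a line containing `token=<value>`; the helper name `jupyter_url_from_logs` and the container name `tf-jupyter` are hypothetical.

```shell
# Sketch: build the notebook URL from log text read on stdin.
# jupyter_url_from_logs is a hypothetical helper; it assumes the log
# contains "token=<alphanumeric value>" somewhere.
jupyter_url_from_logs() {
  local port="${1:-8888}"
  local token

  # Take the first token found in the log text.
  token="$(grep -o 'token=[0-9A-Za-z]*' | head -n 1 | cut -d= -f2)"
  [ -n "$token" ] || return 1

  echo "http://localhost:${port}/?token=${token}"
}

# Example (commented out so the sketch has no side effects):
# nvidia-docker run -d --name tf-jupyter -p 8888:8888 tensorflow/tensorflow:latest-gpu
# docker logs tf-jupyter 2>&1 | jupyter_url_from_logs 8888
```

The `2>&1` matters because the notebook server typically writes its startup messages to standard error.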
4. Kubernetes Pod
If you use Kubernetes/Minikube or OpenShift, TensorFlow can be deployed into the K8s cluster. See:
- Integrating TensorFlow services with Kubernetes, https://my.oschina.net/u/2306127/blog/1811348
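For a quick experiment, a single Pod is enough. The sketch below generates a minimal Pod manifest from the shell; it is not the setup described in the linked article. The helper name `tf_pod_manifest`, the Pod name, the CPU image tag `1.8.0`, and the exposed port are all assumptions.

```shell
# Sketch: print a minimal Pod manifest for a TensorFlow container.
# tf_pod_manifest is a hypothetical helper; name, image and port are assumptions.
tf_pod_manifest() {
  local name="${1:-tensorflow}"
  local image="${2:-tensorflow/tensorflow:1.8.0}"

  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${name}
  labels:
    app: ${name}
spec:
  containers:
  - name: ${name}
    image: ${image}
    ports:
    - containerPort: 8888
EOF
}

# Example (commented out so the sketch has no side effects):
# tf_pod_manifest tf-notebook | kubectl apply -f -
# kubectl port-forward pod/tf-notebook 8888:8888
```

Generating the manifest on stdout keeps it easy to inspect before it is piped to `kubectl apply -f -`. A real deployment would more likely use a Deployment or a Kubeflow resource than a bare Pod.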
5. Kubeflow
A better way to build a TensorFlow machine learning cluster on Kubernetes is Kubeflow. The project is still young and needs further refinement. For usage, see:
- Kubeflow user guide, https://my.oschina.net/u/2306127/blog/1808582
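6. Spark DL on MLlib
TensorFlow itself only handles the computation. Building a deep learning pipeline with Spark ML lets you take full advantage of Spark's distributed in-memory processing, data I/O, and interactive analysis, and the whole pipeline can then run under unified scheduling in a Kubernetes cluster.
- Deep learning pipelines on Apache Spark, https://my.oschina.net/u/2306127/blog/1811876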