Kubernetes and Kubeflow: One-Stop Setup

Welcome! If you run into any problems, open an issue in the source repository so we can learn and discuss together.

https://github.com/JK-97/my_note

1. Preparation

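There are many ways to set up k8s. Most tutorials that turn up on Google are based on AWS or GCP, while tutorials for building a local cluster are scattered and fragmentary.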

So let's start building!

Example environment

master 192.168.0.105

node 192.168.0.115

 

Tested versions:

  • kubernetes
 kubeadm, kubelet, and kubectl must all be the same version, v1.12.8

 

  • docker :
Client:

Version: 17.12.1-ce
API version: 1.35
Go version:  go1.9.4
Git commit:  7390fc6
Built:   Tue Feb 27 22:17:40 2018
 OS/Arch: linux/amd64

Server:
Engine:
Version:    17.12.1-ce
API version:    1.35 (minimum version 1.12)
Go version: go1.9.4
Git commit: 7390fc6
Built:  Tue Feb 27 22:16:13 2018
OS/Arch:    linux/amd64
Experimental:   false
  • nvidia-docker
NVIDIA Docker: 2.0.3
Client:
Version: 17.12.1-ce
API version: 1.35
Go version:  go1.9.4
Git commit:  7390fc6
Built:   Tue Feb 27 22:17:40 2018
OS/Arch: linux/amd64

Server:
Engine:
Version:    17.12.1-ce
API version:    1.35 (minimum version 1.12)
Go version: go1.9.4
Git commit: 7390fc6
Built:  Tue Feb 27 22:16:13 2018
OS/Arch:    linux/amd64
Experimental:   false

Commands for checking these versions

$ kubeadm version
$ docker version
$ nvidia-docker version
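
Since kubeadm, kubelet, and kubectl must all report the same version, the check can be scripted. The following is a minimal sketch, not part of the original tutorial: the helper names extract_version and same_version are hypothetical, and it assumes each tool prints a version token of the form vMAJOR.MINOR.PATCH somewhere in its output.

```shell
# extract_version: print the first vMAJOR.MINOR.PATCH token found on stdin
extract_version() {
    grep -Eo 'v[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# same_version V1 V2 V3: succeed only if all three version strings are identical
same_version() {
    [ "$1" = "$2" ] && [ "$2" = "$3" ]
}

# Example usage on a machine where the three tools are installed:
#   same_version "$(kubeadm version | extract_version)" \
#                "$(kubelet --version | extract_version)" \
#                "$(kubectl version --client | extract_version)" \
#     && echo "versions match" || echo "version mismatch"
```

Keeping the parsing in a separate helper makes it possible to try the comparison with plain strings before running it against the real binaries.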

 

Step 1: Install Docker

# Install the latest version
$ sudo apt-get update

$ sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
gnupg-agent \
software-properties-common

$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

$ sudo apt-key fingerprint 0EBFCD88

$ sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
# The commands above add the Docker apt repository

$ sudo apt-get update


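$ sudo apt-get install docker-ce docker-ce-cli containerd.io
# This installs the latest version. This tutorial needs a specific version, so skip this command
# The version used in this tutorial is 17.12.1~ce-0~ubuntu
# To install a specific version, continue from the second "sudo apt-get update" above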
$ apt-cache madison docker-ce
    docker-ce | 5:18.09.1~3-0~ubuntu-xenial | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
    docker-ce | 5:18.09.0~3-0~ubuntu-xenial | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
    docker-ce | 18.06.1~ce~3-0~ubuntu | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
    docker-ce | 18.06.0~ce~3-0~ubuntu | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
    ·····
# The command above lists the available versions

$ sudo apt-get install docker-ce=<VERSION_STRING> containerd.io
# e.g. sudo apt-get install docker-ce=17.12.1~ce-0~ubuntu containerd.io
# This completes the Docker installation
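
Picking the version string out of the apt-cache madison listing by hand is error-prone, so it can be filtered instead. This is a rough sketch under stated assumptions: pick_version is a hypothetical helper (not from the original tutorial), and it assumes the madison output keeps the "package | version | source" layout shown above.

```shell
# pick_version PREFIX: read `apt-cache madison` output on stdin and print
# the first version string (second column) that starts with PREFIX
pick_version() {
    awk -F'|' -v p="$1" '{
        gsub(/^[ \t]+|[ \t]+$/, "", $2)      # trim spaces around the version column
        if (index($2, p) == 1) { print $2; exit }
    }'
}

# Example usage (requires the Docker apt repository to be configured):
#   version=$(apt-cache madison docker-ce | pick_version 17.12.1)
#   sudo apt-get install "docker-ce=$version" containerd.io
```

As an optional extra that the original does not mention, "sudo apt-mark hold docker-ce" asks apt not to upgrade the pinned package automatically.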

Step 2: Install nvidia-docker

$ docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f

$ sudo apt-get purge -y nvidia-docker
# The two commands above remove the old nvidia-docker; skip them if it was never installed

$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -

$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)

$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    deb https://nvidia.github.io/libnvidia-container/ubuntu18.04/$(ARCH) /
    deb https://nvidia.github.io/nvidia-container-runtime/ubuntu18.04/$(ARCH) /
    deb https://nvidia.github.io/nvidia-docker/ubuntu18.04/$(ARCH) /
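
The $distribution value used in the repository URL is simply the ID and VERSION_ID fields of /etc/os-release joined together. A small sketch of that step, using a hypothetical helper name distribution_of and assuming the file follows the usual KEY=value os-release format:

```shell
# distribution_of FILE: print ID immediately followed by VERSION_ID
# from an os-release style file. The function body runs in a subshell
# so the sourced variables do not leak into the caller.
distribution_of() (
    . "$1"
    printf '%s%s\n' "$ID" "$VERSION_ID"
)

# Example usage:
#   distribution_of /etc/os-release
```

On a host whose os-release contains ID=ubuntu and VERSION_ID="18.04", this yields the ubuntu18.04 path component that appears in the repository lines above.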


$ sudo apt-get update

$ sudo apt-get install -y nvidia-docker2
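# Installing directly like this gets the latest version and also upgrades Docker to the latest version automatically. That is usually not what we want, so it does not apply here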

 

# Install a specific version
$ apt-cache madison nvidia-docker2
    nvidia-docker2 | 2.0.3+docker18.09.5-3 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64 Packages
    nvidia-docker2 | 2.0.3+docker18.09.5-2 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64 Packages
    nvidia-docker2 | 2.0.3+docker18.09.4-1 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64 Packages
    nvidia-docker2 | 2.0.3+docker18.09.3-1 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64 Packages
    nvidia-docker2 | 2.0.3+docker18.09.2-1 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64 Packages
    nvidia-docker2 | 2.0.3+docker18.09.1-1 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64 Packages
    nvidia-docker2 | 2.0.3+docker18.09.0-1 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64 Packages
    nvidia-docker2 | 2.0.3+docker18.06.2-2 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64 Packages
    nvidia-docker2 | 2.0.3+docker18.06.2-1 | https://nvidia.github.io/

····

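# Even with the version number in hand, installing nvidia-docker2 on its own does not work
# apt reports a dependency on the latest nvidia-container-runtime, which is not actually needed
# So nvidia-container-runtime must be installed in the same command, also pinned to a specific version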

$ apt-cache madison nvidia-container-runtime
    nvidia-container-runtime | 2.0.0+docker18.09.5-3 | https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/amd64 Packages
    nvidia-container-runtime | 2.0.0+docker18.09.5-1 | https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/amd64 Packages
    nvidia-container-runtime | 2.0.0+docker18.09.4-1 | https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/amd64 Packages
    nvidia-container-runtime | 2.0.0+docker18.09.3-1 | https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/amd64 Packages
    nvidia-container-runtime | 2.0.0+docker18.09.2-1 | https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/amd64 Packages
    nvidia-container-runtime | 2.0.0+docker18.09.1-1 | https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/amd64 Packages

    ····
# Each version string shows the Docker version it corresponds to

$ sudo apt-get install -y nvidia-docker2=2.0.3+docker17.12.1-1 nvidia-container-runtime=2.0.0+docker17.12.1-1
# These are the matching versions finally chosen
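
Matching package versions to the installed Docker release can also be done by filtering the madison listings. The sketch below is an illustration only: docker_base_version and matching_versions are hypothetical helpers, and it assumes the "+docker<version>-<revision>" naming seen in the listings above.

```shell
# docker_base_version VERSION: strip the "-ce" edition suffix,
# e.g. 17.12.1-ce becomes 17.12.1
docker_base_version() {
    printf '%s\n' "${1%-ce}"
}

# matching_versions DOCKER_VERSION: read `apt-cache madison` output on stdin
# and print only the version strings built for that Docker release
matching_versions() {
    awk -F'|' -v tag="+docker$(docker_base_version "$1")-" '
        index($2, tag) > 0 { gsub(/ /, "", $2); print $2 }'
}

# Example usage (requires Docker and the nvidia-docker apt repository):
#   dv=$(docker version --format '{{.Server.Version}}')
#   apt-cache madison nvidia-docker2 | matching_versions "$dv"
#   apt-cache madison nvidia-container-runtime | matching_versions "$dv"
```

If several revisions are printed for the same Docker release, one of them still has to be chosen by hand.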

 

# To uninstall Docker (only if you need to start over)
$ sudo apt autoremove docker-ce containerd.io

 

 

Step 3: Configure the GPU

 


# The Docker daemon configuration needs to be modified
$ sudo vim /etc/docker/daemon.json
# Write the following content
{
    "registry-mirrors": ["https://registry.docker-cn.com"],
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}


$ sudo pkill -SIGHUP dockerd
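
Before testing with a container, it can help to confirm that the configuration file really sets nvidia as the default runtime. This is a minimal sketch with a hypothetical helper name, has_nvidia_default; it only looks for the key with grep and does not validate the JSON as a whole.

```shell
# has_nvidia_default FILE: succeed if FILE sets "default-runtime" to "nvidia"
has_nvidia_default() {
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$1"
}

# Example usage on the Docker host:
#   has_nvidia_default /etc/docker/daemon.json \
#     || echo "default-runtime is not set to nvidia"
#
# After the daemon has reloaded, the active setting can be inspected with:
#   docker info | grep -i 'default runtime'
```

If docker info still reports a different default runtime after the reload signal, restarting the Docker service is a reasonable next step.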


$ docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.39       Driver Version: 418.39       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile