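- Prepare three virtual machines with the following network settings:

  | IP address | Hostname | Role |
  |---|---|---|
  | 172.18.74.26 | manager | manager node |
  | 172.18.74.29 | g160402 | worker |
  | 172.18.74.25 | u180402 | worker |

  Set each machine's hostname as listed above, and add name-resolution entries for the other two nodes to /etc/hosts.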
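As a minimal sketch of the /etc/hosts step, the helper below appends whichever of the three entries a hosts file is still missing. The function name add_swarm_hosts is hypothetical. It writes all three entries (a node's own entry is harmless), and the target file is passed as an argument so the sketch can be tried on a scratch copy before it is run with root privileges against /etc/hosts.

```shell
#!/bin/sh
# Hypothetical helper: append the swarm node entries to a hosts file,
# skipping hostnames that are already present in it.
add_swarm_hosts() {
    hosts_file=$1
    while read -r ip name; do
        # -w matches the hostname as a whole word
        if ! grep -qw "$name" "$hosts_file" 2>/dev/null; then
            printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
        fi
    done <<EOF
172.18.74.26 manager
172.18.74.29 g160402
172.18.74.25 u180402
EOF
}
```

Running add_swarm_hosts twice against the same file leaves three entries rather than six, so the step is safe to repeat.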
- Configure the Docker daemon on every node to listen on 0.0.0.0:2375, using either of the two options below
- Option 1

      # Edit the ExecStart line in the [Service] section as follows
      example@manager:~$ sudo vi /lib/systemd/system/docker.service
      ExecStart=/usr/bin/dockerd -H 0.0.0.0:2375 -H unix:///var/run/docker.sock
      example@manager:~$ sudo systemctl daemon-reload
      example@manager:~$ sudo systemctl restart docker
- Option 2

      example@u180402:~$ cat /etc/docker/daemon.json
      {
        "registry-mirrors": [
          "https://reg-mirror.qiniu.com",
          "https://hub-mirror.c.163.com",
          "https://registry.aliyuncs.com"
        ],
        "hosts": ["unix:///var/run/docker.sock", "tcp://0.0.0.0:2375"]
      }
      # Remove the -H flags from ExecStart: hosts set both on the command line
      # and in daemon.json conflict, and the daemon refuses to start
      example@manager:~$ sudo vi /lib/systemd/system/docker.service
      ExecStart=/usr/bin/dockerd
      example@manager:~$ sudo systemctl daemon-reload
      example@manager:~$ sudo systemctl restart docker
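Either configuration can be checked from another machine by pointing the Docker client at the TCP socket. The wrapper below is only a sketch: the name remote_docker is hypothetical, and the port is the 2375 chosen above. Port 2375 is plain, unauthenticated TCP, so this setup is best kept to a trusted lab network.

```shell
# Hypothetical wrapper: run a docker client command against a remote
# daemon listening on tcp://<host>:2375.
remote_docker() {
    host=$1
    shift
    docker -H "tcp://${host}:2375" "$@"
}

# Usage, e.g. from the manager node:
#   remote_docker 172.18.74.29 version
#   remote_docker u180402 info
```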
- Initialize the cluster
- Create the manager node

      example@manager:~$ docker swarm init --advertise-addr 172.18.74.26
      Swarm initialized: current node (w78pv2cxmucv2vca3v5r069wt) is now a manager.

      To add a worker to this swarm, run the following command:

          docker swarm join --token SWMTKN-1-1fffxrlpybn1oz0qsff9ywxuz7ef1o7v6c4qqf6kwvckt6bphi-6t9lfyat23n99do5y9mpdtdkg 192.168.1.154:2377

      To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
- Initializing the manager node creates two new networks, docker_gwbridge and ingress

      example@manager:~$ docker network ls
      NETWORK ID          NAME                DRIVER              SCOPE
      6b3877ce1c6f        bridge              bridge              local
      6f5af407c445        docker_gwbridge     bridge              local
      25066e8c0d9e        host                host                local
      p5dq2m8snezx        ingress             overlay             swarm
      b512147e5000        none                null                local
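  - bridge is the network Docker creates by default and is present on every Docker host. The Docker engine sets up its subnet and routing automatically, and docker run attaches new containers to this network unless another network is specified.
  - docker_gwbridge is created automatically when a node joins a swarm. It is a bridge that connects the swarm's overlay networks (including ingress) to the node's physical network, so that swarm nodes on different hosts can communicate.
  - An overlay (cross-host) network such as ingress is only available on the swarm nodes that need it for a service. When you create a service that uses an overlay network, the manager node automatically extends that network to the nodes running the service's tasks.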
- Join the worker nodes to the cluster

      # g160402
      example@g160402:~$ docker swarm join --token SWMTKN-1-1fffxrlpybn1oz0qsff9ywxuz7ef1o7v6c4qqf6kwvckt6bphi-6t9lfyat23n99do5y9mpdtdkg 172.18.74.26:2377
      This node joined a swarm as a worker.
      # u180402
      example@u180402:~$ docker swarm join --token SWMTKN-1-1fffxrlpybn1oz0qsff9ywxuz7ef1o7v6c4qqf6kwvckt6bphi-6t9lfyat23n99do5y9mpdtdkg 172.18.74.26:2377
      This node joined a swarm as a worker.
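If the join command printed by docker swarm init is no longer on screen, the token can be read back on the manager. The following is a sketch that assumes the manager address from the table at the top; worker_join_command is a hypothetical name.

```shell
# Hypothetical helper, run on the manager: print the command a worker
# should execute to join the swarm.
worker_join_command() {
    manager_addr=${1:-172.18.74.26:2377}
    # `docker swarm join-token -q worker` prints only the worker token
    token=$(docker swarm join-token -q worker) || return 1
    printf 'docker swarm join --token %s %s\n' "$token" "$manager_addr"
}
```

Running docker swarm join-token worker without -q prints the complete join command directly, which may be all that is needed.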
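- Check node status from the manager node

  The AVAILABILITY column takes one of three values:

  - Active: the scheduler can assign tasks to the node.
  - Pause: the scheduler does not assign new tasks to the node, but existing tasks keep running.
  - Drain: the scheduler does not assign new tasks to the node; existing tasks are stopped and rescheduled onto other nodes in the Active state.

  Node listing on the manager:

      example@manager:~$ docker node ls
      ID                            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
      vrfif1jr3v0gl29o8okhdlc4l     g160402    Ready    Active                          18.06.1-ce
      w78pv2cxmucv2vca3v5r069wt *   manager    Ready    Active         Leader           18.09.5
      7jjv186tvj8hscubg6me026vq     u180402    Ready    Active                          18.06.1-ce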
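A node's availability is changed with docker node update --availability. The sketch below wraps that call in a hypothetical set_availability helper that rejects anything other than the three values described above.

```shell
# Hypothetical helper, run on a manager: set a node's availability.
set_availability() {
    node=$1
    state=$2
    case $state in
        active|pause|drain)
            docker node update --availability "$state" "$node"
            ;;
        *)
            echo "availability must be active, pause or drain" >&2
            return 1
            ;;
    esac
}

# Usage:
#   set_availability g160402 drain    # move tasks off g160402
#   set_availability g160402 active   # make it schedulable again
```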
- Leave the cluster

      example@u180402:~$ docker swarm leave
      Node left the swarm.
      example@g160402:~$ docker swarm leave
      Node left the swarm.
      # Node list on the manager after both workers have left
      example@manager:~$ docker node ls
      ID                            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
      vrfif1jr3v0gl29o8okhdlc4l     g160402    Down     Active                          18.06.1-ce
      w78pv2cxmucv2vca3v5r069wt *   manager    Ready    Active         Leader           18.09.5
      7jjv186tvj8hscubg6me026vq     u180402    Down     Active                          18.06.1-ce
      # Force the manager to leave the cluster
      example@manager:~$ docker swarm leave --force
      Node left the swarm.
- Run services in the cluster
- Create an httpd service with two replicas

      example@manager:~$ docker service create --replicas 2 --name hello-swarm httpd:latest
      01voy53c0ygxb5w7ncocxwfvp
      overall progress: 2 out of 2 tasks
      1/2: running   [==================================================>]
      2/2: running   [==================================================>]
      verify: Service converged
      example@manager:~$ docker service ls
      ID             NAME          MODE         REPLICAS   IMAGE          PORTS
      01voy53c0ygx   hello-swarm   replicated   2/2        httpd:latest
      example@manager:~$ docker service ps hello-swarm
      ID             NAME            IMAGE          NODE      DESIRED STATE   CURRENT STATE                ERROR   PORTS
      qw0rfrbhgk5v   hello-swarm.1   httpd:latest   manager   Running         Running about a minute ago
      byhnp23chffg   hello-swarm.2   httpd:latest   g160402   Running         Running about a minute ago
      example@g160402:~$ docker ps
      CONTAINER ID   IMAGE          COMMAND              CREATED         STATUS         PORTS    NAMES
      f9f928c906e4   httpd:latest   "httpd-foreground"   2 minutes ago   Up 2 minutes   80/tcp   hello-swarm.2.byhnp23chffg59hbnpdndgp69
- Update the service configuration
  - Add a port mapping

        example@manager:~$ docker service update --publish-add 8080:80 hello-swarm
        hello-swarm
        overall progress: 2 out of 2 tasks
        1/2: running   [==================================================>]
        2/2: running   [==================================================>]
        verify: Service converged
        example@g160402:~$ docker ps
        CONTAINER ID   IMAGE          COMMAND              CREATED          STATUS          PORTS    NAMES
        8ed735b92841   httpd:latest   "httpd-foreground"   13 seconds ago   Up 11 seconds   80/tcp   hello-swarm.2.0v51ok3f424iaziisc51tfq00

  Opening port 8080 of any server in the cluster in a browser now shows httpd's "It works" page.
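The same check can be scripted rather than done in a browser. This is a sketch that assumes curl is installed and uses the node addresses from the table at the top; check_published_port is a hypothetical name.

```shell
# Hypothetical helper: request the published port on every node and
# report which ones answer. With the swarm routing mesh, every node is
# expected to respond, including nodes that run no hello-swarm task.
check_published_port() {
    port=${1:-8080}
    for ip in 172.18.74.26 172.18.74.29 172.18.74.25; do
        if curl -fsS --max-time 5 "http://${ip}:${port}/" >/dev/null 2>&1; then
            echo "$ip ok"
        else
            echo "$ip unreachable"
        fi
    done
}
```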
- Scale out the service

      example@manager:~$ docker service scale hello-swarm=4
      hello-swarm scaled to 4
      overall progress: 4 out of 4 tasks
      1/4: running   [==================================================>]
      2/4: running   [==================================================>]
      3/4: running   [==================================================>]
      4/4: running   [==================================================>]
      example@manager:~$ docker service ps hello-swarm
      ID             NAME                IMAGE          NODE      DESIRED STATE   CURRENT STATE            ERROR   PORTS
      3o6rzluek155   hello-swarm.1       httpd:latest   u180402   Running         Running 5 minutes ago
      qw0rfrbhgk5v    \_ hello-swarm.1   httpd:latest   manager   Shutdown        Shutdown 6 minutes ago
      0v51ok3f424i   hello-swarm.2       httpd:latest   g160402   Running         Running 6 minutes ago
      byhnp23chffg    \_ hello-swarm.2   httpd:latest   g160402   Shutdown        Shutdown 6 minutes ago
      faitccodd7vq   hello-swarm.3       httpd:latest   manager   Running         Running 27 seconds ago
      biqpebevezkj   hello-swarm.4       httpd:latest   manager   Running         Running 26 seconds ago

  The manager server now runs two httpd tasks, while u180402 and g160402 run one each.
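Counting tasks per node by eye gets tedious as the replica count grows. The sketch below tallies running tasks per node. The names tally_nodes and tasks_per_node are hypothetical, while --filter and --format are standard docker service ps options.

```shell
# Tally node names read from stdin (one name per line).
tally_nodes() {
    sort | uniq -c | awk '{ print $2 ": " $1 }'
}

# Hypothetical helper, run on a manager: running tasks per node for a service.
tasks_per_node() {
    docker service ps --filter desired-state=running --format '{{.Node}}' "$1" | tally_nodes
}

# Usage:
#   tasks_per_node hello-swarm
```

For the listing above, this should report two tasks on manager and one on each worker.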
- Add a bind mount to the service. When the page is refreshed several times, the requests are spread across the running containers

      example@manager:~$ docker service update --mount-add type=bind,source=/home/example/temp/,destination=/usr/local/apache2/htdocs/ hello-swarm
      hello-swarm
      overall progress: 2 out of 2 tasks
      1/2: running   [==================================================>]
      2/2: running   [==================================================>]
      verify: Service converged
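A bind mount resolves its source path on whichever node runs the task, so /home/example/temp/ has to exist on every node that may host a replica. One way to make the load balancing visible is to give each node an index page naming its host. The following is a sketch, with write_node_index as a hypothetical helper.

```shell
# Hypothetical helper, run on every node: create the bind-mount source
# directory and an index page that identifies the node serving it.
write_node_index() {
    dir=${1:-/home/example/temp}
    mkdir -p "$dir" || return 1
    # uname -n prints the node's hostname
    printf '<html><body><h1>Served from %s</h1></body></html>\n' "$(uname -n)" > "$dir/index.html"
}
```

With a page like this on each node, repeated refreshes of port 8080 show which host answered each request.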
- Restart the service without changing its configuration

      example@g160402:~$ docker service update --force hello-swarm
      hello-swarm
      overall progress: 2 out of 2 tasks
      1/2: running   [==================================================>]
      2/2: running   [==================================================>]
      verify: Service converged
      example@g160402:~$ docker service ps hello-swarm
      ID             NAME                IMAGE          NODE      DESIRED STATE   CURRENT STATE            ERROR                         PORTS
      x0j4ow0jozso   hello-swarm.1       httpd:latest   g160402   Running         Running 2 minutes ago
      b8g0xoo53w4a    \_ hello-swarm.1   httpd:latest   g160402   Shutdown        Shutdown 2 minutes ago
      q8l75pkn9r3x   hello-swarm.2       httpd:latest   g160402   Running         Running 2 minutes ago
      q28kvehhdcun    \_ hello-swarm.2   httpd:latest   g160402   Shutdown        Shutdown 2 minutes ago
      6nvq8ntrfs04    \_ hello-swarm.2   httpd:latest   g160402   Shutdown        Failed 20 minutes ago    "task: non-zero exit (137)"
      example@g160402:~$ docker ps
      CONTAINER ID   IMAGE          COMMAND              CREATED              STATUS              PORTS    NAMES
      36fd1a6c3b28   httpd:latest   "httpd-foreground"   About a minute ago   Up About a minute   80/tcp   hello-swarm.1.x0j4ow0jozsomdxnnw5vkcv6s
      6c4501017beb   httpd:latest   "httpd-foreground"   About a minute ago   Up About a minute   80/tcp   hello-swarm.2.q8l75pkn9r3xy33g28llgzui5
- Remove the service

      example@manager:~$ docker service rm hello-swarm
      hello-swarm
- Run a service on specific nodes
- Add labels to the nodes
- Add and remove labels from the command line

      example@manager:~$ docker node update --label-add role=manager manager
      manager
      example@manager:~$ docker node update --label-add role=worker1 g160402
      g160402
      example@manager:~$ docker node update --label-add role=worker2 u180402
      example@manager:~$ docker node inspect g160402
      ......
      "Spec": {
          "Labels": {
              "role": "worker1"
          },
      ......
      # Remove a node label
      example@manager:~$ docker node update --label-rm role g160402
      g160402
- Add labels through the Docker daemon options

      example@manager:~$ sudo vi /lib/systemd/system/docker.service
      ExecStart=/usr/bin/dockerd -H 0.0.0.0:2375 -H unix:///var/run/docker.sock --label hostname=manage
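Labels set with dockerd --label are engine labels, so a placement constraint refers to them as engine.labels.<key> rather than node.labels.<key>. The sketch below uses the hostname=manage label from above. The helper name create_on_engine_label and the service name hello-engine-label are hypothetical.

```shell
# Hypothetical helper: create a service restricted to nodes whose
# engine label <key> equals <value>.
create_on_engine_label() {
    key=$1
    value=$2
    docker service create \
        --replicas 1 \
        --constraint "engine.labels.${key} == ${value}" \
        --name hello-engine-label \
        httpd:latest
}

# Usage:
#   create_on_engine_label hostname manage
```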
- Pin the service to the labeled node

      example@manager:~$ docker service create --replicas 2 --constraint 'node.labels.role == worker1' --name hello-swarm httpd:latest
      rfz6aocpi9bh4foq4wzw1bl3x
      overall progress: 2 out of 2 tasks
      1/2: running   [==================================================>]
      2/2: running   [==================================================>]
      verify: Service converged
      example@g160402:~$ docker ps
      CONTAINER ID   IMAGE          COMMAND              CREATED         STATUS         PORTS    NAMES
      b098a29fc83b   httpd:latest   "httpd-foreground"   7 seconds ago   Up 6 seconds   80/tcp   hello-swarm.1.b8g0xoo53w4adyvf9mdl1hozd
      d2cfd7a650c3   httpd:latest   "httpd-foreground"   7 seconds ago   Up 6 seconds   80/tcp   hello-swarm.2.6nvq8ntrfs04i1mx0wiy5f92h
- When a container exits abnormally or is removed, the manager node starts a replacement task and records the failed one in the task history

      example@manager:~$ docker service ps hello-swarm
      ID             NAME            IMAGE          NODE      DESIRED STATE   CURRENT STATE           ERROR   PORTS
      b8g0xoo53w4a   hello-swarm.1   httpd:latest   g160402   Running         Running 4 minutes ago
      6nvq8ntrfs04   hello-swarm.2   httpd:latest   g160402   Running         Running 4 minutes ago
      example@g160402:~$ docker ps
      CONTAINER ID   IMAGE          COMMAND              CREATED         STATUS         PORTS    NAMES
      b098a29fc83b   httpd:latest   "httpd-foreground"   3 minutes ago   Up 3 minutes   80/tcp   hello-swarm.1.b8g0xoo53w4adyvf9mdl1hozd
      d2cfd7a650c3   httpd:latest   "httpd-foreground"   3 minutes ago   Up 3 minutes   80/tcp   hello-swarm.2.6nvq8ntrfs04i1mx0wiy5f92h
      # Remove one container by force
      example@g160402:~$ docker rm -f d2cfd7a650c3
      d2cfd7a650c3
      example@g160402:~$ docker ps
      CONTAINER ID   IMAGE          COMMAND              CREATED         STATUS         PORTS    NAMES
      b098a29fc83b   httpd:latest   "httpd-foreground"   4 minutes ago   Up 4 minutes   80/tcp   hello-swarm.1.b8g0xoo53w4adyvf9mdl1hozd
      example@manager:~$ docker service ps hello-swarm
      ID             NAME                IMAGE          NODE      DESIRED STATE   CURRENT STATE            ERROR                         PORTS
      b8g0xoo53w4a   hello-swarm.1       httpd:latest   g160402   Running         Running 5 minutes ago
      q28kvehhdcun   hello-swarm.2       httpd:latest   g160402   Running         Running 7 seconds ago
      6nvq8ntrfs04    \_ hello-swarm.2   httpd:latest   g160402   Shutdown        Failed 13 seconds ago    "task: non-zero exit (137)"
      example@g160402:~$ docker ps
      CONTAINER ID   IMAGE          COMMAND              CREATED          STATUS          PORTS    NAMES
      d35b717ddc46   httpd:latest   "httpd-foreground"   19 seconds ago   Up 14 seconds   80/tcp   hello-swarm.2.q28kvehhdcunpi3h5e4a12679
      b098a29fc83b   httpd:latest   "httpd-foreground"   5 minutes ago    Up 5 minutes    80/tcp   hello-swarm.1.b8g0xoo53w4adyvf9mdl1hozd
- Promoting and demoting nodes
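- Meaning of the "MANAGER STATUS" column:

  - Leader: the primary manager node, which makes all swarm management and orchestration decisions for the cluster.
  - Reachable: if the Leader node becomes unavailable, this node is eligible to be elected as the new Leader.
  - Unavailable (shown as Unreachable in the docker node ls output): the node cannot communicate with any other manager node. In this case, add a new manager node to the cluster or promote a worker node to manager.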
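Swarm managers coordinate through Raft, so the cluster can only elect a Leader while a majority of managers is reachable. The arithmetic behind that can be sketched as follows; both function names are hypothetical.

```shell
# Majority (quorum) needed among N managers: floor(N/2) + 1
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Manager failures that can be tolerated: floor((N-1)/2)
fault_tolerance() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 1 3 5; do
    echo "managers=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

By this arithmetic, the three-manager setup produced by the promotion step can survive the loss of one manager, which is why an odd number of managers is the usual choice.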
- Promote the g160402 node (the command below promotes u180402 as well). A promoted node can run the commands available to manager nodes, and its "MANAGER STATUS" becomes "Reachable"

      example@manager:~$ docker node promote g160402 u180402
      Node g160402 promoted to a manager in the swarm.
      Node u180402 promoted to a manager in the swarm.
      example@g160402:~$ docker node ls
      ID                            HOSTNAME   STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
      kl6siwciwca88y6sp8mhku38p *   g160402    Ready     Active         Reachable        18.06.1-ce
      uyoiijq9vtdi9f6tvkr4wuqh9     manager    Ready     Active         Leader           18.09.5
      ffm3ttsc31l4tiwa4lyu7vol4     u180402    Ready     Active                          18.06.1-ce
      # Later listing: manager is shown as Unreachable and g160402 is now the Leader
      example@g160402:~$ docker node ls
      ID                            HOSTNAME   STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
      kl6siwciwca88y6sp8mhku38p *   g160402    Ready     Active         Leader           18.06.1-ce
      uyoiijq9vtdi9f6tvkr4wuqh9     manager    Unknown   Active         Unreachable      18.09.5
      ffm3ttsc31l4tiwa4lyu7vol4     u180402    Ready     Active         Reachable        18.06.1-ce
- Demote a node

      example@manager:~$ docker node demote g160402
      Manager g160402 demoted in the swarm.
      example@manager:~$ docker node ls
      ID                            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
      kl6siwciwca88y6sp8mhku38p     g160402    Ready    Active                          18.06.1-ce
      uyoiijq9vtdi9f6tvkr4wuqh9 *   manager    Ready    Active         Leader           18.09.5
      ffm3ttsc31l4tiwa4lyu7vol4     u180402    Ready    Active                          18.06.1-ce
- Docker stack
- Command usage

  | Subcommand | Description |
  |---|---|
  | deploy | Create a new stack or update an existing one |
  | ls | List the existing stacks |
  | ps | List the tasks in a stack |
  | rm | Remove one or more stacks |
  | services | List the services in a stack |

- Start a service

      example@manager:/data/@stack/giot$ pwd
      /data/@stack/giot
      example@manager:/data/@stack/giot$ ls
      docker-compose.yml
      # Create a custom network
      example@manager:~/docker$ docker network create --driver overlay giot_network
      7sfjbimchcmhh1336v075y4d9
      example@manager:/data/@stack/giot$ cat docker-compose.yml
      version: "3"
      services:
        nginx:
          image: nginx:1.15.8-alpine
          deploy:
            replicas: 2
            resources:
              limits:
                cpus: "0.1"
                memory: 50M
            placement:
              constraints:
                - node.labels.role == worker1
            restart_policy:
              condition: on-failure
          ports:
            - 80:80/tcp
          volumes:
            - /data/containers/nginx/etc/nginx/nginx.conf:/etc/nginx/nginx.conf:ro
            - /data/containers/nginx/etc/nginx/conf.d:/etc/nginx/conf.d
            - /dev/log:/dev/log
            - /var/log/nginx:/var/log/nginx
            - /data:/data
            - /etc/localtime:/etc/localtime:ro
          networks:
            - giot_network
      networks:
        giot_network:
          external: true
      example@manager:/data/@stack/giot$ docker stack deploy -c docker-compose.yml giot
      Creating network giot_default
      Creating service giot_nginx
      example@g160402:/data/containers/nginx$ docker ps
      CONTAINER ID   IMAGE                 COMMAND                  CREATED         STATUS         PORTS    NAMES
      b7b8f7d57a24   nginx:1.15.8-alpine   "nginx -g 'daemon of…"   9 seconds ago   Up 7 seconds   80/tcp   test_nginx.1.x9262cydwiwr6au792z3m39xg
      be5b8aae70ee   nginx:1.15.8-alpine   "nginx -g 'daemon of…"   9 seconds ago   Up 7 seconds   80/tcp   test_nginx.2.uf5s1xi537h6k6qkea5wunu3m
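Once deployed, the stack can be examined and removed with the remaining docker stack subcommands. The following is a sketch that assumes the stack name giot used above; stack_report is a hypothetical helper.

```shell
# Hypothetical helper, run on a manager: summarize a stack.
stack_report() {
    stack=${1:-giot}
    docker stack ls                 # all stacks
    docker stack services "$stack"  # services in the stack
    docker stack ps "$stack"        # tasks in the stack
}

# To tear the stack down afterwards:
#   docker stack rm giot
```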
- Appendix 1: docker service parameter list
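| Short | Option | Type | Description | Default |
|---|---|---|---|---|
| | --config | config | Configs to expose to the service | |
| | --constraint | list | Placement constraints | |
| | --container-label | list | Container labels | |
| | --credential-spec | credential-spec | Credential spec for a managed service account (Windows only) | |
| -d | --detach | | Exit immediately instead of waiting for the service to converge | |
| | --dns | list | Set custom DNS servers | |
| | --dns-option | list | Set DNS options | |
| | --dns-search | list | Set custom DNS search domains | |
| | --endpoint-mode | string | Endpoint mode (vip or dnsrr) | vip |
| | --entrypoint | command | Override the image's default ENTRYPOINT | |
| -e | --env | list | Set environment variables | |
| | --env-file | list | Read environment variables from a file | |
| | --generic-resource | list | User-defined resources | |
| | --group | list | Set one or more supplementary user groups for the container | |
| | --health-cmd | string | Command to run to check health | |
| | --health-interval | duration | Time between health checks (ms/s/m/h) | |
| | --health-retries | int | Consecutive failures needed to report unhealthy | |
| | --health-start-period | duration | Time allowed for the container to initialize before failed checks count towards the retries (ms/s/m/h) | |
| | --health-timeout | duration | Maximum time allowed for one check (ms/s/m/h) | |
| | --host | list | Set one or more host-to-IP mappings (host:ip) | |
| | --hostname | string | Container hostname | |
| | --isolation | string | Service container isolation mode | |
| -l | --label | list | Service labels | |
| | --limit-cpu | decimal | Limit CPUs | |
| | --limit-memory | bytes | Limit memory | |
| | --log-driver | string | Logging driver for the service | |
| | --log-opt | list | Logging driver options | |
| | --mode | string | Service mode (replicated or global) | replicated |
| | --mount | mount | Attach a filesystem mount to the service | |
| | --name | string | Service name | |
| | --network | network | Network attachments | |
| | --no-healthcheck | | Disable any container-specified health check | |
| | --no-resolve-image | | Do not query the registry to resolve the image digest and supported platforms | |
| | --placement-pref | pref | Add a placement preference | |
| -p | --publish | port | Publish a port as a node port | |
| -q | --quiet | | Suppress progress output | |
| | --read-only | | Mount the container's root filesystem as read-only | |
| | --replicas | uint | Number of tasks (container replicas) | 1 |
| | --reserve-cpu | decimal | Reserve CPUs | |
| | --reserve-memory | bytes | Reserve memory | |
| | --restart-condition | string | Restart condition ("none", "on-failure", "any") | any |
| | --restart-delay | duration | Delay between restart attempts (ns/us/ms/s/m/h) | 5s |
| | --restart-max-attempts | uint | Maximum number of restarts before giving up | |
| | --restart-window | duration | Window used to evaluate the restart policy (ns/us/ms/s/m/h) | |
| | --rollback-delay | duration | Delay between task rollbacks (ns/us/ms/s/m/h) | 0s |
| | --rollback-failure-action | string | Action on rollback failure ("pause", "continue") | pause |
| | --rollback-max-failure-ratio | float | Failure rate to tolerate during a rollback | 0 |
| | --rollback-monitor | duration | Duration after each task rollback to monitor for failure (ns/us/ms/s/m/h) | 5s |
| | --rollback-order | string | Rollback order ("start-first", "stop-first") | stop-first |
| | --rollback-parallelism | uint | Maximum number of tasks rolled back simultaneously (0 rolls back all at once) | 1 |
| | --secret | secret | Secrets to expose to the service | |
| | --stop-grace-period | duration | Time to wait before force-killing a container (ns/us/ms/s/m/h) | 10s |
| | --stop-signal | string | Signal used to stop the container | |
| -t | --tty | | Allocate a pseudo-TTY | |
| | --update-delay | duration | Delay between updates (ns/us/ms/s/m/h) | 0s |
| | --update-failure-action | string | Action on update failure ("pause", "continue", "rollback") | pause |
| | --update-max-failure-ratio | float | Failure rate to tolerate during an update | 0 |
| | --update-monitor | duration | Duration after each task update to monitor for failure (ns/us/ms/s/m/h) | 5s |
| | --update-order | string | Update order ("start-first", "stop-first") | stop-first |
| | --update-parallelism | uint | Maximum number of tasks updated simultaneously (0 updates all at once) | 1 |
| -u | --user | string | Username or UID (format: <name/uid>[:<group/gid>]) | |
| | --with-registry-auth | | Send registry authentication details to swarm agents | |
| -w | --workdir | string | Working directory inside the container | |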
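To show how several of these options combine, the sketch below creates an httpd service with resource limits, a restart policy, and a rolling-update policy. The helper name create_tuned_service, the service name, and the specific values are assumptions chosen for illustration.

```shell
# Hypothetical helper: create a service using a handful of the options above.
create_tuned_service() {
    name=${1:-hello-tuned}
    docker service create \
        --name "$name" \
        --replicas 2 \
        --publish 8080:80 \
        --limit-cpu 0.1 \
        --limit-memory 50M \
        --restart-condition on-failure \
        --update-parallelism 1 \
        --update-delay 10s \
        --update-failure-action rollback \
        httpd:latest
}
```

With these settings, an update replaces one task at a time, waits 10 seconds between tasks, and rolls back if the update fails.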