Pitfalls in Practice: Building a Kubernetes v1.10 Cluster

This article walks through building a Kubernetes cluster on three Ubuntu 16.04 machines, covering environment initialization, keepalived configuration, etcd cluster setup, Docker installation, Kubernetes initialization, and joining nodes to the cluster.


Preparation

  1. Prepare three machines running Ubuntu 16.04.
  2. Plan the addresses; the three machines are on the same subnet.

Hostname   IP                Note
node01     192.168.175.96    master and etcd
node02     192.168.175.101   master and etcd
node03     192.168.175.57    master and etcd
VIP        192.168.175.120   not a host, just a virtual IP

  3. Prepare the matching software versions.
    docker 18.06.1-ce  the container runtime
    kubelet v1.10.3  every Node in a Kubernetes cluster runs the kubelet process, which carries out the tasks the Master hands down to that node and manages the Pods and containers on it. The kubelet registers the node with the API Server, periodically reports the node's resource usage to the Master, and monitors containers and node resources through cAdvisor. Think of the kubelet as the agent in a server-agent architecture: the Pod housekeeper on each Node.
    kubectl v1.10.3  the kubectl command is the most direct way to interact with the cluster.
    kubeadm v1.10.3  kubeadm, added in Kubernetes 1.4, bootstraps a Kubernetes cluster quickly; two commands are enough to stand one up.
    etcd v3.3.5  a key-value store for shared configuration and service discovery.
    Note: this setup was done in March 2019. As these projects evolve, the steps below may fail on newer versions; if you hit version-related errors, please consult the relevant projects' official documentation.

1. Environment initialization

1.1 Set the hostnames (run the matching command on each machine)
hostnamectl set-hostname node01
hostnamectl set-hostname node02
hostnamectl set-hostname node03
1.2 Configure host mappings (run on all machines)

Append the host list with cat:
cat << EOF >> /etc/hosts
192.168.175.96 node01
192.168.175.101 node02
192.168.175.57 node03
EOF

1.3 Configure passwordless SSH login from node01
ssh-keygen  # just press Enter through the prompts
ssh-copy-id  node02
ssh-copy-id  node03

Error handling:

root@ubuntu:/home/wangdong# ssh-copy-id node02
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'node02 (192.168.175.101)' can't be established.
ECDSA key fingerprint is SHA256:WpahJfY7TNgpVHPAlEbd+6ehVUVU7gwBrpx47tqTdq8.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@node02's password: 
Permission denied, please try again.

The cause: node02 does not allow remote root login, so the SSH server configuration on node02 has to be changed to permit it.

Edit the SSH server configuration: sudo vi /etc/ssh/sshd_config
Set the PermitRootLogin parameter to yes, as shown below:

# Authentication:
LoginGraceTime 120
PermitRootLogin yes
StrictModes yes

Restart the SSH service:

root@ubuntu:/etc/ssh# /etc/init.d/sshd  restart
bash: /etc/init.d/sshd: No such file or directory
root@ubuntu:/etc/ssh# /etc/init.d/ssh  restart
[ ok ] Restarting ssh (via systemctl): ssh.service.

Run ssh-copy-id node02 on node01 again:

root@ubuntu:/home/wangdong# ssh-copy-id node02
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@node02's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'node02'"
and check to make sure that only the key(s) you wanted were added.

The system reports that one key was added. Repeat the same steps for node03; once that also succeeds, move on to the next step.

1.4 On all three hosts: set kernel parameters, add the K8S apt source, disable swap, and configure time sync (reboot once after configuring)

# I actually recommend the ntp service over ntpdate: ntp slews the clock gradually and will not upset running programs,
# whereas ntpdate jumps straight to the target time, which can confuse them.
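If you go with the ntp daemon instead, a minimal sketch looks like this (assuming Ubuntu 16.04's ntp package; run on all machines, and skip the ntpdate cron entry further below if you use it):

apt-get install -y ntp      # install the ntp daemon
systemctl enable ntp        # start it on boot
systemctl restart ntp
ntpq -p                     # list upstream servers; the offset column should shrink over time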

swapoff -a 
sed -i 's/.*swap.*/#&/' /etc/fstab    # comment out the swap entries in fstab

# load the br_netfilter kernel module
modprobe br_netfilter
cat <<EOF >  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl -p /etc/sysctl.d/k8s.conf
ls /proc/sys/net/bridge

 # https://opsx.alibaba.com/mirror  Alibaba Cloud mirror index
apt-get update && apt-get install -y apt-transport-https
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add - 
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main 
EOF
apt-get update  # this update now pulls from the Aliyun source added above
apt-cache madison kubelet   # list the available versions of a package
apt-cache madison kubelet | grep 1.10.3  # confirm the target version exists; continue if it does
If it exists, you will see a line like:
 kubelet |  1.10.3-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages
apt-get install -y kubelet=1.10.3-00
apt-get install -y kubeadm=1.10.3-00
apt-get install -y kubectl=1.10.3-00

systemctl enable ntpdate.service   # optional if the clocks are already close; time sync can also be done with the ntp service instead
echo '*/30 * * * * /usr/sbin/ntpdate time7.aliyun.com >/dev/null 2>&1' > /tmp/crontab2.tmp
crontab /tmp/crontab2.tmp
systemctl start ntpdate.service
 
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 65536" >> /etc/security/limits.conf
echo "* soft nproc 65536"  >> /etc/security/limits.conf
echo "* hard nproc 65536"  >> /etc/security/limits.conf
echo "* soft  memlock  unlimited"  >> /etc/security/limits.conf
echo "* hard memlock  unlimited"  >> /etc/security/limits.conf

2. Install and configure keepalived (master nodes)

keepalived is a piece of software that provides something like layer 3, 4 and 5 switching, i.e. what we usually call switching at layers 3, 4 and 5. keepalived does this automatically, with no manual intervention.
What the keepalived virtual IP (VIP) is for:
keepalived is built on VRRP, the Virtual Router Redundancy Protocol.
VRRP can be thought of as a protocol for making routers highly available: N routers providing the same function form a group with one master and several backups.
The master holds a VIP that serves external traffic (the other machines on the LAN use this VIP as their default route) and sends out advertisements;
when the backups stop receiving VRRP packets they conclude the master is down and elect a new master among themselves by VRRP priority. This keeps the routing role highly available.
Put simply, all keepalived nodes share a single IP; when the node holding it goes down, another node becomes the master.

2.1 Install keepalived (this is mainly used for load balancing; you can skip this part)
apt install -y keepalived
systemctl enable keepalived
node01's keepalived.conf
cat <<EOF > /etc/keepalived/keepalived.conf
global_defs {
   router_id LVS_k8s
}

vrrp_script CheckK8sMaster {
    script "curl -k https://192.168.175.120:6443"
    interval 3
    weight 3
    timeout 9
}

vrrp_instance VI_1 {
    state MASTER
    interface ens160
    virtual_router_id 61
    priority 100
    advert_int 1
    mcast_src_ip 192.168.175.96
    nopreempt
    authentication {
        auth_type PASS
        auth_pass sqP05dQgMSlzrxHj
    }
    unicast_peer {
        192.168.175.101
        192.168.175.57
    }
    virtual_ipaddress {
        192.168.175.120/24
    }
    track_script {
        CheckK8sMaster
    }

}
EOF
node02's keepalived.conf
cat <<EOF > /etc/keepalived/keepalived.conf
global_defs {
   router_id LVS_k8s
}

vrrp_script CheckK8sMaster {
    script "curl -k https://192.168.175.120:6443"
    interval 3
    weight 3
    timeout 9
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens160
    virtual_router_id 61
    priority 90
    advert_int 1
    mcast_src_ip 192.168.175.101
    nopreempt
    authentication {
        auth_type PASS
        auth_pass sqP05dQgMSlzrxHj
    }
    unicast_peer {
        192.168.175.96
        192.168.175.57
    }
    virtual_ipaddress {
        192.168.175.120/24
    }
    track_script {
        CheckK8sMaster
    }

}
EOF
node03's keepalived.conf
cat <<EOF > /etc/keepalived/keepalived.conf
global_defs {
   router_id LVS_k8s
}

vrrp_script CheckK8sMaster {
    script "curl -k https://192.168.175.120:6443"
    interval 3
    weight 3
    timeout 9
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens160
    virtual_router_id 61
    priority 80
    advert_int 1
    mcast_src_ip 192.168.175.57
    nopreempt
    authentication {
        auth_type PASS
        auth_pass sqP05dQgMSlzrxHj
    }
    unicast_peer {
        192.168.175.96
        192.168.175.101
    }
    virtual_ipaddress {
        192.168.175.120/24
    }
    track_script {
        CheckK8sMaster
    }

}
EOF
2.2 Start keepalived
systemctl restart keepalived
The VIP is now bound to node01, and the VIP address answers ping:
root@node01:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:0c:29:0b:06:c2 brd ff:ff:ff:ff:ff:ff
    inet 192.168.175.96/24 brd 192.168.175.255 scope global ens160
       valid_lft forever preferred_lft forever
    inet 192.168.175.120/24 scope global secondary ens160
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe0b:6c2/64 scope link 
       valid_lft forever preferred_lft forever
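As a quick sanity check of the failover behaviour described above, you can stop keepalived on node01 and watch the VIP move; a sketch (the interface name is assumed to be ens160, as in the configs):

ping -c 3 192.168.175.120                  # from node02 or node03: the VIP should answer
systemctl stop keepalived                  # on node01: simulate a master failure
ip a show ens160 | grep 192.168.175.120    # on node02: the VIP should now appear here
systemctl start keepalived                 # on node01: restore it when done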

3. Create the etcd certificates (node01 only)

3.1 Set up the cfssl environment. CFSSL is CloudFlare's open-source PKI/TLS toolkit.

CFSSL includes a command-line tool and an HTTP API service for signing, verifying and bundling TLS certificates, and is written in Go.

wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl_linux-amd64
mv cfssl_linux-amd64 /usr/local/bin/cfssl
chmod +x cfssljson_linux-amd64
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
chmod +x cfssl-certinfo_linux-amd64
mv cfssl-certinfo_linux-amd64 /usr/local/bin/cfssl-certinfo
export PATH=/usr/local/bin:$PATH
3.2 Create the CA configuration files (the IPs below are the etcd node IPs)
mkdir /root/ssl
cd /root/ssl
cat >  ca-config.json <<EOF
{
"signing": {
"default": {
  "expiry": "8760h"
},
"profiles": {
  "kubernetes-Soulmate": {
    "usages": [
        "signing",
        "key encipherment",
        "server auth",
        "client auth"
    ],
    "expiry": "8760h"
  }
}
}
}
EOF

cat >  ca-csr.json <<EOF
{
"CN": "kubernetes-Soulmate",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
  "C": "CN",
  "ST": "shanghai",
  "L": "shanghai",
  "O": "k8s",
  "OU": "System"
}
]
}
EOF

cfssl gencert -initca ca-csr.json | cfssljson -bare ca

cat > etcd-csr.json <<EOF
{
  "CN": "etcd",
  "hosts": [
    "127.0.0.1",
    "192.168.175.96",
    "192.168.175.101",
    "192.168.175.57"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "shanghai",
      "L": "shanghai",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF

cfssl gencert -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -profile=kubernetes-Soulmate etcd-csr.json | cfssljson -bare etcd
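Before distributing the certificate, it is worth checking that the SANs really cover all three etcd IPs. A sketch using the cfssl-certinfo tool installed in 3.1 (openssl works just as well):

cfssl-certinfo -cert etcd.pem     # prints subject, issuer, SANs and validity
# or:
openssl x509 -in etcd.pem -noout -text | grep -A1 'Subject Alternative Name'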
3.3 Distribute the etcd certificates from node01 to node02 and node03
mkdir -p /etc/etcd/ssl  # run this on all three machines
cp etcd.pem etcd-key.pem ca.pem /etc/etcd/ssl/
scp -r /etc/etcd/ssl/*.pem node02:/etc/etcd/ssl/
scp -r /etc/etcd/ssl/*.pem node03:/etc/etcd/ssl/
3.4 Install and configure etcd (all three master nodes). etcd is a distributed, consistent key-value store for shared configuration and service discovery.
Install etcd:
apt install etcd -y
node01's etcd.service
cat <<EOF >/etc/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
ExecStart=/usr/bin/etcd   --name node01   --cert-file=/etc/etcd/ssl/etcd.pem   --key-file=/etc/etcd/ssl/etcd-key.pem   --peer-cert-file=/etc/etcd/ssl/etcd.pem   --peer-key-file=/etc/etcd/ssl/etcd-key.pem   --trusted-ca-file=/etc/etcd/ssl/ca.pem   --peer-trusted-ca-file=/etc/etcd/ssl/ca.pem   --initial-advertise-peer-urls https://192.168.175.96:2380   --listen-peer-urls https://192.168.175.96:2380   --listen-client-urls https://192.168.175.96:2379,http://127.0.0.1:2379   --advertise-client-urls https://192.168.175.96:2379   --initial-cluster-token etcd-cluster-0   --initial-cluster node01=https://192.168.175.96:2380,node02=https://192.168.175.101:2380,node03=https://192.168.175.57:2380   --initial-cluster-state new   --data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
node02's etcd.service
cat <<EOF >/etc/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
ExecStart=/usr/bin/etcd   --name node02   --cert-file=/etc/etcd/ssl/etcd.pem   --key-file=/etc/etcd/ssl/etcd-key.pem   --peer-cert-file=/etc/etcd/ssl/etcd.pem   --peer-key-file=/etc/etcd/ssl/etcd-key.pem   --trusted-ca-file=/etc/etcd/ssl/ca.pem   --peer-trusted-ca-file=/etc/etcd/ssl/ca.pem   --initial-advertise-peer-urls https://192.168.175.101:2380   --listen-peer-urls https://192.168.175.101:2380   --listen-client-urls https://192.168.175.101:2379,http://127.0.0.1:2379   --advertise-client-urls https://192.168.175.101:2379   --initial-cluster-token etcd-cluster-0   --initial-cluster node01=https://192.168.175.96:2380,node02=https://192.168.175.101:2380,node03=https://192.168.175.57:2380   --initial-cluster-state new   --data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
node03's etcd.service
cat <<EOF >/etc/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
ExecStart=/usr/bin/etcd   --name node03   --cert-file=/etc/etcd/ssl/etcd.pem   --key-file=/etc/etcd/ssl/etcd-key.pem   --peer-cert-file=/etc/etcd/ssl/etcd.pem   --peer-key-file=/etc/etcd/ssl/etcd-key.pem   --trusted-ca-file=/etc/etcd/ssl/ca.pem   --peer-trusted-ca-file=/etc/etcd/ssl/ca.pem   --initial-advertise-peer-urls https://192.168.175.57:2380   --listen-peer-urls https://192.168.175.57:2380   --listen-client-urls https://192.168.175.57:2379,http://127.0.0.1:2379   --advertise-client-urls https://192.168.175.57:2379   --initial-cluster-token etcd-cluster-0 --initial-cluster node01=https://192.168.175.96:2380,node02=https://192.168.175.101:2380,node03=https://192.168.175.57:2380   --initial-cluster-state new   --data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
Enable etcd on boot (the etcd cluster needs at least 2 nodes up before it can start; if startup fails, check the messages log):
systemctl daemon-reload
systemctl enable etcd
systemctl start etcd
systemctl status etcd

Note: systemctl enable etcd may fail with the following error:
root@node01:~/ssl# systemctl enable etcd
Synchronizing state of etcd.service with SysV init with /lib/systemd/systemd-sysv-install…
Executing /lib/systemd/systemd-sysv-install enable etcd
Failed to execute operation: File exists
What the error means:
the etcd service had been enabled before, so the symlink already exists and cannot be created again. The fix is to disable the old etcd service first and then enable etcd.service again:

 systemctl disable etcd.service
 systemctl enable etcd.service
 systemctl enable etcd

After that it completes normally:
root@node01:~/ssl# systemctl enable etcd
Synchronizing state of etcd.service with SysV init with /lib/systemd/systemd-sysv-install…
Executing /lib/systemd/systemd-sysv-install enable etcd

Run the following check on all three etcd nodes (do the etcd upgrade described in the next step first, then come back to this).

Accessing etcd requires the certificates. The notes below come from searching around; if the steps below fail with Error: context deadline exceeded, use them to troubleshoot:

Kubernetes now uses etcd v3, and you must supply the CA, key and cert, otherwise you get Error: context deadline exceeded.

Without --endpoints, etcdctl talks to 127.0.0.1:2379 by default; once you pass --endpoints, you must also supply the CA, key and cert.

[root@k8s-test2 ~]# etcdctl endpoint health 
127.0.0.1:2379 is healthy: successfully committed proposal: took = 939.097µs
 
[root@k8s-test2 ~]# etcdctl --endpoints=https://10.0.26.152:2379 endpoint health 
https://10.0.26.152:2379 is unhealthy: failed to connect: context deadline exceeded
 
[root@k8s-test2 ~]# etcdctl --endpoints=https://10.0.26.152:2379 --cacert=/etc/k8s/ssl/etcd-root-ca.pem --key=/etc/k8s/ssl/etcd-key.pem  --cert=/etc/k8s/ssl/etcd.pem  endpoint health 
https://10.0.26.152:2379 is healthy: successfully committed proposal: took = 1.001505ms

Below are the commands for testing with certificates; note that the v2 and v3 syntax differ.

v2 command (this reports Incorrect Usage; if it does, upgrade etcd to 3.x first and use the v3 syntax):
etcdctl --endpoints=https://192.168.175.96:2379,https://192.168.175.101:2379,https://192.168.175.57:2379 \
  --ca-file=/etc/etcd/ssl/ca.pem \
  --cert-file=/etc/etcd/ssl/etcd.pem \
  --key-file=/etc/etcd/ssl/etcd-key.pem  cluster-health

If you run the v2 command syntax against etcd v3, you will get errors such as "unknown usage".

v3 command (after upgrading etcd to v3, test with the certificates; run this check on all three etcd nodes):
etcdctl --endpoints=https://192.168.175.96:2379,https://192.168.175.101:2379,https://192.168.175.57:2379  --cacert=/etc/etcd/ssl/ca.pem --key=/etc/etcd/ssl/etcd-key.pem  --cert=/etc/etcd/ssl/etcd.pem  endpoint health 
A successful test prints:
https://192.168.175.101:2379 is healthy: successfully committed proposal: took = 1.912224ms
https://192.168.175.57:2379 is healthy: successfully committed proposal: took = 2.274874ms
https://192.168.175.96:2379 is healthy: successfully committed proposal: took = 2.293437ms
Upgrade etcd (the apt-installed version is v2.2.5, while Kubernetes v1.10 requires at least 3.1)
Download the release from the official site:
wget https://github.com/coreos/etcd/releases/download/v3.3.5/etcd-v3.3.5-linux-amd64.tar.gz
tar zxf etcd-v3.3.5-linux-amd64.tar.gz
Before running the copies below, back up the old binaries of the same name with mv:
mv /usr/bin/etcd /usr/bin/etcd_v2
mv /usr/bin/etcdctl /usr/bin/etcdctl_v2
cp etcd-v3.3.5-linux-amd64/etcd /usr/bin/etcd
cp etcd-v3.3.5-linux-amd64/etcdctl /usr/bin/etcdctl
Add the following line to /etc/profile, then reboot the server:
export ETCDCTL_API=3

source /etc/profile

systemctl restart etcd

Restart etcd and check the cluster status
(this can be checked on any of the machines):
root@k8s-n2:~/k8s# etcdctl member list
aa76456e260f7bd1, started, node02, https://192.168.175.101:2380, https://192.168.175.101:2379
d12950b45efa96da, started, node03, https://192.168.175.57:2380, https://192.168.175.57:2379
e598ba1c84356928, started, node01, https://192.168.175.96:2380, https://192.168.175.96:2379

4. Install Docker (on all three machines)

curl -fsSL "https://get.docker.com/" | sh

(You can also install it with apt-get install docker.io.)

4.1 Add the Aliyun registry mirror to Docker by editing /lib/systemd/system/docker.service:
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --registry-mirror=https://ms3cfraz.mirror.aliyuncs.com
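An alternative that avoids editing the unit file is to put the mirror into /etc/docker/daemon.json; a sketch using the same Aliyun accelerator address (use only one of the two approaches, since Docker may refuse to start if the same option is set both as a flag and in daemon.json):

cat <<EOF > /etc/docker/daemon.json
{
  "registry-mirrors": ["https://ms3cfraz.mirror.aliyuncs.com"]
}
EOF
systemctl daemon-reload && systemctl restart docker
docker info | grep -A1 'Registry Mirrors'   # confirm the mirror is active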
4.2 Start Docker and enable it on boot
systemctl daemon-reload
systemctl restart docker
systemctl enable docker
systemctl status docker

5. Configure kubeadm

Edit the kubelet configuration file on all nodes:
/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# add this line
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs"
# and this line
Environment="KUBELET_EXTRA_ARGS=--v=2 --fail-swap-on=false --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/k8sth/pause-amd64:3.0"
After editing the file on every node, be sure to reload the configuration:
systemctl daemon-reload
systemctl enable kubelet
systemctl restart kubelet
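If kubelet fails to come up after the restart, the most common cause is a cgroup-driver mismatch with Docker; a quick check (a sketch):

docker info 2>/dev/null | grep -i 'cgroup driver'   # usually prints cgroupfs for this Docker install
# if it prints systemd instead, change KUBELET_CGROUP_ARGS above to --cgroup-driver=systemd and reload again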

6. Initialize the cluster

6.1 Add the cluster init configuration file on node01, node02 and node03 (the file is identical on all three)
cat <<EOF > config.yaml 
apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
etcd:
  endpoints:
  - https://192.168.175.96:2379
  - https://192.168.175.101:2379
  - https://192.168.175.57:2379
  caFile: /etc/etcd/ssl/ca.pem
  certFile: /etc/etcd/ssl/etcd.pem
  keyFile: /etc/etcd/ssl/etcd-key.pem
  dataDir: /var/lib/etcd
networking:
  podSubnet: 10.244.0.0/16
kubernetesVersion: 1.10.0
api:
  advertiseAddress: "192.168.175.120"
token: "b99a00.a144ef80536d4344"
tokenTTL: "0s"
apiServerCertSANs:
- node01
- node02
- node03
- 192.168.175.96
- 192.168.175.101
- 192.168.175.57
- 192.168.175.99
- 192.168.175.120
featureGates:
  CoreDNS: true
imageRepository: "registry.cn-hangzhou.aliyuncs.com/k8sth"
EOF
6.2 Initialize the cluster on node01 first
The config file sets the pod network to 10.244.0.0/16.
kubeadm init --help shows that the default service subnet is 10.96.0.0/12:
kubeadm init --help
Look for this line:
--service-cidr string                  Use alternative range of IP address for service VIPs. (default "10.96.0.0/12")
The default DNS address in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf is cluster-dns=10.96.0.10.
kubeadm init --config config.yaml 

Note: if you installed kubectl, kubeadm and kubelet without pinning the version, the command above fails with the following error:

kubeadm init --config config.yaml
your configuration file uses an old API spec: "kubeadm.k8s.io/v1alpha1". Please use kubeadm v1.11 instead and run 'kubeadm config migrate --old-config old.yaml --new-config new.yaml', which will write the new, similar spec using a newer API version.

The reason: the cluster init config above uses apiVersion: kubeadm.k8s.io/v1alpha1, while newer kubeadm releases have moved the API on to v1beta1.

Fix 1
Getting from v1alpha1 to v1beta1 takes several intermediate upgrades, along this path:
v1alpha1 -> v1alpha2 -> v1alpha3 -> v1beta1
The official migration document is at:
https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta1
Migrating from older kubeadm config versions:
use the "kubeadm config migrate" command from kubeadm v1.13.x to convert a v1alpha3 file to v1beta1 (converting from older config versions to v1beta1 requires earlier kubeadm releases:

kubeadm v1.11 should be used to migrate v1alpha1 to v1alpha2; kubeadm v1.12 should be used to translate v1alpha2 to v1alpha3)

You would have to download each of those kubeadm versions, which is tedious, but this route lets you upgrade the config step by step.

Fix 2
Uninstall the three Kubernetes packages and reinstall version 1.10.13.
The three steps are:
1. Uninstall:
apt-get --purge remove kubelet
apt-get --purge remove kubeadm
apt-get --purge remove kubectl
2. List kubeadm's available historical versions:
apt-cache madison kubeadm
3. Install the specific historical version:
Format: apt-get install <package>=<version>
For example: apt-get install kubeadm=1.10.13-00

If the initialization fails, clean up like this:
kubeadm reset
# or
rm -rf /etc/kubernetes/*.conf
rm -rf /etc/kubernetes/manifests/*.yaml
docker ps -a |awk '{print $1}' |xargs docker rm -f
systemctl  stop kubelet
A successful initialization ends like this:
Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 192.168.175.120:6443 --token b99a00.a144ef80536d4344 --discovery-token-ca-cert-hash sha256:a2551d730098fe59c8f0f9d77e07ab9e1ceb2d205678e4780826e8b7cc32aacf
6.3 Run the following commands on node01:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
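At this point kubectl on node01 should reach the API server through the VIP; a couple of hedged sanity checks:

kubectl cluster-info            # should report the master running at https://192.168.175.120:6443
kubectl get componentstatuses   # scheduler, controller-manager and the etcd members should all be Healthy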
6.4 Copy the certificates and keys generated by kubeadm to node02 and node03:
scp -r /etc/kubernetes/pki  node03:/etc/kubernetes/
scp -r /etc/kubernetes/pki  node02:/etc/kubernetes/

Once this is done, the cluster initialization can be run on the remaining two machines:
kubeadm init --config config.yaml
When it succeeds with output like the one above, continue.

6.5 Deploy the flannel network (node01 only)
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# image version: quay.io/coreos/flannel:v0.10.0-amd64
kubectl create -f  kube-flannel.yml
Check the result:

Run kubectl get node; the nodes are NotReady at first and become Ready after a few minutes:

[root@node01 ~]# kubectl   get node
NAME      STATUS    ROLES     AGE       VERSION
node01    Ready     master    50m       v1.10.3
node02    Ready     master    44m       v1.10.3
node03    Ready     master    43m       v1.10.3
[root@node01 ~]# kubectl   get pods --all-namespaces
NAMESPACE     NAME                             READY     STATUS    RESTARTS   AGE
kube-system   coredns-7997f8864c-4x7mg         1/1       Running   0          29m
kube-system   coredns-7997f8864c-zfcck         1/1       Running   0          29m
kube-system   kube-apiserver-node01            1/1       Running   0          29m
kube-system   kube-controller-manager-node01   1/1       Running   0          30m
kube-system   kube-flannel-ds-hw2xb            1/1       Running   0          1m
kube-system   kube-proxy-s265b                 1/1       Running   0          29m
kube-system   kube-scheduler-node01            1/1       Running   0          30m

With the steps above complete, the basic Kubernetes installation is done.

6.6 Deploy the dashboard (a graphical cluster-management tool; optional)
kubernetes-dashboard.yaml contains the following:
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Configuration to deploy release version of the Dashboard UI compatible with
# Kubernetes 1.8.
#
# Example usage: kubectl create -f <this_file>

# ------------------- Dashboard Secret ------------------- #

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-certs
  namespace: kube-system
type: Opaque

---
# ------------------- Dashboard Service Account ------------------- #

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system

---
# ------------------- Dashboard Role & Role Binding ------------------- #

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: kubernetes-dashboard-minimal
  namespace: kube-system
rules:
  # Allow Dashboard to create 'kubernetes-dashboard-key-holder' secret.
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["create"]
  # Allow Dashboard to create 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["create"]
  # Allow Dashboard to get, update and delete Dashboard exclusive secrets.
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs"]
  verbs: ["get", "update", "delete"]
  # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["kubernetes-dashboard-settings"]
  verbs: ["get", "update"]
  # Allow Dashboard to get metrics from heapster.
- apiGroups: [""]
  resources: ["services"]
  resourceNames: ["heapster"]
  verbs: ["proxy"]
- apiGroups: [""]
  resources: ["services/proxy"]
  resourceNames: ["heapster", "http:heapster:", "https:heapster:"]
  verbs: ["get"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubernetes-dashboard-minimal
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubernetes-dashboard-minimal
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system

---
# ------------------- Dashboard Deployment ------------------- #

kind: Deployment
apiVersion: apps/v1beta2
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      nodeSelector:
        node-role.kubernetes.io/master: ""
      containers:
      - name: kubernetes-dashboard
        image: registry.cn-hangzhou.aliyuncs.com/k8sth/kubernetes-dashboard-amd64:v1.8.3
        ports:
        - containerPort: 8443
          protocol: TCP
        args:
          - --auto-generate-certificates
          # Uncomment the following line to manually specify Kubernetes API server Host
          # If not specified, Dashboard will attempt to auto discover the API server and connect
          # to it. Uncomment only if the default does not work.
          # - --apiserver-host=http://my-address:port
        volumeMounts:
        - name: kubernetes-dashboard-certs
          mountPath: /certs
          # Create on-disk volume to store exec logs
        - mountPath: /tmp
          name: tmp-volume
        livenessProbe:
          httpGet:
            scheme: HTTPS
            path: /
            port: 8443
          initialDelaySeconds: 30
          timeoutSeconds: 30
      volumes:
      - name: kubernetes-dashboard-certs
        secret:
          secretName: kubernetes-dashboard-certs
      - name: tmp-volume
        emptyDir: {}
      serviceAccountName: kubernetes-dashboard
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule

---
# ------------------- Dashboard Service ------------------- #

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30000
  selector:
    k8s-app: kubernetes-dashboard

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kube-system
Deploy it:
kubectl create -f kubernetes-dashboard.yaml
Get the token and log in with it:
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')
Open the dashboard in Firefox and enter the token to log in:
https://192.168.175.96:30000/#!/login
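If the page does not load, first confirm that the Service really exposes NodePort 30000 and that the dashboard pod is running; a sketch:

kubectl -n kube-system get svc kubernetes-dashboard    # expect TYPE NodePort and PORT(S) 443:30000/TCP
kubectl -n kube-system get pods -l k8s-app=kubernetes-dashboard -o wide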
6.7 Run the initialization on node02 and node03 as well
kubeadm init --config config.yaml
# the output is exactly the same as on node01
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
6.8 Check the node information
[root@node01 ~]# kubectl get nodes
NAME      STATUS    ROLES     AGE       VERSION
node01    Ready     master    5h        v1.10.3
node02    Ready     master    2h        v1.10.3
node03    Ready     master    1h        v1.10.3
[root@node01 ~]# kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                    READY     STATUS    RESTARTS   AGE       IP               NODE
kube-system   coredns-7997f8864c-5bvlg                1/1       Running   0          6m        10.244.1.2       node02
kube-system   coredns-7997f8864c-xbq2j                1/1       Running   0          6m        10.244.2.2       node03
kube-system   kube-apiserver-node01                   1/1       Running   3          5m        192.168.175.96   node01
kube-system   kube-apiserver-node02                   1/1       Running   0          1h        192.168.175.101   node02
kube-system   kube-apiserver-node03                   1/1       Running   0          1h        192.168.175.57   node03
kube-system   kube-controller-manager-node01          1/1       Running   3          5m        192.168.175.96   node01
kube-system   kube-controller-manager-node02          1/1       Running   0          1h        192.168.175.101   node02
kube-system   kube-controller-manager-node03          1/1       Running   1          1h        192.168.175.57   node03
kube-system   kube-flannel-ds-gwql9                   1/1       Running   1          1h        192.168.175.96   node01
kube-system   kube-flannel-ds-l8bfs                   1/1       Running   1          1h        192.168.175.101   node02
kube-system   kube-flannel-ds-xw5bv                   1/1       Running   1          1h        192.168.175.57   node03
kube-system   kube-proxy-cwlhw                        1/1       Running   0          1h        192.168.175.57   node03
kube-system   kube-proxy-jz9mk                        1/1       Running   3          5h        192.168.175.96   node01
kube-system   kube-proxy-zdbtc                        1/1       Running   0          2h        192.168.175.101   node02
kube-system   kube-scheduler-node01                   1/1       Running   3          5m        192.168.175.96   node01
kube-system   kube-scheduler-node02                   1/1       Running   0          1h        192.168.175.101   node02
kube-system   kube-scheduler-node03                   1/1       Running   1          1h        192.168.175.57   node03
kube-system   kubernetes-dashboard-7b44ff9b77-chdjp   1/1       Running   0          6m        10.244.2.3       node03
6.9 Allow the masters to run pods (by default the master does not schedule pods)
kubectl taint nodes --all node-role.kubernetes.io/master-
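To verify, check the taints afterwards; on each master the Taints field should now read <none> (a sketch):

kubectl describe node node01 | grep -i taints
kubectl describe node node02 | grep -i taints
kubectl describe node node03 | grep -i taints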

7. Add node04 to the cluster

Run the following on node04 to join it to the cluster:
root@node04:~# kubeadm join 192.168.175.120:6443 --token b99a00.a144ef80536d4344 --discovery-token-ca-cert-hash sha256:a2551d730098fe59c8f0f9d77e07ab9e1ceb2d205678e4780826e8b7cc32aacf
[preflight] Running pre-flight checks.
	[WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 18.05.0-ce. Max validated version: 17.03
	[WARNING FileExisting-crictl]: crictl not found in system path
Suggestion: go get github.com/kubernetes-incubator/cri-tools/cmd/crictl
[discovery] Trying to connect to API Server "192.168.175.120:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.175.120:6443"
[discovery] Requesting info from "https://192.168.175.120:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.175.120:6443"
[discovery] Successfully established connection with API Server "192.168.175.120:6443"

This node has joined the cluster:
* Certificate signing request was sent to master and a response
  was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.
[root@node01 ~]# kubectl get node
NAME      STATUS    ROLES     AGE       VERSION
node01    Ready     master    45m       v1.10.0
node02    Ready     master    15m       v1.10.0
node03    Ready     master    14m       v1.10.0
node04    Ready     <none>    13m       v1.10.0
