1. Environment Setup
(1) Disable the firewall
sudo systemctl disable firewalld
sudo systemctl stop firewalld
(2) Disable SELinux
sudo setenforce 0
# Permanently disable it by editing the SELinux configuration files:
sudo sed -i 's/SELINUX=permissive/SELINUX=disabled/' /etc/sysconfig/selinux
sudo sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
(3) Disable swap
sudo swapoff -a
# Permanently disable it by commenting out the swap entry in /etc/fstab:
sudo sed -i 's/.*swap.*/#&/' /etc/fstab
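As an optional sanity check, neither of the commands below should report any active swap after the steps above:
swapon -s   # prints nothing when no swap is active
free -h     # the Swap line should show 0B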
(4) Adjust kernel parameters
On the master node, put the following into /etc/sysctl.d/k8s.conf:
sudo vim /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
net.ipv6.conf.all.forwarding = 1
vm.swappiness = 0
sudo sysctl --system
On worker nodes:
sudo vim /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
sudo sysctl --system
Verify the settings took effect; both commands should return 1:
sysctl -n net.bridge.bridge-nf-call-iptables
sysctl -n net.bridge.bridge-nf-call-ip6tables
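If those keys do not exist at all (sysctl reports "No such file or directory"), the br_netfilter module is probably not loaded; loading it and making that persistent usually fixes it:
sudo modprobe br_netfilter
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
sudo sysctl --system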
2. Master Node Preparation
(1) Install kubeadm, kubelet, kubectl and pull the images
#!/bin/sh
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
#Install kubeadm, kubelet, and kubectl, pinned here to v1.21.3 (without explicit versions, yum would install the latest release):
yum install -y kubeadm-1.21.3-0 kubelet-1.21.3-0 kubectl-1.21.3-0
#yum install -y kubeadm kubelet kubectl
systemctl enable kubelet && systemctl start kubelet
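As an optional check, confirm the pinned versions were actually installed:
kubeadm version -o short          # should print v1.21.3
kubelet --version                 # should print Kubernetes v1.21.3
kubectl version --client --short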
Script to pull the images:
#!/bin/bash
# Pull the images kubeadm needs from the Aliyun mirror and re-tag them as k8s.gcr.io
ver=v1.21.3
registry=registry.cn-hangzhou.aliyuncs.com/google_containers
images=`kubeadm config images list --kubernetes-version=$ver | awk -F '/' '{print $2}'`
for image in $images
do
    if [ $image != coredns ]; then
        docker pull ${registry}/$image
        if [ $? -eq 0 ]; then
            docker tag ${registry}/$image k8s.gcr.io/$image
            docker rmi ${registry}/$image
        else
            echo "ERROR: failed to pull image $image"
        fi
    else
        # coredns sits under a sub-path on k8s.gcr.io, so pull it from Docker Hub and re-tag it
        docker pull coredns/coredns:1.8.0
        docker tag coredns/coredns:1.8.0 k8s.gcr.io/coredns/coredns:v1.8.0
        docker rmi coredns/coredns:1.8.0
    fi
done
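Once the script finishes, an optional way to confirm every image kubeadm expects is present locally:
kubeadm config images list --kubernetes-version=v1.21.3   # what kubeadm expects
docker images | grep k8s.gcr.io                           # what is actually on the host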
(2) Prepare the cluster init configuration file kubeadm-config.yaml
Generate the default kubeadm init configuration:
kubeadm config print init-defaults > kubeadm-config.yaml
Edit the generated file:
localAPIEndpoint:
  advertiseAddress: 1.2.3.4
# change to the master IP, and set nodeRegistration.name to the master's hostname:
localAPIEndpoint:
  advertiseAddress: 192.168.4.120
nodeRegistration:
  name: centos79-node1

kubernetesVersion: 1.21.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
# change the version and add the pod subnet:
kubernetesVersion: 1.21.3
networking:
  dnsDomain: cluster.local
  podSubnet: "10.244.0.0/16"
  serviceSubnet: 10.96.0.0/12
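To catch obvious mistakes in the edited file before the real initialization, kubeadm can do a dry run (optional; --dry-run leaves the host unchanged):
sudo kubeadm init --config=kubeadm-config.yaml --dry-run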
(3) Initialize the cluster
Check that the environment passes the preflight checks:
kubeadm init phase preflight
I0810 13:46:36.581916 20512 version.go:254] remote version is much newer: v1.22.0; falling back to: stable-1.21
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
Initialize the master. 10.244.0.0/16 is the pod CIDR that flannel uses by default; the value to set depends on the network plugin you choose.
sudo kubeadm init --config=kubeadm-config.yaml --ignore-preflight-errors=2 --upload-certs | tee kubeadm-init.log
(If ports are reported as already in use, run sudo kubeadm reset and retry.)
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
(The three commands above follow the hint printed by kubeadm init.)
(Optional, for the root user) export KUBECONFIG=/etc/kubernetes/admin.conf
The kubeadm init output looks like this:
W0810 14:55:25.741990 13062 strict.go:54] error unmarshaling configuration schema.GroupVersionKind{Group:"kubeadm.k8s.io", Version:"v1beta2", Kind:"InitConfiguration"}: error unmarshaling JSON: while decoding JSON: json: unknown field "name"
[init] Using Kubernetes version: v1.21.3
[preflight] Running pre-flight checks
[WARNING Hostname]: hostname "node" could not be reached
[WARNING Hostname]: hostname "node": lookup node on 223.5.5.5:53: no such host
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local node] and IPs [10.96.0.1 192.168.4.120]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost node] and IPs [192.168.4.120 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost node] and IPs [192.168.4.120 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 17.503592 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.21" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
fceedfd1392b27957c5f6345661d62dc09359b61e07f76f444a9e3095022dab4
[mark-control-plane] Marking the node node as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node node as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.4.120:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:6ad6978a7e72cfae06c836886276634c87bedfa8ff02e44f574ffb96435b4c2b
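A quick check at this point: the API server should respond, but the master stays NotReady and the CoreDNS pods stay Pending until a network plugin is deployed.
kubectl get nodes
kubectl get pods -n kube-system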
(4) Deploy a network plugin
Commonly used network plugins include Weave and flannel; installing one of them is enough.
Reference: Creating a cluster with kubeadm | Kubernetes
1) Weave
export kubever=$(kubectl version | base64 | tr -d '\n')
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$kubever"
2) flannel
curl -o kube-flannel.yml https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# The flannel image pull during apply can be very slow; pull the image manually first and retry a few times if needed
docker pull quay.io/coreos/flannel:v0.14.0
kubectl apply -f kube-flannel.yml
If a node is still NotReady after installation and kubelet reports: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
then restart kubelet: sudo systemctl restart kubelet
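Once the flannel pods are Running, the nodes should turn Ready; a quick check (the app=flannel label matches the manifest above, other flannel versions may label their pods differently):
kubectl get pods -A -l app=flannel
kubectl get nodes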
(5) Add additional masters
Run the following commands on the current (only) master node.
Step 1:
sudo kubeadm init phase upload-certs --upload-certs
The output:
[root@k8s-master ~]# kubeadm init phase upload-certs --upload-certs
I0122 08:01:15.517893 60086 version.go:252] remote version is much newer: v1.23.2; falling back to: stable-1.18
W0122 08:01:17.049273 60086 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
5d817a5480c54bb079eab4f7b75b4dfe21bd36e059dfb46bf39f724adb3349aa
Step 2:
kubeadm token create --print-join-command
The output:
[root@k8s-master ~]# kubeadm token create --print-join-command
W0122 08:01:19.445355 60121 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
kubeadm join 172.16.64.2:6443 --token xhsmiv.ggj00ojs6dvv8b23 --discovery-token-ca-cert-hash sha256:5211bd42a2e81b933b52ec83686f93ae6212542d22d00c621fad20f0dc9592b4
Step 3:
Combine the token and the certificate key obtained above into the following command:
kubeadm join 172.16.64.2:6443 --token xhsmiv.ggj00ojs6dvv8b23 \
    --discovery-token-ca-cert-hash sha256:5211bd42a2e81b933b52ec83686f93ae6212542d22d00c621fad20f0dc9592b4 \
    --control-plane --certificate-key 5d817a5480c54bb079eab4f7b75b4dfe21bd36e059dfb46bf39f724adb3349aa
Step 4:
Run the command above on the new master node.
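If the join succeeds, set up kubectl on the new master the same way as on the first one (the control-plane join output prints the same hint):
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config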
The join may instead fail with the following error:
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
error execution phase preflight:
One or more conditions for hosting a new control plane instance is not satisfied.
unable to add a new control plane instance a cluster that doesn't have a stable controlPlaneEndpoint address
Please ensure that:
* The cluster has a stable controlPlaneEndpoint address.
* The certificates that must be shared among control plane instances are provided.
Fix (see the CSDN article on joining a second master to a k8s HA cluster):
Inspect the kubeadm-config ConfigMap:
kubectl -n kube-system get cm kubeadm-config -oyaml
The controlPlaneEndpoint value turns out to be empty; edit it:
kubectl -n kube-system edit cm kubeadm-config
In the YAML, add controlPlaneEndpoint at the position shown here:
apiVersion: v1
data:
  ClusterConfiguration: |
    apiServer:
      extraArgs:
        authorization-mode: Node,RBAC
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta2
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controlPlaneEndpoint: 192.168.10.75:6443   # add this line
    controllerManager: {}
...
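After saving, confirm the value is now set, then retry the join command on the new master:
kubectl -n kube-system get cm kubeadm-config -o yaml | grep controlPlaneEndpoint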
(6) Other issues
1) The CoreDNS pods on the master go into CrashLoopBackOff after the pod network is set up
This needs to be diagnosed case by case; start with the logs:
kubectl logs coredns-558bd4d5db-7b482 -n kube-system
The relevant error: plugin/loop: Loop (127.0.0.1:33921 -> :53)
The referenced article explains the cause of this problem clearly; either of the approaches below can fix it.
Option 1: remove the loop plugin
Manually edit the CoreDNS ConfigMap and delete the loop line from the Corefile:
kubectl edit cm coredns -n kube-system
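Inside that ConfigMap, the Corefile looks roughly like the excerpt below (CoreDNS 1.8.x defaults; yours may differ slightly); delete the loop line and save:
.:53 {
    errors
    health {
       lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa { ... }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop          # <- delete this line
    reload
    loadbalance
}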
The CoreDNS pods should then work once they restart.
Option 2:
If option 1 does not solve it, the cause is that /etc/resolv.conf on the node is configured with 127.0.0.1.
In loop-plugin terms, the loop module is enabled and has successfully detected a forwarding loop. Clearing the loop should restore normal operation: delete nameserver 127.0.0.1 from /etc/resolv.conf on the affected node and add nameserver 114.114.114.114, for example:
#nameserver 127.0.0.1
search tbsite.net aliyun.com
options timeout:2 attempts:2
nameserver 10.101.0.1
nameserver 10.101.0.17
nameserver 10.101.0.33
nameserver 114.114.114.114
After editing, delete the existing CoreDNS pods so that they are recreated automatically.
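A one-liner for that restart (in a kubeadm cluster the CoreDNS pods carry the k8s-app=kube-dns label):
kubectl -n kube-system delete pod -l k8s-app=kube-dns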
Note: run kubeadm reset before running kubeadm init again.
2) Deploying flannel fails on some nodes with: Failed to create SubnetManager: error retrieving pod spec for 'kube-flannel/kube-flannel-ds-6l99v': Get "https://10.96.0.1:443/api/v1/namespaces/kube-flannel/pods/kube-flannel-ds-6l99v": dial tcp 10.96.0.1:443: i/o timeout
Edit the flannel YAML and add two environment variables, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT, to the container spec:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: KUBERNETES_SERVICE_HOST
          value: '<IP Master/DNS Master>' # IP address or DNS name of the host running kube-apiserver
        - name: KUBERNETES_SERVICE_PORT
          value: '6443'
3) failed to set bridge addr: "cni0" already has an IP address different from XXX
When Kubernetes is reconfigured, the old network configuration may still be present and the affected pods will report this error. Delete the stale interface; once flannel (or whichever network plugin you use) is re-applied, it will be recreated automatically:
ifconfig cni0 down
ip link delete cni0
The cni0 interface recreated on the node is generated from flannel's network configuration.
3. Worker Node Deployment (container runtime: Docker)
(1) Install Docker
Install it the same way as on the master.
(2) Install the Kubernetes components
Worker nodes only need kubeadm and kubelet.
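A minimal sketch of the worker-node package install, mirroring the master's versions and assuming the same Aliyun kubernetes repo file from section 2 already exists on the node:
yum install -y kubeadm-1.21.3-0 kubelet-1.21.3-0
systemctl enable kubelet && systemctl start kubelet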
(3) Join the cluster
1) Create a token on the master:
kubeadm token create --print-join-command
2) Join the node to the cluster
The command above prints a join command; run that command on the worker node.
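After the join finishes, confirm from the master that the new node has registered; it turns Ready once the network plugin pods start on it:
kubectl get nodes -o wide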