Switching the container runtime from containerd to cri-dockerd
Minimum system requirements
At least 4 GB of RAM and at least 2 CPUs.
IP | Memory (GB) | CPU | Hostname |
192.168.231.120 | 4 | 4 | K1 |
192.168.231.121 | 4 | 4 | K2 |
192.168.231.122 | 4 | 4 | K3 |
Basic preparation -- all three hosts
Disable the firewall
systemctl stop firewalld
systemctl disable firewalld   # keep it off after reboots
Disable swap (permanently)
swapoff -a
vim /etc/fstab
#/dev/mapper/centos-swap swap swap defaults 0 0
Set the hostnames and the hosts file
hostnamectl set-hostname k1
hostnamectl set-hostname k2
hostnamectl set-hostname k3
vim /etc/hosts
192.168.241.129 k2
192.168.241.128 k1
192.168.241.130 k3
Kernel parameter settings
vim /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
sysctl --system   # sysctl -p alone does not read /etc/sysctl.d/k8s.conf
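Note: the two bridge-nf-call settings only exist while the br_netfilter kernel module is loaded. A minimal sketch for loading it now and on every boot (overlay is added alongside it for the container runtime's overlay filesystem; harmless if already loaded):
modprobe overlay
modprobe br_netfilter
# make both modules load automatically at boot
cat > /etc/modules-load.d/k8s.conf <<EOF
overlay
br_netfilter
EOF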
Configure time synchronization
yum install ntpdate
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
systemctl enable ntpdate
systemctl start ntpdate
systemctl status ntpdate
systemctl start crond
systemctl enable crond
# set up a cron job to sync the time every 5 minutes
$ crontab -e
*/5 * * * * /usr/sbin/ntpdate cn.pool.ntp.org
crontab -l lists all scheduled cron jobs
Once the time is correct, write it to the hardware clock
date
hwclock --systohc
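If ntpdate is not available in your repositories, chrony (shipped with CentOS 7) is a common alternative; a sketch, assuming the default pool servers in /etc/chrony.conf are reachable:
yum install -y chrony
systemctl enable --now chronyd
chronyc sources   # list the time sources actually in use
timedatectl       # should eventually report "NTP synchronized: yes"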
Install containerd # skip this if Docker (cri-dockerd) is chosen as the container runtime
yum install -y containerd.io-1.6.27
Configure IPVS
Run on all three machines
yum install ipset ipvsadm -y
# load the IPVS kernel modules
vim /etc/sysconfig/modules/ipvs.modules
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
Save the file (:wq in vim to save and quit)
chmod +x /etc/sysconfig/modules/ipvs.modules
/bin/bash /etc/sysconfig/modules/ipvs.modules
Check that the modules are loaded
lsmod | grep -e ip_vs -e nf_conntrack_ipv4
ip_vs_sh 12688 0
ip_vs_wrr 12697 0
ip_vs_rr 12600 0
ip_vs 145458 6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack_ipv4 15053 7
nf_defrag_ipv4 12729 1 nf_conntrack_ipv4
nf_conntrack 139264 8 ip_vs,nf_nat,nf_nat_ipv4,nf_nat_ipv6,xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_ipv4,nf_conntrack_ipv6
libcrc32c 12644 4 xfs,ip_vs,nf_nat,nf_conntrack
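Scripts under /etc/sysconfig/modules/ may not be executed automatically at boot on CentOS 7, so the modules can be missing after a reboot. A sketch of a persistent alternative using systemd's modules-load mechanism (the file name is arbitrary):
cat > /etc/modules-load.d/ipvs.conf <<EOF
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF
systemctl restart systemd-modules-load
lsmod | grep -e ip_vs -e nf_conntrack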
Install Docker
# install the EPEL repo and yum-utils
yum install -y epel-release yum-utils
# add the official Docker repo
yum-config-manager \
--add-repo \
https://download.docker.com/linux/centos/docker-ce.repo
# if the official Docker repo is unreachable or slow, use the Aliyun mirror
yum-config-manager \
--add-repo \
https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# install Docker
yum install docker-ce -y
# start Docker and enable it at boot
systemctl enable docker --now
systemctl start docker
systemctl status docker
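kubeadm 1.28 configures the kubelet for the systemd cgroup driver by default, while Docker on CentOS 7 defaults to cgroupfs; the mismatch can make pods unstable. A sketch of /etc/docker/daemon.json that aligns the two (the log options are just sensible defaults, adjust as needed):
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": { "max-size": "100m" }
}
EOF
systemctl daemon-reload
systemctl restart docker
docker info | grep -i cgroup   # should show: Cgroup Driver: systemd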
Switching the container runtime from containerd to cri-dockerd (not needed if you keep the default)
Install the latest release (v0.3.9 at the time of writing)
wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.9/cri-dockerd-0.3.9-3.el7.x86_64.rpm
# install on all three machines
rpm -ivh cri-dockerd-0.3.9-3.el7.x86_64.rpm
# systemctl daemon-reload
# systemctl enable cri-docker --now
Created symlink from /etc/systemd/system/multi-user.target.wants/cri-docker.service to /usr/lib/systemd/system/cri-docker.service.
#systemctl is-active cri-docker
active
# systemctl status cri-docker
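By default cri-dockerd pulls its pause (sandbox) image from registry.k8s.io, which may be unreachable. A sketch of a systemd drop-in that points it at the Aliyun mirror instead; it assumes the packaged ExecStart is /usr/bin/cri-dockerd --container-runtime-endpoint fd:// (confirm with systemctl cat cri-docker before copying it):
mkdir -p /etc/systemd/system/cri-docker.service.d
cat > /etc/systemd/system/cri-docker.service.d/override.conf <<EOF
[Service]
ExecStart=
ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd:// --pod-infra-container-image registry.aliyuncs.com/google_containers/pause:3.9
EOF
systemctl daemon-reload
systemctl restart cri-docker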
Pulling the images through cri-dockerd
Run this only once kubeadm is installed (next section); skip it otherwise.
kubeadm config images pull --cri-socket unix:///var/run/cri-dockerd.sock
# if the official registry is blocked, use the Aliyun registry
kubeadm config images pull --image-repository=registry.aliyuncs.com/google_containers --cri-socket unix:///run/cri-dockerd.sock
Installing the k8s cluster
Configure the Kubernetes yum repo. The current stable 1.28 release is 1.28.6, as the pulled images above show.
vim /etc/yum.repos.d/k8s.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
If the repo above does not work for you, use the Aliyun repo instead; configure only one of the two.
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
Install kubelet, kubeadm, and kubectl
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes   # needed because of the exclude= line in the k8s repo
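kubeadm also expects the kubelet to be enabled before init. A sketch, with an optional version pin so all three packages stay on the same 1.28.x release (check the exact versions available with yum list --showduplicates kubelet):
# optional: pin an exact version instead of the latest in the repo
yum install -y kubelet-1.28.2 kubeadm-1.28.2 kubectl-1.28.2 --disableexcludes=kubernetes
# kubelet will crash-loop until kubeadm init/join runs; that is expected
systemctl enable --now kubelet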
Initializing the cluster
Run on the master. If cri-dockerd is the container runtime, install cri-dockerd first.
First do a test pre-pull of the k8s images to make sure the image registry is reachable.
kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers --cri-socket=unix:///var/run/cri-dockerd.sock
I0129 12:38:25.026132 60167 version.go:256] remote version is much newer: v1.29.1; falling back to: stable-1.28
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.6
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.28.6
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.28.6
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.28.6
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.9
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.5.9-0
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:v1.10.1
Without --image-repository the images are pulled from the official registry; run only one of the two commands.
kubeadm config images pull --kubernetes-version=1.28.2 --cri-socket unix:///run/cri-dockerd.sock
[config/images] Pulled registry.k8s.io/kube-apiserver:v1.28.2
[config/images] Pulled registry.k8s.io/kube-controller-manager:v1.28.2
[config/images] Pulled registry.k8s.io/kube-scheduler:v1.28.2
[config/images] Pulled registry.k8s.io/kube-proxy:v1.28.2
[config/images] Pulled registry.k8s.io/pause:3.9
[config/images] Pulled registry.k8s.io/etcd:3.5.9-0
[config/images] Pulled registry.k8s.io/coredns/coredns:v1.10.1
Initialize the cluster
kubeadm init --kubernetes-version=1.28.2 --image-repository registry.aliyuncs.com/google_containers --apiserver-advertise-address=192.168.241.128 --service-cidr=10.96.0.0/12 --pod-network-cidr=10.244.0.0/16 --cri-socket=unix:///run/cri-dockerd.sock
# without the Aliyun repository option, the default official k8s registry is used
kubeadm init --kubernetes-version=1.28.2 --apiserver-advertise-address=192.168.241.128 --service-cidr=10.96.0.0/12 --pod-network-cidr=10.244.0.0/16 --cri-socket=unix:///run/cri-dockerd.sock
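The same settings can also be kept in a config file, which is easier to reproduce later. A sketch of an equivalent kubeadm-config.yaml (the file name is arbitrary; the values are copied from the first command above):
cat > kubeadm-config.yaml <<EOF
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.241.128
nodeRegistration:
  criSocket: unix:///run/cri-dockerd.sock
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: 1.28.2
imageRepository: registry.aliyuncs.com/google_containers
networking:
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
EOF
kubeadm init --config kubeadm-config.yaml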
After initialization succeeds,
run on K1:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# as root you can also run the following equivalent commands
cp -i /etc/kubernetes/admin.conf /root/.kube/config
chown 0:0 /root/.kube/config
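For a quick root-only session, exporting KUBECONFIG also works (it does not persist across logins unless added to the shell profile):
export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl get nodes   # should answer once the API server is up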
Install the Calico network plugin on the master
calico 3.26.1
wget --no-check-certificate https://docs.projectcalico.org/manifests/calico.yaml
kubectl apply -f calico.yaml
Configure the Calico network parameters
vim calico.yaml
Make sure the --pod-network-cidr=10.244.0.0/16 passed to kubeadm init and the CALICO_IPV4POOL_CIDR value in calico.yaml are the same.
The variable is commented out by default:
- name: CALICO_IPV4POOL_CIDR
value: "10.244.0.0/16"
After changing the configuration, redeploy:
kubectl replace -f calico.yaml
Or, for a fresh install:
kubectl apply -f calico.yaml
Before Calico is installed, the nodes show NotReady.
Get the node information to verify the installation succeeded, for example:
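A minimal check after applying the manifest (the pod names will differ on your cluster):
kubectl get nodes -o wide
kubectl get pods -n kube-system -o wide | grep -E 'calico|coredns'
# all calico-node and coredns pods should reach Running, and the nodes should turn Ready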
Joining worker nodes to the k8s cluster
i.e. adding hosts K2 and K3 to the cluster
Generate a token on the K1 master
# generate it on the K1 master
kubeadm token create --print-join-command
# create a token that never expires
kubeadm token create --ttl 0 --print-join-command
# run the command below on the k2 and k3 worker nodes to join the k8s cluster
The generated output (the non-expiring token) looks like this:
kubeadm join 192.168.241.128:6443 --token 5ajtxi.sx49u7jyygnmw0c4 --discovery-token-ca-cert-hash sha256:ada6bf229e93d346c4af69f953c96040c12c30b1f2b10eb2993052fbfaa48651
# if Docker (via cri-dockerd) is the container runtime, run this on K2 to join the cluster
kubeadm join 192.168.241.128:6443 --token iym0rs.q1at20rk4knx1b1x --discovery-token-ca-cert-hash sha256:e61313ccd385434aadf56b5fa2060e4ece95f442a6b64fbc54a8d522f98ce489 --cri-socket unix:///var/run/cri-dockerd.sock
The k8s cluster setup is now complete.
kubectl get nodes # run on the master
Re-initializing
# for a thorough cleanup around kubeadm reset, see Error 4 in this article
kubeadm reset
kubeadm init --apiserver-advertise-address=192.168.241.128 --token-ttl=0 --cri-socket unix:///run/cri-dockerd.sock
You can see in the CONTAINER-RUNTIME column that the container runtime has been changed:
[root@k1 lib]# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k1 Ready control-plane 67m v1.28.6 192.168.241.128 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://24.0.7
k2 Ready <none> 31m v1.28.6 192.168.241.129 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://24.0.7
k3 Ready <none> 29m v1.28.6 192.168.241.130 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://24.0.7
Deleting a k8s node
kubectl get nodes
NAME STATUS ROLES AGE VERSION
k1 Ready control-plane 25h v1.28.2
k2 NotReady <none> 24h v1.28.2
k3 NotReady <none> 24h v1.28.2
Delete the node
kubectl delete node k2
Wipe the data (on the node itself)
kubeadm reset
With cri-dockerd as the runtime, wipe the data with:
kubeadm reset --cri-socket=unix:///run/cri-dockerd.sock
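A fuller removal sequence, sketched from the standard kubeadm workflow (node k2 is just the example from above): evict the workloads first, then delete the node object, then reset the node itself.
# on the master: evict pods, then remove the node object
kubectl drain k2 --ignore-daemonsets --delete-emptydir-data
kubectl delete node k2
# on the removed node: wipe the kubeadm state (add --cri-socket when using cri-dockerd)
kubeadm reset --cri-socket=unix:///run/cri-dockerd.sock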
Installing Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# check the helm version
helm version
Common helm commands
# add a chart repository (the brigade repo as an example)
helm repo add brigade https://brigadecore.github.io/charts
# add the ingress-nginx repo
helm repo add ng https://kubernetes.github.io/ingress-nginx
# remove a repo
helm repo remove stable
# list configured repos
helm repo list
# list all versions of the ingress-nginx chart
helm search repo ingress-nginx/ingress-nginx -l
# search for ingress charts
helm search repo ingress
NAME CHART VERSION APP VERSION DESCRIPTION
ng/ingress-nginx 4.9.0 1.9.5 Ingress controller for Kubernetes using NGINX a...
# pull the chart; it is saved locally
helm pull ng/ingress-nginx
List releases in a namespace, including failed ones
helm -n ingress-nginx ls -a
Delete a failed release
helm -n ingress-nginx delete ingress-nginx
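For completeness, an install sketch using the ng repo added above; the release name, namespace, and the NodePort setting are example choices, not requirements:
helm install ingress-nginx ng/ingress-nginx --namespace ingress-nginx --create-namespace --set controller.service.type=NodePort
# check the release afterwards
helm status ingress-nginx -n ingress-nginx
kubectl get pods -n ingress-nginx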
Completely removing the Calico network plugin
# run on the master; make sure the calico.yaml used for installation is present
kubectl delete -f calico.yaml
# run on all nodes
# remove the tunnel interface; in IPIP mode it is named tunl0 (check with ifconfig)
modprobe -r ipip
# run on all nodes
# remove the Calico CNI configuration files
cd /etc/cni/net.d/ && rm -rf 10-calico.conflist calico-kubeconfig
# run on all nodes
systemctl restart kubelet
systemctl restart docker
Pod commands
Deleting pods
1. Delete a pod in its namespace
kubectl delete pod ingress-nginx-admission-patch-j4s7z -n ingress-nginx
View a pod's details and events
kubectl -n kube-system describe po calico-kube-controllers-7ddc4f45bc-gb7mp
2. View the deployment information
kubectl get deployment -n ingress-nginx
3. Delete the deployment
kubectl delete deployment ingress-nginx-controller -n ingress-nginx
4. Then delete the pod
kubectl delete pod ingress-nginx-controller-6fcf745c45-24k9b -n ingress-nginx
Namespace commands
Create a namespace
kubectl create namespace <name>
List all namespaces
kubectl get namespaces
# all pods in a namespace; here the namespace is kube-system
kubectl get pods -n kube-system
View all resources in a given namespace (ingress-nginx here)
kubectl get all -n ingress-nginx
Show the command for joining a node
kubeadm token create --print-join-command
# generate a token that never expires
kubeadm token create --ttl 0 --print-join-command
Remove a node from the k8s cluster
kubectl delete nodes k3
Wipe the data on that node
kubeadm reset
Schedule pods onto a specific node (see the sketch below)
kubectl label node master1 ingress=true
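The label by itself does nothing until a workload selects it. A sketch of a Deployment pinned to nodes carrying ingress=true (the name and image are placeholders):
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-on-ingress-node   # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      nodeSelector:
        ingress: "true"        # matches the label set above
      containers:
      - name: demo
        image: nginx:1.25      # placeholder image
EOF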
ERROR:
Error 1:
Error when node K2 joins
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR CRI]: container runtime is not running: output: time="2024-01-18T10:41:00-05:00" level=fatal msg="validate service connection: validate CRI v1 runtime API for endpoint \"unix:///var/run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
Solution:
On the master, restart containerd: systemctl restart containerd
On the node, remove the config (rm /etc/containerd/config.toml) and restart: systemctl restart containerd
After it joins, if the node does not become Ready, restart the node components on the master side as well
Error 2:
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
Run kubeadm reset on the node to wipe its data,
then join again.
Error 3: errors caused by rebooting the server
E0126 20:46:32.845126 7509 memcache.go:265] couldn't get current server API group list: Get "https://192.168.241.128:6443/api?timeout=32s": dial tcp 192.168.241.128:6443: connect: connection refused
E0126 20:46:32.845281 7509 memcache.go:265] couldn't get current server API group list: Get "https://192.168.241.128:6443/api?timeout=32s": dial tcp 192.168.241.128:6443: connect: connection refused
E0126 20:46:32.853384 7509 memcache.go:265] couldn't get current server API group list: Get "https://192.168.241.128:6443/api?timeout=32s": dial tcp 192.168.241.128:6443: connect: connection refused
E0126 20:46:32.853530 7509 memcache.go:265] couldn't get current server API group list: Get "https://192.168.241.128:6443/api?timeout=32s": dial tcp 192.168.241.128:6443: connect: connection refused
E0126 20:46:32.855105 7509 memcache.go:265] couldn't get current server API group list: Get "https://192.168.241.128:6443/api?timeout=32s": dial tcp 192.168.241.128:6443: connect: connection refused
The connection to the server 192.168.241.128:6443 was refused - did you specify the right host or port?
Troubleshooting:
lsof -i:6443 showed nothing listening on the port
The logs showed that the k1 node could not be registered
journalctl -u kubelet
Attempting to register node" node="k1"
Jan 26 21:01:54 k1 kubelet[2209]: E0126 21:01:54.712910 2209 kubelet_node_status.go:92] "Unable to register node with API server" err="Post \"https://192.168.241.128:6443/api/v1/nodes\": dial tcp 192.168.241.128:6443: connect: connection refused" node="k1"
Jan 26 21:01:55 k1 kubelet[2209]: E0126 21:01:55.677638 2209 controller.go:146] "Failed to ensure lease exists, will retry" err="Get \"https://192.168.241.128:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/k1?timeout=10s\": dial tcp 192.168.241.128:6443: connect: connection refused" interval="7s"
Jan 26 21:01:57 k1 kubelet[2209]: E0126 21:01:57.385561 2209 eviction_manager.go:258] "Eviction manager: failed to get summary stats" err="failed to get node info: node \"k1\" not found"
Jan 26 21:02:01 k1 kubelet[2209]: E0126 21:02:01.236048 2209 event.go:289] Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"kube-scheduler-k1.17adf043686e3b00", GenerateName:"", Namespace:"kube-system", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"kube-scheduler-k1", UID:"84a01f6320ea2c31160c7acf2d558c4c", APIVersion:"v1", ResourceVersion:"", FieldPath:""}, Reason:"SandboxChanged", Message:"Pod sandbox changed, it will be killed and re-created.", Source:v1.EventSource{Component:"kubelet", Host:"k1"}, FirstTimestamp:time.Date(2024, time.January, 26, 19, 46, 46, 148815616, time.Local), LastTimestamp:time.Date(2024, time.January, 26, 19, 46, 46, 148815616, time.Local), Count:1, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"kubelet", ReportingInstance:"k1"}': 'Post "https://192.168.241.128:6443/api/v1/namespaces/kube-system/events": dial tcp 192.168.241.128:6443: connect: connection refused'(may retry after sleeping)
docker ps -a showed that all the containers were down
Restart the Docker containers: docker restart <container id>
Restart the k8s service on all three machines and check it: systemctl status kubelet
At this point the service on port 6443 was up, but k1 still got connection refused
The error was cleared by running kubeadm reset and re-initializing
Error 4: errors when re-joining a node
Check the logs: journalctl -u kubelet
StopPodSandbox from runtime service failed" err="rpc error
On the master it looks like this, because no node has joined yet:
kube-system coredns-5dd5756b68-lqkm5 0/1 ContainerCreating 0 23m
kube-system coredns-5dd5756b68-s5czv 0/1 ContainerCreating 0 23m
Error when joining the node:
error execution phase kubelet-start: error uploading crisocket: Unauthorized
Check the node's logs
journalctl -u kubelet
Jan 26 22:52:48 k2 kubelet[3986]: E0126 22:52:48.392703 3986 file_linux.go:61] "Unable to read config path" err="path does not exist, ignoring" path="/etc
Solution: completely remove what kubeadm init left behind on the system
kubeadm reset
If cri-dockerd is used, the CRI socket must be specified.
On all nodes, wipe the kubeadm state and the files pulled for k8s:
kubeadm reset --cri-socket unix:///var/run/cri-dockerd.sock
# run on all three machines
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
systemctl stop kubelet
systemctl stop docker
rm -rf /root/.kube
rm -rf /var/lib/cni/*
rm -rf /var/lib/calico/
rm -rf /var/lib/cri-dockerd/sandbox
rm -rf /var/lib/kubelet/*
rm -rf /var/lib/etcd/
rm -rf /etc/cni/*
rm -rf /etc/kubernetes/pki
ifconfig docker0 down
ip link delete cni0
systemctl start docker
systemctl start cri-docker
systemctl status cri-docker
systemctl status docker
When this is done,
run kubeadm init again.
After re-joining, the nodes showed NotReady
kubectl describe node k1 | grep Taints
Taints: node.kubernetes.io/not-ready:NoSchedule
The pods stayed in the Pending state
[root@k1 ~]# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-7ddc4f45bc-gb7mp 0/1 Pending 0 3s
kube-system calico-node-fvj7h 1/1 Running 0 10m
kube-system calico-node-skgfb 1/1 Running 0 10m
kube-system calico-node-tcfjc 1/1 Running 0 10m
kube-system coredns-5dd5756b68-hsnwn 0/1 Pending 0 12m
kube-system coredns-5dd5756b68-vxw2b 0/1 Pending 0 12m
Check the calico pod's details
kubectl -n kube-system describe po calico-kube-controllers-7ddc4f45bc-gb7mp
It shows the pod cannot be scheduled (blocked by the not-ready taint)
Remove the taint from the node
kubectl taint nodes k1 node.kubernetes.io/not-ready-
# restart the kubelet, docker, and containerd services
systemctl restart kubelet
systemctl restart docker
systemctl restart containerd
Everything then shows as normal.
Error: possible causes of calico/coredns CrashLoopBackOff
The machines' time zones and clocks were found to be out of sync
cp /usr/share/zoneinfo/Asia/Dubai /etc/localtime
systemctl restart ntpdate
After the machines' clocks were synced, calico and coredns recovered.
Error 5: Calico goes into CrashLoopBackOff from time to time
Inspect the failing calico pod
kubectl -n kube-system describe po calico-kube-controllers-7ddc4f45bc-bvzf2
kubelet Back-off restarting failed container calico-kube-controllers in pod calico-kube-controllers-
kubelet Readiness probe failed: Error initializing datastore: Get "https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.96.0.1:443: i/o timeout
Solution: tear the cluster down and reinstall. The root cause was that kubeadm init was run without --service-cidr=10.96.0.0/12 --pod-network-cidr=10.244.0.0/16, and when installing calico.yaml the CALICO_IPV4POOL_CIDR
value ("10.244.0.0/16") was not set to match the Pod network.
Error:
Found multiple CRI endpoints on the host. Please define which one do you wish to use by setting the 'criSocket' field in the kubeadm configuration file: unix:///var/run/containerd/containerd.sock, unix:///var/run/cri-dockerd.sock
To see the stack trace of this error execute with --v=5 or higher
Solution: the command did not specify which CRI to use; add --cri-socket unix:///var/run/cri-dockerd.sock
Use -v=2 (or higher) to see the detailed error
Error: failure when joining a node
[discovery] Failed to request cluster-info, will try again: Get "https://192.168.241.128:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": dial tcp 192.168.241.128:6443: connect: no route to host
I0131 09:45:19.026519 8529 token.go:217] [discovery] Failed to request cluster-info, will try again: Get "https://192.168.241.128:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": dial tcp 192.168.241.128:6443: connect: no route to host
Solution: port 6443 on the k1 master was unreachable, blocked by firewall rules; open port 6443:
firewall-cmd --permanent --add-port=6443/tcp
firewall-cmd --reload
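If firewalld has to stay enabled, 6443 is not the only port the cluster needs. A sketch based on the ports a kubeadm cluster normally uses (run the first block on the k1 control plane, the second on the k2/k3 workers); Calico's IPIP tunnel traffic (IP protocol 4) must also be allowed between the nodes:
# control-plane node
firewall-cmd --permanent --add-port=6443/tcp --add-port=2379-2380/tcp --add-port=10250/tcp --add-port=10257/tcp --add-port=10259/tcp
# worker nodes
firewall-cmd --permanent --add-port=10250/tcp --add-port=30000-32767/tcp
firewall-cmd --reload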
Error: failure when joining a node
Failed to request cluster-info, will try again: Get "https://192.168.241.128:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2024-01-31T10:53:06+08:00 is before 2024-01-31T13:42:55Z
This means the joining node's clock is behind the master's; sync the node's time (see the time-synchronization section above) and join again.
Error: version mismatch (k8s 1.28.2 with Calico 3.20.1)
error: resource mapping not found for name: "calico-kube-controllers" namespace: "kube-system" from "calico.yaml": no matches for kind "PodDisruptionBudget" in version "policy/v1beta1"
Use a newer Calico release (for example 3.26.x, as above) whose manifests use policy/v1.
Error: one of the calico-node pods stays at 0/1 for a long time
NAMESPACE NAME READY STATUS RES
kube-system calico-kube-controllers-7ddc4f45bc-l8fgm 1/1 Running 0 3h59m
kube-system calico-node-97jxl 1/1 Running 0 3h59m
kube-system calico-node-kxx2g 0/1 Running 1 (3h42m ago) 3h44m
kube-system calico-node-swf7r 1/1 Running 0
After adding a new node, none of the calico-node pods are in the Ready state
kube-system calico-kube-controllers-7ddc4f45bc-l8fgm 1/1 Running 0 22h
kube-system calico-node-97jxl 0/1 Running 0 22h
kube-system calico-node-kxx2g 0/1 Running 1 (22h ago) 22h
kube-system calico-node-swf7r 0/1 Running 0 22h
kube-system coredns-5dd5756b68-2w6gl 1/1 Running 0 23h
kube-system coredns-5dd5756b68-b6sc8 1/1 Running 0 23h
Solution:
cri-dockerd errors
Error 1: when running kubeadm reset
Found multiple CRI endpoints on the host. Please define which one do you wish to use by setting the 'criSocket' field in the kubeadm configuration file: unix:///var/run/containerd/containerd.sock, unix:///var/run/cri-dockerd.sock
To see the stack trace of this error execute with --v=5 or higher
systemctl stop containerd
systemctl stop kubelet
systemctl stop docker
Then run kubeadm reset again.