While setting up a k8s cluster, a pod kept reporting the following error:
Warning FailedCreatePodSandBox 2s (x9 over 111s) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "472ad4eae0c5e30f91b314adcea4c35fc9cfb710fcb4112b8d073fe93c65e3d6": plugin type="flannel" failed (add): failed to set bridge addr: "cni0" already has an IP address different from 10.244.3.1/24
On the affected node I ran:
cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.3.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
ip addr show cni0
cni0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 5a:0a:92:40:22:da brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.1/24 brd 10.244.1.255 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::580a:92ff:fe40:22da/64 scope link
       valid_lft forever preferred_lft forever
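The output shows that the address on the cni0 bridge (10.244.1.1/24) does not match the subnet Flannel assigned to this node (10.244.3.1/24). The CNI plugin will not change the address of an existing bridge, so every sandbox creation fails with the error above. The mismatch means stale network configuration is left on the node, probably from a node reboot, a Flannel configuration change that was not cleaned up properly, or a manual mistake. The steps below fix it.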
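The comparison done by eye above can be scripted. The following is a minimal sketch, assuming the usual one-variable-per-line layout of /run/flannel/subnet.env and the single-line output of `ip -o -4 addr show`. The function names (flannel_subnet, first_ipv4_cidr, compare_subnet) are made up for this example and are not part of Flannel or iproute2.

```shell
#!/usr/bin/env bash
# Sketch: check whether cni0 carries the address Flannel expects.
# All function names here are illustrative helpers, not real tools.

# Print the FLANNEL_SUBNET value from a subnet.env-style file ($1).
flannel_subnet() {
    sed -n 's/^FLANNEL_SUBNET=//p' "$1"
}

# Read `ip -o -4 addr show dev <iface>` output on stdin and print
# the first IPv4 address with its prefix length (field 4).
first_ipv4_cidr() {
    awk '$3 == "inet" { print $4; exit }'
}

# Compare expected ($1) and actual ($2) addresses.
# Returns 0 when they are equal, 1 otherwise.
compare_subnet() {
    local expected="$1" actual="$2"
    if [ "$expected" = "$actual" ]; then
        echo "OK: cni0 matches $expected"
    else
        echo "MISMATCH: flannel expects $expected, cni0 has ${actual:-no address}"
        return 1
    fi
}
```

On the node it could be used like this:

compare_subnet "$(flannel_subnet /run/flannel/subnet.env)" "$(ip -o -4 addr show dev cni0 | first_ipv4_cidr)"

With the values shown above it would report a mismatch between 10.244.3.1/24 and 10.244.1.1/24.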
1. Force-delete the affected Flannel pod (kube-flannel-ds-gmcht is the pod name in this cluster; substitute your own)
kubectl -n kube-flannel delete pod kube-flannel-ds-gmcht --grace-period=0 --force
2. Clean up the stale configuration (run on the affected k8s node)
# Make sure kubelet is stopped
sudo systemctl stop kubelet
# Delete the cni0 bridge (if it exists)
sudo ip link delete cni0 2>/dev/null
# Clear the CNI cache and network namespaces
sudo rm -rf /var/lib/cni/*
sudo rm -rf /var/run/netns/*
# Start kubelet again
sudo systemctl start kubelet
3. Wait for the Flannel pod to be recreated automatically
kubectl -n kube-flannel get pods -o wide | grep k8snode1
4. Check that the node and its pods have recovered
kubectl get node k8snode1 -o wide
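To confirm that pods on the node are actually being created again, it can help to list any that are still stuck. A minimal sketch follows, assuming the column layout of `kubectl get pods -A --no-headers` (NAMESPACE, NAME, READY, STATUS, ...). The helper name pods_not_ready is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list pods that are neither Running nor Completed.
# pods_not_ready is an illustrative helper name, not a kubectl command.

# Read `kubectl get pods -A --no-headers` output on stdin and print
# namespace/name for every pod whose STATUS column (field 4) is
# something other than Running or Completed.
pods_not_ready() {
    awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 }'
}
```

Example usage for the node from this article:

kubectl get pods -A --no-headers --field-selector spec.nodeName=k8snode1 | pods_not_ready

Empty output would suggest that every pod scheduled on k8snode1 got its sandbox. Any pod still in ContainerCreating is worth checking with kubectl describe pod for a repeated FailedCreatePodSandBox event.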