kubelet "Error getting node": request for help

Symptoms

Running kubectl on the master node reports that no resources exist:

[root@k8s-master01 cni]# kubectl get nodes
No resources found
[root@k8s-master01 cni]# kubectl get node
No resources found

Looking for the cause

The kubelet service shows as active (running), yet it keeps logging the error E0506 03:15:55.794029 4646 kubelet.go:2461] "Error getting node" err="node \"k8s-master01\" not found"

[root@k8s-master01 calico]# systemctl status kubelet -l
● kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2022-05-06 03:15:50 EDT; 5s ago
     Docs: https://github.com/kubernetes/kubernetes
 Main PID: 4646 (kubelet)
    Tasks: 12
   Memory: 24.7M
   CGroup: /system.slice/kubelet.service
           └─4646 /usr/local/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.kubeconfig --kubeconfig=/etc/kubernetes/kubelet.kubeconfig --config=/etc/kubernetes/kubelet-conf.yml --network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin --container-runtime=remote --runtime-request-timeout=15m --container-runtime-endpoint=unix:///run/containerd/containerd.sock --cgroup-driver=systemd --node-labels=node.kubernetes.io/node=''

May 06 03:15:54 k8s-master01 kubelet[4646]: E0506 03:15:54.988785    4646 kubelet.go:2461] "Error getting node" err="node \"k8s-master01\" not found"
May 06 03:15:55 k8s-master01 kubelet[4646]: E0506 03:15:55.089274    4646 kubelet.go:2461] "Error getting node" err="node \"k8s-master01\" not found"
May 06 03:15:55 k8s-master01 kubelet[4646]: E0506 03:15:55.147196    4646 kubelet_node_status.go:92] "Unable to register node with API server" err="Node \"k8s-master01\" is invalid: metadata.labels: Invalid value: \"''\": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue',  or 'my_value',  or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')" node="k8s-master01"
May 06 03:15:55 k8s-master01 kubelet[4646]: E0506 03:15:55.189399    4646 kubelet.go:2461] "Error getting node" err="node \"k8s-master01\" not found"
May 06 03:15:55 k8s-master01 kubelet[4646]: E0506 03:15:55.290355    4646 kubelet.go:2461] "Error getting node" err="node \"k8s-master01\" not found"
May 06 03:15:55 k8s-master01 kubelet[4646]: E0506 03:15:55.390631    4646 kubelet.go:2461] "Error getting node" err="node \"k8s-master01\" not found"
May 06 03:15:55 k8s-master01 kubelet[4646]: E0506 03:15:55.491190    4646 kubelet.go:2461] "Error getting node" err="node \"k8s-master01\" not found"
May 06 03:15:55 k8s-master01 kubelet[4646]: E0506 03:15:55.592451    4646 kubelet.go:2461] "Error getting node" err="node \"k8s-master01\" not found"
May 06 03:15:55 k8s-master01 kubelet[4646]: E0506 03:15:55.693180    4646 kubelet.go:2461] "Error getting node" err="node \"k8s-master01\" not found"
May 06 03:15:55 k8s-master01 kubelet[4646]: E0506 03:15:55.794029    4646 kubelet.go:2461] "Error getting node" err="node \"k8s-master01\" not found"
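
The "Error getting node" line repeats about every 100 ms and can bury the one entry that explains the failure (here, "Unable to register node with API server"). A minimal sketch for filtering it out, assuming the journal text is piped in on stdin; the function name `kubelet_root_errors` is made up for illustration:

```shell
# Hypothetical helper: read kubelet journal lines on stdin and print only
# klog error lines (E<MMDD> <time> ...) other than the repeated symptom line.
kubelet_root_errors() {
    grep -E 'E[0-9]{4} [0-9:.]+ ' | grep -v '"Error getting node"'
}

# Typical use on the affected host (not run here):
#   journalctl -u kubelet --no-pager | kubelet_root_errors
```

Applied to the log above, only the kubelet_node_status.go:92 entry would remain.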

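Initial assessment: some configuration most likely contains an extra pair of quotes, and the quotes are being read as part of the value. The "Unable to register node with API server" entry in the log above points the same way: the label value was rejected as Invalid value: "''".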
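
To see why a value made of two literal quote characters is rejected while a truly empty value is accepted, the regular expression quoted in the registration error can be tried directly. This is only a sketch: the helper name `is_valid_label_value` is hypothetical, the pattern is copied from the log message and anchored, and the API server's separate length limit on label values is not checked.

```shell
# Regex copied from the "Unable to register node" error in the log, anchored
# so that the whole value has to match.
LABEL_VALUE_RE='^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$'

# Hypothetical helper: succeeds when $1 would pass that pattern.
is_valid_label_value() {
    printf '%s\n' "$1" | grep -Eq "$LABEL_VALUE_RE"
}

is_valid_label_value ""   && echo "empty string: valid"
is_valid_label_value "''" || echo "two literal quote characters: invalid"
```

So the label is only a problem when the quotes reach the kubelet unchanged, which is what the "Invalid value" part of the error suggests happened here.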

Ruling out the answers found online

  1. The IP addresses and host names have not changed and are configured correctly

    [root@k8s-master01 cni]# cat /etc/hosts
    127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
    192.168.230.81 k8s-master01
    192.168.230.82 k8s-master02
    192.168.230.83 k8s-master03
    192.168.230.84 k8s-node01
    192.168.230.85 k8s-node02
    192.168.230.80 lb01
    192.168.230.90 lb02
    192.168.230.89 lb-vip
    
    
  2. The sandbox image is already set to the Alibaba Cloud mirror, not the Google registry

    [root@k8s-master01 cni]# cat /etc/containerd/config.toml | grep sandbox_image
        sandbox_image = "registry.cn-hangzhou.aliyuncs.com/chenby/pause:3.
    
  3. Some people also said Calico was not installed and should be installed first... emmmm
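
The first two checks in the list above can be repeated with small helpers. This is a sketch under assumptions: the function names are invented for illustration, and each takes the file to inspect as its first argument so it can be tried on a copy.

```shell
# Hypothetical helpers; both take the file to inspect as $1.

# Print the address that a hosts file maps host name $2 to (comment lines skipped).
hosts_lookup() {
    awk -v h="$2" '
        $1 ~ /^#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == h) { print $1; exit } }
    ' "$1"
}

# Print the sandbox_image value from a containerd config.toml.
sandbox_image_of() {
    sed -n 's/^[[:space:]]*sandbox_image[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Typical use on the affected host (not run here):
#   hosts_lookup /etc/hosts "$(hostname)"
#   sandbox_image_of /etc/containerd/config.toml
```

An empty result from `hosts_lookup` would mean the node name has no entry in the hosts file; here both checks came back clean, which is why they were ruled out.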

Comments:

After several days of digging I found the cause: --node-labels=node.kubernetes.io/node='' is, for some reason, not recognized on CentOS 7.
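
One possible workaround, which is an assumption and not something the thread confirms, is to drop the two quote characters so that the label value really is empty. A minimal sketch; the helper name is hypothetical and it edits whatever unit file it is given, keeping a .bak copy:

```shell
# Hypothetical helper: strip a literal '' that follows the last "=" of the
# --node-labels argument in the unit file given as $1. A .bak copy is kept.
strip_quoted_empty_label() {
    sed -i.bak "s/\(--node-labels=[^ ]*=\)''/\1/" "$1"
}

# Typical use on the affected host (not run here; the unit path is the one
# shown in the systemctl output earlier in the post):
#   strip_quoted_empty_label /usr/lib/systemd/system/kubelet.service
#   systemctl daemon-reload && systemctl restart kubelet
#   kubectl get nodes
```

Whether the node then registers would still need to be checked in the kubelet log.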



https://github.com/cby-chen/Kubernetes

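Installing a Kubernetes cluster on CentOS 7 is not recommended; CentOS 8 is suggested instead!

On CentOS 7 the kubelet misbehaves and fails to handle the --node-labels field; the cause is currently unknown.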
 

[root@k8snode01-49 bin]# systemctl status kubelet # check the running status
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since 三 2025-06-25 17:07:03 CST; 7min ago
     Docs: https://kubernetes.io/docs/
 Main PID: 15576 (kubelet)
    Tasks: 14
   Memory: 30.0M
   CGroup: /system.slice/kubelet.service
           └─15576 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --contai...

6月 25 17:14:25 k8snode01-49 kubelet[15576]: E0625 17:14:25.611440 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:25 k8snode01-49 kubelet[15576]: E0625 17:14:25.712048 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:25 k8snode01-49 kubelet[15576]: E0625 17:14:25.812741 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:25 k8snode01-49 kubelet[15576]: E0625 17:14:25.913020 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.024133 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.124389 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.224745 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.325352 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.426141 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.526814 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
[root@k8snode01-49 bin]# journalctl -xeu kubelet
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.325352 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.426141 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.526814 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.635436 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.736500 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.837486 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.938137 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.997608 15576 remote_runtime.go:222] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.997657 15576 kuberuntime_sandbox.go:71] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to get
6月 25 17:14:26 k8snode01-49 kubelet[15576]: E0625 17:14:26.997677 15576 kuberuntime_manager.go:772] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to get
6月 25 17:14:27 k8snode01-49 kubelet[15576]: E0625 17:14:26.997721 15576 pod_workers.go:965] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"kube-controller-manage
6月 25 17:14:27 k8snode01-49 kubelet[15576]: E0625 17:14:27.039080 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:27 k8snode01-49 kubelet[15576]: E0625 17:14:27.139677 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:27 k8snode01-49 kubelet[15576]: E0625 17:14:27.240528 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:27 k8snode01-49 kubelet[15576]: E0625 17:14:27.340706 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:27 k8snode01-49 kubelet[15576]: E0625 17:14:27.441376 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:27 k8snode01-49 kubelet[15576]: E0625 17:14:27.542175 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:27 k8snode01-49 kubelet[15576]: E0625 17:14:27.642549 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:27 k8snode01-49 kubelet[15576]: E0625 17:14:27.743424 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:27 k8snode01-49 kubelet[15576]: E0625 17:14:27.844051 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:27 k8snode01-49 kubelet[15576]: E0625 17:14:27.944536 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:28 k8snode01-49 kubelet[15576]: E0625 17:14:28.045155 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:28 k8snode01-49 kubelet[15576]: E0625 17:14:28.145533 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:28 k8snode01-49 kubelet[15576]: E0625 17:14:28.245835 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:28 k8snode01-49 kubelet[15576]: E0625 17:14:28.346226 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:28 k8snode01-49 kubelet[15576]: E0625 17:14:28.446390 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:28 k8snode01-49 kubelet[15576]: E0625 17:14:28.547045 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:28 k8snode01-49 kubelet[15576]: E0625 17:14:28.647785 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:28 k8snode01-49 kubelet[15576]: E0625 17:14:28.748016 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:28 k8snode01-49 kubelet[15576]: E0625 17:14:28.848328 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:28 k8snode01-49 kubelet[15576]: E0625 17:14:28.949244 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:29 k8snode01-49 kubelet[15576]: E0625 17:14:29.031502 15576 kubelet.go:2373] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotRea
6月 25 17:14:29 k8snode01-49 kubelet[15576]: E0625 17:14:29.049821 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:29 k8snode01-49 kubelet[15576]: E0625 17:14:29.151914 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:29 k8snode01-49 kubelet[15576]: E0625 17:14:29.253839 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:29 k8snode01-49 kubelet[15576]: E0625 17:14:29.354434 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
6月 25 17:14:29 k8snode01-49 kubelet[15576]: E0625 17:14:29.455443 15576 kubelet.go:2448] "Error getting node" err="node \"k8snode01-49\" not found"
06-26
### Kubernetes cluster initialization timeout: analysis and fixes

Your `kubeadm init` failure involves two key errors:

1. **Control plane startup timeout**: `timed out waiting for the condition`
2. **kubelet errors**: `Error getting node` and `Failed to create sandbox for pod`

A systematic way to work through them follows.

---

### I. Core diagnosis

#### 1. **Check kubelet status and logs**

```bash
systemctl status kubelet   # check the service status
journalctl -u kubelet -f --since "5 minutes ago" | grep -i -E "error|failed"   # filter for key errors
```

Pay particular attention to:

- `Failed to create sandbox for pod` → container runtime problem
- `Error getting node` → node registration failed
- `cgroup`-related errors → system configuration problem

#### 2. **Check the container runtime**

```bash
# Docker runtime
sudo systemctl status docker
docker info | grep -i cgroup   # confirm the cgroup driver

# containerd runtime
sudo systemctl status containerd
containerd config dump | grep cgroup
```

#### 3. **Check the control plane components**

```bash
# List components that are not running
crictl ps -a | grep -E 'kube-apiserver|kube-controller|kube-scheduler|etcd' | grep -v Running

# Inspect a component's logs
crictl logs <container ID>   # replace with the ID of the failing container
```

---

### II. Root causes and fixes

#### 🐋 Cause 1: incompatible container runtime configuration (most common)

**Symptom**: `Failed to create sandbox for pod`
**Fix**: make the kubelet and the container runtime use the same cgroup driver

```bash
# Step 1: change the Docker configuration (if Docker is in use)
sudo tee /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF

# Step 2: change the kubelet configuration
sudo tee /etc/sysconfig/kubelet <<EOF
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd"
EOF

# Step 3: restart the services
sudo systemctl daemon-reload
sudo systemctl restart docker kubelet
```

#### 🔥 Cause 2: leftover configuration conflicts

**Symptom**: `Using existing` certificate messages
**Fix**: clean out the old configuration completely

```bash
sudo kubeadm reset -f
sudo rm -rf /etc/kubernetes/ /var/lib/etcd /var/lib/kubelet
sudo iptables -F && sudo iptables -t nat -F
sudo ipvsadm --clear
```

#### 🌐 Cause 3: CNI plugins not installed beforehand

**Symptom**: control plane components cannot communicate
**Fix**: **install the CNI plugins first, then initialize**

```bash
# Install the containernetworking plugins
CNI_VERSION="v1.3.0"
sudo mkdir -p /opt/cni/bin
curl -L "https://github.com/containernetworking/plugins/releases/download/${CNI_VERSION}/cni-plugins-linux-amd64-${CNI_VERSION}.tgz" | sudo tar -C /opt/cni/bin -xz
```

#### 🔒 Cause 4: firewall or port conflicts

```bash
# Open the required ports
sudo firewall-cmd --permanent --add-port={6443,2379-2380,10250,10251,10252}/tcp
sudo firewall-cmd --reload

# Check which ports are already in use
sudo ss -tulnp | grep -E '6443|2379'
```

---

### III. A safe initialization procedure

```bash
# 1. Clean the environment
sudo kubeadm reset -f
sudo rm -rf /etc/kubernetes /var/lib/etcd

# 2. Install the CNI plugins (required!)
sudo mkdir -p /opt/cni/bin   # make sure the target directory exists
curl -L https://github.com/containernetworking/plugins/releases/download/v1.3.0/cni-plugins-linux-amd64-v1.3.0.tgz | sudo tar -C /opt/cni/bin -xz

# 3. Create the initialization config file
cat <<EOF | tee kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  criSocket: "unix:///var/run/containerd/containerd.sock"   # adjust to the runtime actually in use
  kubeletExtraArgs:
    cgroup-driver: "systemd"
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kubernetesVersion: v1.25.0
controlPlaneEndpoint: "cluster-endpoint:6443"
networking:
  podSubnet: "172.20.0.0/16"
  serviceSubnet: "10.96.0.0/12"
EOF

# 4. Initialize the cluster
sudo kubeadm init --config=kubeadm-config.yaml --upload-certs --v=5
```

---

### IV. Key steps after initialization

```bash
# 1. Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# 2. Install a network plugin (Calico as an example)
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml

# 3. Verify node status
watch kubectl get nodes -o wide   # wait for the status to become Ready
```

> **Key tip**: 90% of initialization timeouts are caused by a cgroup driver mismatch or missing CNI plugins[^2]. Using a config file avoids mistakes in passing arguments.

---

### Related questions

1. How can the specific cause of a Kubernetes node's `NotReady` status be diagnosed?
2. How can a CRI interface incompatibility between `kubelet` and the container runtime be fixed?
3. How can an offline image registry be configured for Kubernetes to speed up deployment?
4. How can `IPAM` address allocation errors caused by the Calico network plugin be resolved?
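
The cgroup driver comparison that the reply recommends can be done against the config files instead of by eye. This is a rough sketch with invented helper names. It assumes containerd, whose runc options use `SystemdCgroup = true`, and a KubeletConfiguration file with a `cgroupDriver:` field; a `--cgroup-driver` flag on the kubelet command line is not taken into account.

```shell
# Hypothetical helpers; file paths are passed in so they can be tried on copies.

# Print "systemd" if the containerd config enables SystemdCgroup, else "cgroupfs".
containerd_cgroup_driver() {
    if grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$1"; then
        echo systemd
    else
        echo cgroupfs
    fi
}

# Print the cgroupDriver value from a KubeletConfiguration YAML file
# (prints nothing when the field is absent).
kubelet_cgroup_driver() {
    sed -n 's/^cgroupDriver:[[:space:]]*"\{0,1\}\([A-Za-z]*\)"\{0,1\}[[:space:]]*$/\1/p' "$1"
}

# Succeed only when both files name the same driver.
cgroup_drivers_match() {
    [ "$(containerd_cgroup_driver "$1")" = "$(kubelet_cgroup_driver "$2")" ]
}

# Typical use on the affected host (not run here):
#   cgroup_drivers_match /etc/containerd/config.toml /var/lib/kubelet/config.yaml \
#       || echo "cgroup driver mismatch"
```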