Pod stuck in Terminating

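This post looks at a Pod in a Kubernetes cluster that stays in the Terminating state for a long time. It analyzes possible causes, including a kernel problem, and shows how to diagnose the issue by checking events, logs, and system processes. The troubleshooting steps are shown in full, and rebooting the physical machine is currently the only known fix.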

Problem

One pod in the cluster stays in the Terminating state indefinitely.

Events

Normal Scheduled 1h default-scheduler Successfully assigned feed-426565da19777e5d325f-5994dc5cff-znqmh to node01
Normal SuccessfulMountVolume 1h kubelet, node01 (combined from similar events): MountVolume.SetUp succeeded for volume "lvm"
Normal Pulled 1h kubelet, node01 Container image  already present on machine
Normal Created 1h kubelet, node01 Created container
Normal Started 1h kubelet, node01 Started container
Warning Unhealthy 9m (x44 over 1h) kubelet, node01 Liveness probe failed: Get http://*:65318/state.json: dial tcp *.*.*.*:65318: getsockopt: connection refused
Warning Unhealthy 9m (x45 over 1h) kubelet, node01 Readiness probe failed: Get http://*:65318/state.json: dial tcp *.*.*.*:65318: getsockopt: connection refused
Normal Killing 9m kubelet, node01 Killing container with id docker://main:Need to kill Pod
Warning FailedKillPod 5m (x2 over 7m) kubelet, node01 error killing pod: failed to "KillPodSandbox" for "163f99a9-1aec-11e9-a7cd-246e96ab9970" with KillPodSandboxError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded"

Logs

kubelet: error killing pod: failed to "KillPodSandbox" for "163f99a9-1aec-11e9-a7cd-246e96ab9970" with KillPodSandboxError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded"
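
When several pods are affected, it can help to pull the pod UIDs out of the kubelet log rather than reading it line by line. The following is a minimal sketch, not part of the original investigation: the function name failed_sandbox_uids is hypothetical, and the usage line assumes kubelet runs as a systemd unit named kubelet.

```shell
# Read kubelet log lines on stdin and print the unique pod UIDs
# for which KillPodSandbox failed.
failed_sandbox_uids() {
    sed -n 's/.*failed to "KillPodSandbox" for "\([^"]*\)".*/\1/p' | sort -u
}

# Usage on a node (assumes a systemd unit called "kubelet"):
#   journalctl -u kubelet | failed_sandbox_uids
```

The UID printed here is the same one that appears in the FailedKillPod event and in the name of the pause container.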

Investigation

Inspect the processes

ps aux |grep D # find processes that cannot be killed (STAT D, uninterruptible sleep)
root 2626 0.0 0.0 0 0 ? Ds 14:42 0:00 [pause]
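
Note that grep D matches any line containing a capital D, for example a command line with a -D flag, so the result usually needs to be read by eye. A stricter filter can test only the STAT column. This is a sketch under the assumption that ps is the procps version and is asked for the columns pid, ppid, stat, comm; the function name filter_dstate is made up for illustration.

```shell
# Keep only rows whose third field (STAT in the column layout below)
# starts with "D", i.e. uninterruptible sleep.
filter_dstate() {
    awk '$3 ~ /^D/'
}

# Usage on a live node (output depends on the host):
#   ps -eo pid,ppid,stat,comm | filter_dstate
```

Because the PPID column is requested directly, the parent of each D-state process is visible in the same row.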

ps afx |grep -C 10 2626 # show the parent process
root 2626 2603 0 14:42 ? 00:00:00 [pause]

ps -ef |grep 2603 # inspect the parent process
root 2603 27573 0 14:42 ? 00:00:00 docker-containerd-shim -namespace moby -workdir /home/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/ba519a9f1a1102a922bcc74ced7a7fc9fd3f963feea4b8de

ps -ef |grep 27573
root 27573 27553 0 2018 ? 17:02:40 docker-containerd --config /var/run/docker/containerd/containerd.toml

ps -ef |grep 27553
root 27553 1 3 2018 ? 3-16:03:11 /usr/bin/dockerd --bip=10.126.64.193/26 --mtu=1500 -g /home/docker -D -H tcp://127.0.0.1:1983 -H unix:///var/run/docker.sock --tlsverify --iptables=false --storage-driver=devicemapper --storage-opt dm.override_udev_sync_check=true --storage-opt dm.datadev=/dev/vg_root/dmdata --storage-opt dm.metadatadev=/dev/vg_root/dmmeta --exec-opt native.cgroupdriver=cgroupfs
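
The chain traced by hand above (pause, docker-containerd-shim, docker-containerd, dockerd, PID 1) can also be walked in a loop. The sketch below is one possible way to do it on Linux by reading /proc/<pid>/status; the function name parent_chain is hypothetical and the approach assumes /proc is mounted.

```shell
# Print "pid ppid name" for the given PID and each of its ancestors,
# following the PPid field in /proc/<pid>/status until it reaches 0.
parent_chain() {
    pid=$1
    while [ -n "$pid" ] && [ "$pid" -gt 0 ] && [ -r "/proc/$pid/status" ]; do
        name=$(awk '/^Name:/ {print $2}' "/proc/$pid/status")
        ppid=$(awk '/^PPid:/ {print $2}' "/proc/$pid/status")
        printf '%s %s %s\n' "$pid" "$ppid" "$name"
        pid=$ppid
    done
}

# Usage with the PID from this post (only meaningful on the affected node):
#   parent_chain 2626
```

Reading /proc avoids running grep over the full process list for every hop, which also avoids false matches on PIDs that appear inside other command lines.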

docker ps |grep ba519a9f1a1 # look up the container in docker
ba519a9f1a11 k8s.gcr.io/pause-amd64:3.1 "/pause" 2 hours ago Up 2 hours k8s_POD_feed-426565da19777e5d325f-5994dc5cff-znqmh_ocean-feed_163f99a9-1aec-11e9-a7cd-246e96ab9970_0
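
The container name in the last column ties the stuck process back to the pod. With the Docker runtime, kubelet names containers in the form k8s_<container>_<pod>_<namespace>_<pod UID>_<attempt>, so the fields can be split on underscores. The helper below is an illustrative sketch (the name parse_k8s_name is hypothetical) and assumes that naming pattern holds on the node in question.

```shell
# Split a kubelet-managed Docker container name into its parts.
# Prints nothing if the name does not have the expected six fields.
parse_k8s_name() {
    printf '%s\n' "$1" | awk -F_ 'NF == 6 && $1 == "k8s" {
        printf "container=%s pod=%s namespace=%s uid=%s attempt=%s\n", $2, $3, $4, $5, $6
    }'
}

# Usage with the name shown above:
#   parse_k8s_name k8s_POD_feed-426565da19777e5d325f-5994dc5cff-znqmh_ocean-feed_163f99a9-1aec-11e9-a7cd-246e96ab9970_0
```

Here the container field is POD, which marks the pause (sandbox) container, and the UID matches the one in the KillPodSandbox error.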

Possible cause

This may be a kernel issue:
https://stackoverflow.com/questions/34552232/cant-kill-processes-originating-in-a-docker-container
So far, only rebooting the physical machine resolves it.
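
Before rebooting, it may be worth recording where in the kernel the process is blocked, since that is the kind of detail a kernel bug report needs. The sketch below reads the process state and wait channel from /proc; the function name blocked_in is hypothetical, and /proc/<pid>/wchan may show 0 or be unreadable depending on the kernel configuration and permissions.

```shell
# Print the scheduler state and kernel wait channel of a process.
# Falls back to "unknown" when a value cannot be read.
blocked_in() {
    pid=$1
    state=$(awk '/^State:/ {print $2}' "/proc/$pid/status" 2>/dev/null)
    wchan=$(cat "/proc/$pid/wchan" 2>/dev/null)
    printf 'pid=%s state=%s wchan=%s\n' "$pid" "${state:-unknown}" "${wchan:-unknown}"
}

# Usage with the PID from this post (only meaningful on the affected node):
#   blocked_in 2626
```

A process in state D does not act on signals, including SIGKILL, until the kernel operation it is waiting on returns, which is consistent with the pause process here surviving the kill attempts.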
