Kubernetes version: v1.20.5
The cluster was deployed with sealos; cri-docker version: 0.2.5
Mar 4 10:04:32 [localhost] kubelet: I0304 10:04:32.115116 20375 event.go:291] "Event occurred" object="ops/test-5b5c66869b-t89rt" kind="Pod" apiVersion="v1" type="Normal" reason="Pulling" message="Pulling image \"imagexxxxxxxxxxx:tag\""
(some log lines omitted here)
Mar 4 10:06:31 [localhost] cri-dockerd: time="2025-03-04T10:06:31+08:00" level=info msg="Stop pulling image imagexxxxxxxxxxx:tag: Extracting [==========================================> ] 275.2MB/324.4MB"
Mar 4 10:06:31 [localhost] dockerd: time="2025-03-04T10:06:31.375736105+08:00" level=error msg="Not continuing with pull after error: context canceled"
Mar 4 10:06:31 [localhost] kubelet: E0304 10:06:31.376130 20375 remote_image.go:113] PullImage "imagexxxxxxxxxxx:tag" from image service failed: rpc error: code = Unknown desc = context deadline exceeded
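The fix commonly suggested online, adding the --runtime-request-timeout flag to kubelet, was tried and did not solve the problem in my case.
Cause:
kubelet passes its runtime-request-timeout setting along when it calls the container runtime interface, but cri-docker 0.2.5 does not honor it for image pulls, so the kubelet flag has no effect. As the diff further down shows, 0.2.5 wraps the pull in context.WithTimeout using its own d.timeout instead.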
Solutions:
Option 1: upgrade cri-docker to 0.2.6. Tested and confirmed to fix the problem.
cri-docker source: https://github.com/Mirantis/cri-dockerd. The fix landed in commit 460da8ef84e7d2781ee275907543764b6a77c5ff, which is included in v0.2.6.
commit 460da8ef84e7d2781ee275907543764b6a77c5ff
Author: evol262 <87092321+evol262@users.noreply.github.com>
Date: Wed Aug 17 05:37:00 2022 -0400
Use context.WithCancel for potentially long operations
diff --git a/libdocker/kube_docker_client.go b/libdocker/kube_docker_client.go
index 1da13e7..b1ec287 100644
--- a/libdocker/kube_docker_client.go
+++ b/libdocker/kube_docker_client.go
@@ -397,7 +397,7 @@ func (d *kubeDockerClient) PullImage(
return err
}
opts.RegistryAuth = base64Auth
- ctx, cancel := context.WithTimeout(context.Background(), d.timeout)
+ ctx, cancel := context.WithCancel(context.Background())
defer cancel()
resp, err := d.client.ImagePull(ctx, image, opts)
if err != nil {
@@ -446,7 +446,7 @@ func (d *kubeDockerClient) Logs(
opts dockertypes.ContainerLogsOptions,
sopts StreamOptions,
) error {
Option 2: start cri-docker with an explicit timeout, set in its systemd unit:
/etc/systemd/system/cri-docker.service
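Note: cri-docker's --runtime-request-timeout flag was introduced in version 0.2.5; passing it to earlier versions will most likely have no effect.
Both solutions above were tested and fix the problem.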