1. Lab Environment

The Kubernetes cluster in this lab has two nodes, configured as follows:

No.  Name        IP address     Hardware                      OS                  Cluster info
1    k8s-master  192.168.5.248  2 vCPU, 2 GB RAM, 60 GB disk  Ubuntu 22.04.4 LTS  k8s version: 1.23.3; container runtime: docker://27.5.1
2    k8s-node    192.168.5.249  2 vCPU, 2 GB RAM, 60 GB disk  Ubuntu 22.04.4 LTS  k8s version: 1.23.3; container runtime: docker://27.5.1

The cluster looks like this:

root@k8s-master:~# kubectl get nodes -owide
NAME         STATUS   ROLES                  AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
k8s-master   Ready    control-plane,master   22m   v1.23.3   192.168.5.248   <none>        Ubuntu 22.04.4 LTS   5.15.0-119-generic   docker://27.5.1
k8s-node     Ready    <none>                 13m   v1.23.3   192.168.5.249   <none>        Ubuntu 22.04.4 LTS   5.15.0-119-generic   docker://27.5.1
root@k8s-master:~#


Note: in the initial environment the container runtime is docker, and the Kubernetes version is 1.23.3.

2. Migrating the Container Runtime (docker -> containerd)

2.1 Migrating the master node

1. Drain the pods from the master node; the node moves to the "Ready,SchedulingDisabled" state and the evicted pods are recreated on other nodes.

Run:

# drain the master node
root@k8s-master:~# kubectl drain k8s-master --delete-emptydir-data --force --ignore-daemonsets
node/k8s-master cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-lxwbt, kube-system/kube-proxy-qswfj
evicting pod kube-system/coredns-6d8c4cb4d-cgwzr
evicting pod kube-system/coredns-6d8c4cb4d-96hdb
evicting pod kube-system/calico-kube-controllers-64cc74d646-8tcxr
pod/calico-kube-controllers-64cc74d646-8tcxr evicted
pod/coredns-6d8c4cb4d-cgwzr evicted
pod/coredns-6d8c4cb4d-96hdb evicted
node/k8s-master drained
# check the master node's status
root@k8s-master:~# kubectl get nodes
NAME         STATUS                     ROLES                  AGE   VERSION
k8s-master   Ready,SchedulingDisabled   control-plane,master   38m   v1.23.3
k8s-node     Ready                      <none>                 29m   v1.23.3
root@k8s-master:~#

Flag reference:

--force: required when some pods are not managed by a ReplicationController, ReplicaSet, Job, DaemonSet, or StatefulSet
--ignore-daemonsets: skip pods managed by a DaemonSet (e.g. kube-proxy)
--delete-emptydir-data: evict pods even if they use emptyDir volumes (the emptyDir data is deleted)

2. Stop and uninstall Docker

# stop the docker service
root@k8s-master:~# systemctl disable docker --now	# stop docker and disable it at boot
# uninstall docker
apt-get -y purge docker.io # Ubuntu uninstall command; removes the package and its files

Troubleshooting (if you hit it):

If docker is stopped but immediately comes back, it is being re-activated by the docker.socket unit. Stop that unit as well (systemctl disable docker.socket --now) and the docker service will stay stopped.

3. Install and configure containerd

# install containerd and cri-tools
root@k8s-master:~# apt-get -y install containerd cri-tools
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
containerd is already the newest version (1.7.27-0ubuntu1~22.04.1).
containerd set to manually installed.
cri-tools is already the newest version (1.26.0-00).
cri-tools set to manually installed.
The following packages were automatically installed and are no longer required:
  bridge-utils dns-root-data dnsmasq-base git git-man liberror-perl netcat netcat-openbsd patch pigz ubuntu-fan
Use 'apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 135 not upgraded.
# enable containerd at boot and start it
root@k8s-master:~# systemctl enable containerd --now

4. Configure crictl and generate the containerd config file

# locate containerd.sock
root@k8s-master:~# find / -name containerd.sock
/run/containerd/containerd.sock
# point crictl at containerd
root@k8s-master:~# crictl config runtime-endpoint unix:///run/containerd/containerd.sock
# generate the default containerd config file
root@k8s-master:~# mkdir /etc/containerd
root@k8s-master:~# containerd config default > /etc/containerd/config.toml
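The crictl config command above records its setting in /etc/crictl.yaml; the resulting file should look roughly like this (shown for reference; newer crictl versions may add extra keys such as timeout or debug):

```yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
```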

5. Edit the containerd config file /etc/containerd/config.toml

# Edit the config file:
# open /etc/containerd/config.toml
# change SystemdCgroup = false to SystemdCgroup = true
# change the sandbox_image line (it ends in /pause:3.6 by default) to sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.7"
# under the [plugins."io.containerd.grpc.v1.cri".registry.mirrors] line (search for "mirrors"), add the two lines below, each indented two further spaces
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
  endpoint = ["https://docker.m.daocloud.io"]
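The three edits above can also be scripted. A minimal sketch, assuming the stock layout produced by containerd config default (the patch_containerd_config helper is hypothetical; review the file afterwards):

```shell
# Hypothetical helper: applies the three edits described above to a
# containerd config file generated by `containerd config default`.
patch_containerd_config() {
  local conf=$1
  # 1) Switch the runc cgroup driver to systemd.
  sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' "$conf"
  # 2) Replace the sandbox (pause) image with the Aliyun mirror.
  sed -i 's|sandbox_image = ".*"|sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.7"|' "$conf"
  # 3) Append the docker.io mirror entry right after the registry.mirrors table
  #    (indentation is cosmetic in TOML, so none is added here).
  sed -i -e '/registry\.mirrors\]$/a [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]' \
         -e '/registry\.mirrors\]$/a endpoint = ["https://docker.m.daocloud.io"]' "$conf"
}
# On the node:
#   patch_containerd_config /etc/containerd/config.toml && systemctl restart containerd
```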

6. Restart the containerd service

# restart containerd
root@k8s-master:~# systemctl restart containerd
# check the containerd service status
root@k8s-master:~# systemctl status containerd
● containerd.service - containerd container runtime
     Loaded: loaded (/lib/systemd/system/containerd.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2025-06-11 08:57:43 UTC; 7s ago
       Docs: https://containerd.io
    Process: 59532 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
   Main PID: 59533 (containerd)
      Tasks: 7
     Memory: 14.7M
        CPU: 163ms
     CGroup: /system.slice/containerd.service
             └─59533 /usr/bin/containerd

Jun 11 08:57:43 k8s-master containerd[59533]: time="2025-06-11T08:57:43.346675885Z" level=info msg="Start subscribing containerd event"
Jun 11 08:57:43 k8s-master containerd[59533]: time="2025-06-11T08:57:43.346988780Z" level=info msg="Start recovering state"
Jun 11 08:57:43 k8s-master containerd[59533]: time="2025-06-11T08:57:43.347336109Z" level=info msg="Start event monitor"
Jun 11 08:57:43 k8s-master containerd[59533]: time="2025-06-11T08:57:43.347616559Z" level=info msg="Start snapshots syncer"
Jun 11 08:57:43 k8s-master containerd[59533]: time="2025-06-11T08:57:43.347817504Z" level=info msg="Start cni network conf syncer for defau>
Jun 11 08:57:43 k8s-master containerd[59533]: time="2025-06-11T08:57:43.348743470Z" level=info msg="Start streaming server"
Jun 11 08:57:43 k8s-master containerd[59533]: time="2025-06-11T08:57:43.348426658Z" level=info msg=serving... address=/run/containerd/conta>
Jun 11 08:57:43 k8s-master containerd[59533]: time="2025-06-11T08:57:43.349243176Z" level=info msg=serving... address=/run/containerd/conta>
Jun 11 08:57:43 k8s-master containerd[59533]: time="2025-06-11T08:57:43.349447764Z" level=info msg="containerd successfully booted in 0.135>
Jun 11 08:57:43 k8s-master systemd[1]: Started containerd container runtime.
root@k8s-master:~#

7. Configure and restart kubelet

Set the kubelet startup flags.

# On Ubuntu, edit /var/lib/kubelet/kubeadm-flags.env; on CentOS the file is /etc/sysconfig/kubelet
root@k8s-master:~# cat /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="--container-runtime=remote --runtime-request-timeout=15m --container-runtime-endpoint=unix:///run/containerd/containerd.sock"
# restart kubelet
root@k8s-master:~# systemctl restart kubelet
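If the flags are not already present in the file, they have to be added by hand. A sketch, assuming a kubeadm-style file with a single KUBELET_KUBEADM_ARGS="…" line (the ensure_containerd_flags helper is hypothetical):

```shell
# Hypothetical helper: makes sure the kubelet flags file points at containerd.
# Assumes a kubeadm-style file containing a KUBELET_KUBEADM_ARGS="..." line.
ensure_containerd_flags() {
  local flags_file=$1
  if ! grep -q 'container-runtime-endpoint=unix:///run/containerd/containerd.sock' "$flags_file"; then
    # Prepend the three runtime flags inside the existing quoted value.
    sed -i 's|^KUBELET_KUBEADM_ARGS="|KUBELET_KUBEADM_ARGS="--container-runtime=remote --runtime-request-timeout=15m --container-runtime-endpoint=unix:///run/containerd/containerd.sock |' "$flags_file"
  fi
}
# On the node:
#   ensure_containerd_flags /var/lib/kubelet/kubeadm-flags.env && systemctl restart kubelet
```

The grep guard makes the helper idempotent, so re-running it on an already-migrated node changes nothing.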

8. Uncordon the master node

# make the node schedulable again
root@k8s-master:~# kubectl uncordon k8s-master
node/k8s-master uncordoned
# check the cluster; the master node is back to Ready
root@k8s-master:~# kubectl get nodes
NAME         STATUS   ROLES                  AGE    VERSION
k8s-master   Ready    control-plane,master   101m   v1.23.3
k8s-node     Ready    <none>                 92m    v1.23.3

9. Verify that the master's runtime was migrated from docker to containerd

root@k8s-master:~# kubectl get nodes -owide
NAME         STATUS   ROLES                  AGE    VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
k8s-master   Ready    control-plane,master   103m   v1.23.3   192.168.5.248   <none>        Ubuntu 22.04.4 LTS   5.15.0-119-generic   containerd://1.7.27
k8s-node     Ready    <none>                 94m    v1.23.3   192.168.5.249   <none>        Ubuntu 22.04.4 LTS   5.15.0-119-generic   docker://27.5.1
root@k8s-master:~#

The master node's container runtime is now containerd://1.7.27; the migration succeeded.

2.2 Migrating the worker node

1. Drain the pods from the worker node; the node moves to the "Ready,SchedulingDisabled" state and the evicted pods are recreated on other nodes.

Run (on the master):

# drain the worker node
root@k8s-master:~# kubectl drain k8s-node --delete-emptydir-data --force --ignore-daemonsets
node/k8s-node cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-r6nn9, kube-system/kube-proxy-2fbr5
evicting pod kube-system/coredns-6d8c4cb4d-dhzwd
evicting pod kube-system/coredns-6d8c4cb4d-nkqcs
evicting pod kube-system/calico-kube-controllers-64cc74d646-msndm
pod/calico-kube-controllers-64cc74d646-msndm evicted
pod/coredns-6d8c4cb4d-dhzwd evicted
pod/coredns-6d8c4cb4d-nkqcs evicted
node/k8s-node drained
# check the worker node's status; it is now SchedulingDisabled
root@k8s-master:~# kubectl get nodes
NAME         STATUS                     ROLES                  AGE    VERSION
k8s-master   Ready                      control-plane,master   106m   v1.23.3
k8s-node     Ready,SchedulingDisabled   <none>                 98m    v1.23.3
root@k8s-master:~#

2. Stop and uninstall Docker

# stop the docker service
root@k8s-node:~# systemctl disable docker --now	# stop docker and disable it at boot
# uninstall docker
apt-get -y purge docker.io # Ubuntu uninstall command; removes the package and its files

3. Install and configure containerd

# install containerd and cri-tools
root@k8s-node:~# apt-get -y install containerd cri-tools
# enable containerd at boot and start it
root@k8s-node:~# systemctl enable containerd --now

4. Configure crictl and generate the containerd config file

# locate containerd.sock
root@k8s-node:~# find / -name containerd.sock
/run/containerd/containerd.sock
# point crictl at containerd
root@k8s-node:~# crictl config runtime-endpoint unix:///run/containerd/containerd.sock
# generate the default containerd config file
root@k8s-node:~# mkdir /etc/containerd
root@k8s-node:~# containerd config default > /etc/containerd/config.toml

5. Edit the containerd config file /etc/containerd/config.toml

# Edit the config file:
# open /etc/containerd/config.toml
# change SystemdCgroup = false to SystemdCgroup = true
# change the sandbox_image line (it ends in /pause:3.6 by default) to sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.7"
# under the [plugins."io.containerd.grpc.v1.cri".registry.mirrors] line (search for "mirrors"), add the two lines below, each indented two further spaces
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
  endpoint = ["https://docker.m.daocloud.io"]

6. Restart the containerd service

# restart containerd
root@k8s-node:~# systemctl restart containerd

7. Configure and restart kubelet

Set the kubelet startup flags.

# On Ubuntu, edit /var/lib/kubelet/kubeadm-flags.env; on CentOS the file is /etc/sysconfig/kubelet
root@k8s-node:~# cat /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="--container-runtime=remote --runtime-request-timeout=15m --container-runtime-endpoint=unix:///run/containerd/containerd.sock"
# restart kubelet
root@k8s-node:~# systemctl restart kubelet

8. Uncordon the worker node

# make the node schedulable again
root@k8s-master:~# kubectl uncordon k8s-node
node/k8s-node uncordoned
# check the nodes; the worker is back to Ready
root@k8s-master:~# kubectl get nodes
NAME         STATUS   ROLES                  AGE    VERSION
k8s-master   Ready    control-plane,master   126m   v1.23.3
k8s-node     Ready    <none>                 117m   v1.23.3

3. Verifying the Migration from docker to containerd

3.1 Check the cluster's container runtimes

root@k8s-master:~# kubectl get nodes -owide
NAME         STATUS   ROLES                  AGE    VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
k8s-master   Ready    control-plane,master   128m   v1.23.3   192.168.5.248   <none>        Ubuntu 22.04.4 LTS   5.15.0-119-generic   containerd://1.7.27
k8s-node     Ready    <none>                 120m   v1.23.3   192.168.5.249   <none>        Ubuntu 22.04.4 LTS   5.15.0-119-generic   containerd://1.7.27
root@k8s-master:~#
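For a scripted check, the same information can be pulled without the table. A sketch (the all_containerd helper is hypothetical; the jsonpath expression prints one containerRuntimeVersion string per line):

```shell
# Hypothetical helper: reads one runtime string per line on stdin, prints any
# node runtime that is not containerd, and returns non-zero if one remains.
all_containerd() {
  ! grep -v '^containerd://'
}
# On the master:
#   kubectl get nodes -o jsonpath='{range .items[*]}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}' \
#     | all_containerd && echo "all nodes on containerd"
```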

Both nodes now report containerd as their container runtime; the migration is complete.

3.2 Check pod status

root@k8s-master:~# kubectl get pods -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS      AGE
kube-system   calico-kube-controllers-64cc74d646-wsqd2   1/1     Running   0             23m
kube-system   calico-node-lxwbt                          1/1     Running   2             122m
kube-system   calico-node-r6nn9                          1/1     Running   1             121m
kube-system   coredns-6d8c4cb4d-4hrls                    1/1     Running   0             23m
kube-system   coredns-6d8c4cb4d-84h2v                    1/1     Running   0             23m
kube-system   etcd-k8s-master                            1/1     Running   5             130m
kube-system   kube-apiserver-k8s-master                  1/1     Running   6 (30m ago)   130m
kube-system   kube-controller-manager-k8s-master         1/1     Running   6             130m
kube-system   kube-proxy-2fbr5                           1/1     Running   3             121m
kube-system   kube-proxy-qswfj                           1/1     Running   5             129m
kube-system   kube-scheduler-k8s-master                  1/1     Running   6             130m
root@k8s-master:~#

3.3 Test cluster DNS and networking

Create a test pod to confirm that DNS and networking still work after the runtime migration.

# start a pod from docker.io/library/busybox:1.28
root@k8s-master:~# kubectl run busybox-test --image docker.io/library/busybox:1.28  --image-pull-policy=IfNotPresent --restart=Never --rm -it busybox -- sh
If you don't see a command prompt, try pressing enter.
/ # 
# ping an external host: networking is fine
/ # ping www.baidu.com
PING www.baidu.com (183.2.172.177): 56 data bytes
64 bytes from 183.2.172.177: seq=0 ttl=52 time=7.423 ms
64 bytes from 183.2.172.177: seq=1 ttl=52 time=7.147 ms
^C
--- www.baidu.com ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 7.147/7.285/7.423 ms
# ping an in-cluster service name: DNS resolution is fine
/ # ping kubernetes.default.svc.cluster.local
PING kubernetes.default.svc.cluster.local (10.96.0.1): 56 data bytes
64 bytes from 10.96.0.1: seq=0 ttl=249 time=2.588 ms
64 bytes from 10.96.0.1: seq=1 ttl=249 time=2.304 ms
64 bytes from 10.96.0.1: seq=2 ttl=249 time=2.391 ms
^C
--- kubernetes.default.svc.cluster.local ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 2.304/2.427/2.588 ms
/ #

Verified: after migrating the container runtime from docker to containerd, the cluster's DNS and networking work normally.