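Problem description
Recently I noticed that my Docker containers were failing to start. The dockerd log showed the following errors: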
Oct 09 14:00:29 argus dockerd[107571]: time="2021-10-09T14:00:29.429276020+08:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 09 14:00:29 argus dockerd[107571]: time="2021-10-09T14:00:29.467465144+08:00" level=error msg="908db45ee03b432306090e0c7918aaad1b7ff60bc6e4524d2d9f244b8a18528c cleanup: failed to delete container from containerd: no such container"
Oct 09 14:00:29 argus dockerd[107571]: time="2021-10-09T14:00:29.467496833+08:00" level=error msg="failed to start container" container=908db45ee03b432306090e0c7918aaad1b7ff60bc6e4524d2d9f244b8a18528c error="OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: rootfs_linux.go:76: mounting \"/var/run/docker.sock\" to rootfs at \"/var/run/docker.sock\" caused: mount through procfd: not a directory: unknown: Are you trying to mount a directory onto a file (or vice-versa)? Check if the specified host path exists and is the expected type"
Oct 09 14:00:29 argus dockerd[107571]: time="2021-10-09T14:00:29.565328511+08:00" level=error msg="942b45b860afd834580041b4f7ba71bfb2168d039f8a8a7b4251063293fa2198 cleanup: failed to delete container from containerd: no such container"
Oct 09 14:00:29 argus dockerd[107571]: time="2021-10-09T14:00:29.565362064+08:00" level=error msg="failed to start container" container=942b45b860afd834580041b4f7ba71bfb2168d039f8a8a7b4251063293fa2198 error="OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: rootfs_linux.go:76: mounting \"/var/run/docker.sock\" to rootfs at \"/var/run/docker.sock\" caused: mount through procfd: not a directory: unknown: Are you trying to mount a directory onto a file (or vice-versa)? Check if the specified host path exists and is the expected type"
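The "not a directory" mount error above usually points to a type mismatch between the host path and the path inside the container. A minimal sketch, assuming a POSIX shell, for checking what actually sits at the host path (the function name path_kind is made up for illustration):

```shell
# Print the type of the given path: socket, directory, file, other, or missing.
path_kind() {
    if [ -S "$1" ]; then
        echo socket
    elif [ -d "$1" ]; then
        echo directory
    elif [ -f "$1" ]; then
        echo file
    elif [ -e "$1" ]; then
        echo other
    else
        echo missing
    fi
}

path_kind /var/run/docker.sock
```

On a host with a running daemon this should print socket. A result of directory would be consistent with the error above, for instance when a -v bind mount was set up while the socket did not exist and the path got created as a directory.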
Running docker version printed the following:
Client: Docker Engine - Community
Version: 20.10.9
API version: 1.41
Go version: go1.16.8
Git commit: c2ea9bc
Built: Mon Oct 4 16:08:14 2021
OS/Arch: linux/amd64
Context: default
Experimental: true
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
Troubleshooting process
My first suspicion was a formatting problem in /etc/docker/daemon.json, but the file looked well-formed:
{
"registry-mirrors": ["https://registry.aliyuncs.com","https://registry.docker-cn.com","https://docker.mirrors.ustc.edu.cn/"]
}
The problem persisted.
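Rather than checking the file by eye, the syntax can be checked mechanically. A minimal sketch, assuming python3 is installed on the host (json_is_valid is a hypothetical helper name); it only verifies that the file parses as JSON, not that the keys are ones dockerd understands:

```shell
# Return 0 if the file parses as JSON, non-zero otherwise.
# Assumes python3 is installed; json.tool is part of its standard library.
json_is_valid() {
    python3 -m json.tool "$1" >/dev/null 2>&1
}

if json_is_valid /etc/docker/daemon.json; then
    echo "daemon.json parses as JSON"
else
    echo "daemon.json is missing or not valid JSON"
fi
```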
Next I ran sudo tail -100 /var/log/messages and found the following:
Oct 9 17:05:14 argus systemd: docker.service: main process exited, code=exited, status=1/FAILURE
Oct 9 17:05:14 argus systemd: Failed to start Docker Application Container Engine.
Oct 9 17:05:14 argus systemd: Unit docker.service entered failed state.
Oct 9 17:05:14 argus systemd: docker.service failed.
Oct 9 17:05:16 argus systemd: docker.service holdoff time over, scheduling restart.
Oct 9 17:05:16 argus systemd: Stopped Docker Application Container Engine.
Oct 9 17:05:16 argus systemd: Starting Docker Application Container Engine...
Oct 9 17:05:16 argus dockerd: time="2021-10-09T17:05:16.333345074+08:00" level=info msg="Starting up"
Oct 9 17:05:16 argus dockerd: time="2021-10-09T17:05:16.333959115+08:00" level=warning msg="Binding to IP address without --tlsverify is insecure and gives root access on this machine to everyone who has access to your network." host="tcp://0.0.0.0:2375"
Oct 9 17:05:16 argus dockerd: time="2021-10-09T17:05:16.333979904+08:00" level=warning msg="Binding to an IP address, even on localhost, can also give access to scripts run in a browser. Be safe out there!" host="tcp://0.0.0.0:2375"
Oct 9 17:05:17 argus dockerd: time="2021-10-09T17:05:17.334058821+08:00" level=warning msg="Binding to an IP address without --tlsverify is deprecated. Startup is intentionally being slowed down to show this message" host="tcp://0.0.0.0:2375"
Oct 9 17:05:17 argus dockerd: time="2021-10-09T17:05:17.334100630+08:00" level=warning msg="Please consider generating tls certificates with client validation to prevent exposing unauthenticated root access to your network" host="tcp://0.0.0.0:2375"
Oct 9 17:05:17 argus dockerd: time="2021-10-09T17:05:17.334115027+08:00" level=warning msg="You can override this by explicitly specifying '--tls=false' or '--tlsverify=false'" host="tcp://0.0.0.0:2375"
Oct 9 17:05:17 argus dockerd: time="2021-10-09T17:05:17.334126498+08:00" level=warning msg="Support for listening on TCP without authentication or explicit intent to run without authentication will be removed in the next release" host="tcp://0.0.0.0:2375"
Oct 9 17:05:32 argus dockerd: time="2021-10-09T17:05:32.342810539+08:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Oct 9 17:05:32 argus dockerd: time="2021-10-09T17:05:32.343110722+08:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Oct 9 17:05:32 argus dockerd: time="2021-10-09T17:05:32.343161327+08:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock <nil> 0 <nil>}] <nil> <nil>}" module=grpc
Oct 9 17:05:32 argus dockerd: time="2021-10-09T17:05:32.343178970+08:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 9 17:05:32 argus dockerd: time="2021-10-09T17:05:32.344570079+08:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Oct 9 17:05:32 argus dockerd: time="2021-10-09T17:05:32.344584115+08:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Oct 9 17:05:32 argus dockerd: time="2021-10-09T17:05:32.344598783+08:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock <nil> 0 <nil>}] <nil> <nil>}" module=grpc
Oct 9 17:05:32 argus dockerd: time="2021-10-09T17:05:32.344608932+08:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 9 17:05:32 argus dockerd: time="2021-10-09T17:05:32.365514254+08:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
Oct 9 17:05:33 argus dockerd: failed to start daemon: error while opening volume store metadata database: timeout
Oct 9 17:05:33 argus dockerd: time="2021-10-09T17:05:33.319533268+08:00" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby
Oct 9 17:05:33 argus systemd: docker.service: main process exited, code=exited, status=1/FAILURE
Oct 9 17:05:33 argus systemd: Failed to start Docker Application Container Engine.
Oct 9 17:05:33 argus systemd: Unit docker.service entered failed state.
Oct 9 17:05:33 argus systemd: docker.service failed.
Oct 9 17:05:35 argus systemd: docker.service holdoff time over, scheduling restart.
Oct 9 17:05:35 argus systemd: Stopped Docker Application Container Engine.
Oct 9 17:05:35 argus systemd: start request repeated too quickly for docker.service
Oct 9 17:05:35 argus systemd: Failed to start Docker Application Container Engine.
Oct 9 17:05:35 argus systemd: Unit docker.service entered failed state.
Oct 9 17:05:35 argus systemd: docker.service failed.
Oct 9 17:08:02 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -D OUTPUT -m addrtype --dst-type LOCAL -j DOCKER' failed: iptables: No chain/target/match by that name.
Oct 9 17:08:02 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -D PREROUTING' failed: iptables: Bad rule (does a matching rule exist in that chain?).
Oct 9 17:08:02 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -D OUTPUT' failed: iptables: Bad rule (does a matching rule exist in that chain?).
Oct 9 17:08:02 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER' failed: iptables: Too many links.
Oct 9 17:08:02 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER-ISOLATION-STAGE-1' failed: iptables: Too many links.
Oct 9 17:08:02 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -F DOCKER-ISOLATION' failed: iptables: No chain/target/match by that name.
Oct 9 17:08:02 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER-ISOLATION' failed: iptables: No chain/target/match by that name.
Oct 9 17:08:02 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -D FORWARD -i docker0 -o docker0 -j DROP' failed: iptables: Bad rule (does a matching rule exist in that chain?).
Oct 9 17:08:02 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -D FORWARD -i docker0 -o docker0 -j DROP' failed: iptables: Bad rule (does a matching rule exist in that chain?).
Oct 9 17:08:19 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -D OUTPUT -m addrtype --dst-type LOCAL -j DOCKER' failed: iptables: No chain/target/match by that name.
Oct 9 17:08:19 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -D PREROUTING' failed: iptables: Bad rule (does a matching rule exist in that chain?).
Oct 9 17:08:19 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -D OUTPUT' failed: iptables: Bad rule (does a matching rule exist in that chain?).
Oct 9 17:08:19 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER' failed: iptables: Too many links.
Oct 9 17:08:19 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER-ISOLATION-STAGE-1' failed: iptables: Too many links.
Oct 9 17:08:19 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -F DOCKER-ISOLATION' failed: iptables: No chain/target/match by that name.
Oct 9 17:08:19 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER-ISOLATION' failed: iptables: No chain/target/match by that name.
Oct 9 17:08:19 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -D FORWARD -i docker0 -o docker0 -j DROP' failed: iptables: Bad rule (does a matching rule exist in that chain?).
Oct 9 17:08:20 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -D FORWARD -i docker0 -o docker0 -j DROP' failed: iptables: Bad rule (does a matching rule exist in that chain?).
Oct 9 17:09:30 argus NetworkManager[2147]: <info> [1633770570.5751] device (wlp74s0): set-hw-addr: set MAC address to A2:BF:36:ED:CF:52 (scanning)
Oct 9 17:09:30 argus kernel: IPv6: ADDRCONF(NETDEV_UP): wlp74s0: link is not ready
Oct 9 17:09:30 argus NetworkManager[2147]: <info> [1633770570.5912] device (wlp74s0): supplicant interface state: inactive -> disconnected
Oct 9 17:09:30 argus NetworkManager[2147]: <info> [1633770570.5912] device (p2p-dev-wlp74s0): supplicant management interface state: inactive -> disconnected
Oct 9 17:09:30 argus NetworkManager[2147]: <info> [1633770570.5967] device (wlp74s0): supplicant interface state: disconnected -> inactive
Oct 9 17:09:30 argus NetworkManager[2147]: <info> [1633770570.5968] device (p2p-dev-wlp74s0): supplicant management interface state: disconnected -> inactive
Oct 9 17:09:54 argus systemd: Starting Docker Application Container Engine...
Oct 9 17:09:54 argus dockerd: time="2021-10-09T17:09:54.803883590+08:00" level=info msg="Starting up"
Oct 9 17:09:54 argus dockerd: time="2021-10-09T17:09:54.804939390+08:00" level=warning msg="Binding to IP address without --tlsverify is insecure and gives root access on this machine to everyone who has access to your network." host="tcp://0.0.0.0:2375"
Oct 9 17:09:54 argus dockerd: time="2021-10-09T17:09:54.804964938+08:00" level=warning msg="Binding to an IP address, even on localhost, can also give access to scripts run in a browser. Be safe out there!" host="tcp://0.0.0.0:2375"
Oct 9 17:09:55 argus dockerd: time="2021-10-09T17:09:55.805050688+08:00" level=warning msg="Binding to an IP address without --tlsverify is deprecated. Startup is intentionally being slowed down to show this message" host="tcp://0.0.0.0:2375"
Oct 9 17:09:55 argus dockerd: time="2021-10-09T17:09:55.805093829+08:00" level=warning msg="Please consider generating tls certificates with client validation to prevent exposing unauthenticated root access to your network" host="tcp://0.0.0.0:2375"
Oct 9 17:09:55 argus dockerd: time="2021-10-09T17:09:55.805108336+08:00" level=warning msg="You can override this by explicitly specifying '--tls=false' or '--tlsverify=false'" host="tcp://0.0.0.0:2375"
Oct 9 17:09:55 argus dockerd: time="2021-10-09T17:09:55.805119236+08:00" level=warning msg="Support for listening on TCP without authentication or explicit intent to run without authentication will be removed in the next release" host="tcp://0.0.0.0:2375"
Oct 9 17:10:01 argus systemd: Created slice User Slice of root.
Oct 9 17:10:01 argus systemd: Started Session 10775 of user root.
Oct 9 17:10:01 argus systemd: Removed slice User Slice of root.
Oct 9 17:10:10 argus dockerd: time="2021-10-09T17:10:10.814404906+08:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Oct 9 17:10:10 argus dockerd: time="2021-10-09T17:10:10.814677267+08:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Oct 9 17:10:10 argus dockerd: time="2021-10-09T17:10:10.814707734+08:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock <nil> 0 <nil>}] <nil> <nil>}" module=grpc
Oct 9 17:10:10 argus dockerd: time="2021-10-09T17:10:10.814724035+08:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 9 17:10:10 argus dockerd: time="2021-10-09T17:10:10.815978757+08:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Oct 9 17:10:10 argus dockerd: time="2021-10-09T17:10:10.815995960+08:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Oct 9 17:10:10 argus dockerd: time="2021-10-09T17:10:10.816016408+08:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock <nil> 0 <nil>}] <nil> <nil>}" module=grpc
Oct 9 17:10:10 argus dockerd: time="2021-10-09T17:10:10.816034732+08:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 9 17:10:10 argus dockerd: time="2021-10-09T17:10:10.841523991+08:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
Oct 9 17:10:10 argus dockerd: time="2021-10-09T17:10:10.846588298+08:00" level=info msg="Loading containers: start."
Oct 9 17:10:10 argus dockerd: time="2021-10-09T17:10:10.856093582+08:00" level=info msg="Firewalld: docker zone already exists, returning"
Oct 9 17:10:10 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -D OUTPUT -m addrtype --dst-type LOCAL -j DOCKER' failed: iptables: No chain/target/match by that name.
Oct 9 17:10:10 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -D PREROUTING' failed: iptables: Bad rule (does a matching rule exist in that chain?).
Oct 9 17:10:10 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -D OUTPUT' failed: iptables: Bad rule (does a matching rule exist in that chain?).
Oct 9 17:10:10 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER' failed: iptables: Too many links.
Oct 9 17:10:10 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER-ISOLATION-STAGE-1' failed: iptables: Too many links.
Oct 9 17:10:10 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -F DOCKER-ISOLATION' failed: iptables: No chain/target/match by that name.
Oct 9 17:10:10 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER-ISOLATION' failed: iptables: No chain/target/match by that name.
Oct 9 17:10:11 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -D FORWARD -i docker0 -o docker0 -j DROP' failed: iptables: Bad rule (does a matching rule exist in that chain?).
Oct 9 17:10:11 argus dockerd: time="2021-10-09T17:10:11.039279039+08:00" level=info msg="Firewalld: interface docker0 already part of docker zone, returning"
Oct 9 17:10:11 argus dockerd: time="2021-10-09T17:10:11.062120823+08:00" level=info msg="Firewalld: interface docker0 already part of docker zone, returning"
Oct 9 17:10:11 argus dockerd: time="2021-10-09T17:10:11.245623234+08:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Oct 9 17:10:11 argus firewalld[48274]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -D FORWARD -i docker0 -o docker0 -j DROP' failed: iptables: Bad rule (does a matching rule exist in that chain?).
Oct 9 17:10:11 argus dockerd: time="2021-10-09T17:10:11.329345413+08:00" level=info msg="Firewalld: interface docker0 already part of docker zone, returning"
Oct 9 17:10:11 argus dockerd: time="2021-10-09T17:10:11.422929896+08:00" level=info msg="Loading containers: done."
Oct 9 17:10:11 argus dockerd: time="2021-10-09T17:10:11.450417171+08:00" level=info msg="Docker daemon" commit=79ea9d3 graphdriver(s)=overlay2 version=20.10.9
Oct 9 17:10:11 argus dockerd: time="2021-10-09T17:10:11.450480350+08:00" level=info msg="Daemon has completed initialization"
Oct 9 17:10:11 argus systemd: Started Docker Application Container Engine.
Oct 9 17:10:11 argus dockerd: time="2021-10-09T17:10:11.471178323+08:00" level=info msg="API listen on /var/run/docker.sock"
Oct 9 17:10:11 argus dockerd: time="2021-10-09T17:10:11.480313344+08:00" level=info msg="API listen on [::]:2375"
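Most of the output above is informational or firewalld noise. A small sketch, assuming GNU or BSD grep, that narrows syslog output down to dockerd errors and fatal startup failures (docker_failures is a hypothetical helper name):

```shell
# Read syslog lines on stdin and keep only dockerd errors and fatal startup failures.
docker_failures() {
    grep -E 'dockerd.*(level=(error|fatal)|failed to start daemon)'
}

# Example: sudo tail -n 500 /var/log/messages | docker_failures
```

Applied to the excerpt above, this should leave the "failed to start daemon: error while opening volume store metadata database: timeout" line, which is the one that matters here.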
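I then deleted the /var/run/docker.pid file, prompted by the following output from running dockerd manually, but the problem persisted: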
# dockerd
INFO[2021-10-09T17:17:59.260280296+08:00] Starting up
failed to start daemon: pid file found, ensure docker is not running or delete /var/run/docker.pid
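Before deleting a PID file it helps to know whether the process it names is still alive. A minimal sketch, assuming a Linux host and a POSIX shell (pid_file_is_stale is a hypothetical helper name):

```shell
# Return 0 if the PID file is absent, empty, malformed, or names a process
# that no longer exists; return 1 if the process is still running.
pid_file_is_stale() {
    [ -s "$1" ] || return 0
    pid=$(cat "$1")
    case "$pid" in
        ''|*[!0-9]*) return 0 ;;   # not a plain number, treat as stale
    esac
    # kill -0 sends no signal and only tests for existence; the /proc check
    # covers processes owned by another user (Linux only).
    if kill -0 "$pid" 2>/dev/null || [ -d "/proc/$pid" ]; then
        return 1
    fi
    return 0
}

if pid_file_is_stale /var/run/docker.pid; then
    echo "no live process behind docker.pid; removing it should be safe"
else
    echo "docker.pid points at a running process; stop that process first"
fi
```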
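The earlier "error while opening volume store metadata database: timeout" message typically means that another process still holds the lock on that database, so I killed all remaining Docker-related processes with the following command: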
ps axf | grep docker | grep -v grep | awk '{print "kill -9 " $1}' | sudo sh
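The pipeline above kills every process whose ps line contains the string docker, which can include unrelated processes such as an editor with docker.service open. A more conservative sketch, assuming a ps that supports -eo pid=,args= (print_dockerd_kills is a hypothetical helper name); it only prints the kill commands so they can be reviewed first:

```shell
# Read "PID COMMAND ARGS..." lines on stdin and print (not run) a kill command
# for each line whose command basename is exactly dockerd.
print_dockerd_kills() {
    awk '{ n = split($2, parts, "/"); if (parts[n] == "dockerd") print "kill " $1 }'
}

# Example: ps -eo pid=,args= | print_dockerd_kills
```

If the printed list looks right, the same output can be piped to sudo sh, as in the original command.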
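Then I edited docker.service, replacing -H fd:// and the --containerd flag in ExecStart with an explicit Unix socket path: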
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
# ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock -H tcp://0.0.0.0:2375 # original line
ExecStart=/usr/bin/dockerd -H unix:///var/run/docker.sock -H tcp://0.0.0.0:2375
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
Then I ran sudo systemctl daemon-reload && sudo systemctl start docker
Docker started successfully and the problem was solved.
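As an alternative to editing docker.service in place, the same ExecStart change can be kept in a systemd drop-in so that package upgrades do not overwrite it. This is a sketch of that variant, not what was done above; write_docker_override is a hypothetical helper name, and the target directory is passed in so it can be tried on a scratch path first:

```shell
# Write a drop-in that overrides only ExecStart, leaving the packaged unit untouched.
write_docker_override() {
    mkdir -p "$1" || return 1
    cat > "$1/override.conf" <<'EOF'
[Service]
# An empty ExecStart= clears the packaged command before the new one is set.
ExecStart=
ExecStart=/usr/bin/dockerd -H unix:///var/run/docker.sock -H tcp://0.0.0.0:2375
EOF
}

# Example (as root): write_docker_override /etc/systemd/system/docker.service.d
# followed by: systemctl daemon-reload && systemctl restart docker
```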