GPFS error: “stale file handle”

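This article describes how to deal with a file system that becomes unusable because the GPFS service on a compute node was killed for lack of memory. It covers the detailed steps for checking the mount point, force-unmounting, and restarting GPFS. The fix is simple and well suited to restoring service quickly.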

Unfortunately, the GPFS daemon “mmfsd” running on the compute nodes is sometimes killed by the out-of-memory (OOM) killer. This usually unmounts GPFS from the compute node and makes it unavailable in LSF. The mount point /gpfs3 then shows “stale file handle”. This can happen even after the node has been rebooted.
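To confirm that the OOM killer really was the cause, the kernel log can be searched for a record of mmfsd being killed. The following is a minimal sketch, not part of the original procedure: the function name mmfsd_oom_killed is hypothetical, and the “Killed process <pid> (mmfsd)” pattern is an assumption about the kernel's message wording, which can differ between kernel versions.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and returns 0
# if it contains a line recording that a process named mmfsd was killed.
# ASSUMPTION: the kernel logs OOM kills as "Killed process <pid> (<name>)".
mmfsd_oom_killed() {
    grep -Eq 'Killed process [0-9]+ \(mmfsd\)'
}

# Example usage on a compute node (run as root):
#   dmesg | mmfsd_oom_killed && echo "mmfsd was killed by the OOM killer"
```

Reading from stdin keeps the check independent of where the log comes from, so the same function works with dmesg output or a saved log file.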

The good news is that there is a fairly easy fix.

Log in to the compute node with SSH

Check /gpfs3 mount point
[root@b35n20 ~]# ls -l /gpfs3/
ls: /gpfs3/: Stale file handle
ls: cannot open directory /gpfs3/: Stale file handle

Shut down GPFS locally, start it again, and wait a few seconds
[root@b35n20 ~]# mmshutdown
Wed Oct 2 09:32:10 CEST 2019: mmshutdown: Starting force unmount of GPFS file systems
Wed Oct 2 09:32:15 CEST 2019: mmshutdown: Shutting down GPFS daemons
Shutting down!
Unloading modules from /lib/modules/3.10.0-957.21.3.el7.x86_64/extra
Unloading module mmfs26
Unloading module mmfslinux
Wed Oct 2 09:32:19 CEST 2019: mmshutdown: Finished
[root@b35n20 ~]# mmstartup
Wed Oct 2 09:32:31 CEST 2019: mmstartup: Starting GPFS …

Check /gpfs3 mount point again
[root@b35n20 xcatpost]# ls -l /gpfs3
total 123534208
drwxrwxr-x 171 root root 32768 Sep 30 10:04 applications
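
The manual steps above could be wrapped in a small script. This is a rough sketch under stated assumptions, not a tested production tool: the function names is_stale and recover_gpfs, the retry count, and the 3-second polling interval are all hypothetical choices. Only mmshutdown, mmstartup, and the “Stale file handle” message come from the session shown above. It also assumes that mmshutdown and mmstartup are on the PATH, as they are in that session.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if listing the directory fails with
# "Stale file handle". LC_ALL=C keeps the error message in English.
is_stale() {
    # Capture stderr only; discard the normal listing.
    err=$(LC_ALL=C ls "$1" 2>&1 >/dev/null)
    case "$err" in
        *"Stale file handle"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: restart GPFS locally if the mount point is stale,
# then poll until the mount point is readable again.
#   $1 = mount point, $2 = number of checks (assumed default: 10)
recover_gpfs() {
    mp=$1
    tries=${2:-10}

    # Nothing to do if the mount point is healthy.
    is_stale "$mp" || return 0

    mmshutdown || return 1
    mmstartup || return 1

    # The original steps say to wait a few seconds after mmstartup,
    # so check repeatedly instead of only once.
    i=0
    while [ "$i" -lt "$tries" ]; do
        if ls "$mp" >/dev/null 2>&1; then
            return 0
        fi
        sleep 3
        i=$((i + 1))
    done
    return 1
}

# Example usage on the affected compute node (run as root):
#   recover_gpfs /gpfs3 || echo "GPFS did not come back on /gpfs3" >&2
```

The script only restarts GPFS when the stale-handle message is actually seen, so running it on a healthy node does not interrupt a working mount.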
