Environment: RHEL 8.10
Symptom: system commands fail with an I/O error under one particular path
[root@RHEL8-xxx opt]# ll
ls: cannot open directory '.': Input/output error
[root@RHEL8-xxx opt]# ll
ls: cannot open directory '.': Input/output error
[root@RHEL8-xxx opt]# pwd
/opt
df -h shows the disk mounted as follows:
/dev/mapper/rhel-opt 4.2T 4.1T 90G 98% /opt
Check /var/log/messages:
[root@RHEL8-xxx ~]# vim /var/log/messages
[root@RHEL8-xxx ~]#
[root@RHEL8-xxx ~]# grep dm-0 /var/log/messages
Feb 12 22:20:16 RHEL8-xxx kernel: Workqueue: xfs-conv/dm-0 xfs_end_io [xfs]
Feb 12 22:20:16 RHEL8-xxx kernel: XFS (dm-0): Internal error xfs_trans_cancel at line 957 of file fs/xfs/xfs_trans.c. Caller xfs_iomap_write_unwritten+0x281/0x2a0 [xfs]
Feb 12 22:20:16 RHEL8-xxx kernel: Workqueue: xfs-conv/dm-0 xfs_end_io [xfs]
Feb 12 22:20:16 RHEL8-xxx kernel: XFS (dm-0): Corruption of in-memory data (0x8) detected at xfs_trans_cancel+0xc6/0x130 [xfs] (fs/xfs/xfs_trans.c:958). Shutting down filesystem
Feb 12 22:20:16 RHEL8-xxx kernel: XFS (dm-0): Please unmount the filesystem and rectify the problem(s)
Feb 14 10:06:27 RHEL8-xxx kernel: dm-0: writeback error on inode 8006597246, offset 8613888, sector 1322459936
Feb 14 14:38:53 RHEL8-xxx kernel: XFS (dm-0): Unmounting Filesystem
Feb 14 15:10:57 RHEL8-xxx kernel: XFS (dm-0): Mounting V5 Filesystem
Feb 14 15:10:58 RHEL8-xxx kernel: XFS (dm-0): Ending clean mount
[root@RHEL8-xxx ~]#
[root@RHEL8-xxx ~]#
If vim cannot be used, try the following command instead:
[root@RHEL8-xxx ~]# dmesg | grep dm-0
[ 8.230683] XFS (dm-0): Mounting V5 Filesystem
[ 8.464148] XFS (dm-0): Starting recovery (logdev: internal)
[ 9.278087] XFS (dm-0): Ending recovery (logdev: internal)
[559929.348236] Workqueue: xfs-conv/dm-0 xfs_end_io [xfs]
[559929.349739] XFS (dm-0): Internal error xfs_trans_cancel at line 957 of file fs/xfs/xfs_trans.c. Caller xfs_iomap_write_unwritten+0x281/0x2a0 [xfs]
[559929.349947] Workqueue: xfs-conv/dm-0 xfs_end_io [xfs]
[559929.352650] XFS (dm-0): Corruption of in-memory data (0x8) detected at xfs_trans_cancel+0xc6/0x130 [xfs] (fs/xfs/xfs_trans.c:958). Shutting down filesystem
[559929.352872] XFS (dm-0): Please unmount the filesystem and rectify the problem(s)
[559929.353785] dm-0: writeback error on inode 8006597246, offset 8613888, sector 1322459936
[705046.354376] XFS (dm-0): Unmounting Filesystem
[706970.301055] XFS (dm-0): Mounting V5 Filesystem
[706970.617985] XFS (dm-0): Ending clean mount
[root@RHEL8-xxx ~]#
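The kernel log mixes routine mount messages with the actual failure, so a small filter makes the shutdown sequence easier to spot. The sketch below is only an illustration: the function name xfs_errors_for is made up for this note, and the patterns are copied from the messages shown above rather than from a complete list of XFS errors.

```shell
# Keep only the XFS failure lines for one device-mapper node (e.g. dm-0).
# Reads a kernel log on stdin. The patterns mirror the messages in this
# note and are not an exhaustive list.
xfs_errors_for() {
    dev="$1"
    grep -E "XFS \(${dev}\): (Internal error|Corruption|Please unmount)|${dev}: writeback error"
}

# Usage (assumes dmesg or journalctl is readable on the affected host):
#   dmesg | xfs_errors_for dm-0
#   journalctl -k | xfs_errors_for dm-0
```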
Following the hint "Please unmount the filesystem and rectify the problem(s)", the next steps are: unmount /dev/mapper/rhel-opt -> repair -> remount
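The kernel messages name the device dm-0, while the rest of this note works with /dev/mapper/rhel-opt. On this host the two are the same device (the fuser output further down reports /dev/dm-0 for it). A minimal sketch for confirming that mapping, assuming the entries under /dev/mapper are symlinks to the dm-N nodes as they usually are on RHEL 8; the function name mapper_names_for is hypothetical:

```shell
# Print every entry under /dev/mapper that resolves to the given kernel
# name (e.g. dm-0). The directory can be overridden as a second argument.
mapper_names_for() {
    kname="$1"
    dir="${2:-/dev/mapper}"
    for link in "$dir"/*; do
        [ -e "$link" ] || continue
        if [ "$(basename "$(readlink -f "$link")")" = "$kname" ]; then
            echo "$link"
        fi
    done
}

# Usage:
#   mapper_names_for dm-0
```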
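1. To remount a disk managed by LVM (Logical Volume Manager), usually found under /dev/mapper, first confirm the mount point and the filesystem type
The mount point of /dev/mapper/rhel-opt is /opt. The filesystem type can be confirmed as xfs in any of the following ways
Method 1: df -T (the -T option prints the filesystem type)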
[root@RHEL8-xxx ~]# df -Th /opt
Filesystem Type Size Used Avail Use% Mounted on
/dev/mapper/rhel-opt xfs 4.2T 3.5T 677G 85% /opt
[root@RHEL8-xxx ~]#
Method 2:
If the disk is mounted automatically at boot, check /etc/fstab
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
UUID=xxxxxxxx-94d5-xxxx-xxxx-xxxxxxxxxxxx / xfs defaults 0 0
UUID=xxxxxxxx-1218-xxxx-xxxx-xxxxxxxxxxxx /boot xfs defaults 0 0
UUID=xxxxxxxx-445a-xxxx-xxxx-xxxxxxxxxxxx /home xfs defaults 0 0
/dev/mapper/rhel-opt /opt xfs defaults 0 0
UUID=xxxxxxxx-4f2b-xxxx-xxxx-xxxxxxxxxxxx /var xfs defaults 0 0
UUID=xxxxxxxx-669f-4c0c-xxxx-xxxx-xxxxxxxxxxxx none swap defaults 0 0
~
Note: /dev/mapper/rhel-opt is listed in this file by device path, without a UUID. The UUID can be looked up with the following command and added
[root@RHEL8-xxx ~]# sudo blkid | grep /dev/mapper/rhel-opt
/dev/mapper/rhel-opt: UUID="xxxxxxxx-de5e-xxxx-xxxx-xxxxxxxxxxxx" BLOCK_SIZE="512" TYPE="xfs"
[root@RHEL8-xxx ~]#
Method 3:
[root@RHEL8-xxx ~]# mount | grep /dev/mapper/rhel-opt
/dev/mapper/rhel-opt on /opt type xfs (rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)
[root@RHEL8-xxx ~]#
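The three methods above all answer the same question: which filesystem type sits on a given mount point. For scripting, the same lookup can be done against the kernel mount table. This is a sketch under the assumption that the table has the usual "device mountpoint fstype options ..." layout of /proc/self/mounts; the function name fstype_of is made up for illustration.

```shell
# Print the filesystem type mounted at a mount point.
# Reads /proc/self/mounts by default; a different table file can be
# passed as the second argument. If the mount point appears more than
# once (over-mounts), the last entry wins.
fstype_of() {
    mp="$1"
    table="${2:-/proc/self/mounts}"
    awk -v mp="$mp" '$2 == mp { fstype = $3 } END { if (fstype != "") print fstype }' "$table"
}

# Usage:
#   fstype_of /opt
```

When the filesystem is not mounted, the type has to be read from the device instead, for example with blkid -s TYPE -o value /dev/mapper/rhel-opt.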
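2. Make sure no process is using /opt
Run the following command to check whether any process is using the /opt directory
# lsof +D /opt
If processes are using it, try stopping them first. If it is unclear which processes are involved, the unmount can be forced or deferred:
umount -f /opt (force), or umount -l /opt (lazy: detach now, clean up once the filesystem is no longer busy)
In this case, however, lsof failed with Input/output error, and umount -f /opt reported target is busy
This means processes or kernel components were still holding /opt. Because lsof could not be used (Input/output error), another way was needed to find the processes holding /opt before unmounting.
Run the following command to see which processes are using /opt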
[root@RHEL8-xxx ~]# fuser -vm /dev/mapper/rhel-opt
Cannot stat file /proc/68141/fd/3: Input/output error
Cannot stat file /proc/68141/fd/4: Input/output error
Cannot stat file /proc/68141/fd/5: Input/output error
......
USER PID ACCESS COMMAND
/dev/dm-0: tcuser 68141 ....m java
tcuser 68204 ....m java
tcuser 68561 ....m redis-server
......
tcuser 68921 ....m java
tcuser 68974 ....m java
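List the processes and kill them
ps -f -u tcuser
# Selecting by owner avoids matching the grep process itself or unrelated command lines that merely contain "tcuser"
pkill -KILL -u tcuser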
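Killing every process of a user is a blunt tool. It can help to review exactly which PIDs fuser reported before signalling anything. The sketch below parses a table in the format shown above into "PID command" pairs; the function name fuser_table_pids is hypothetical, and the access-column pattern only covers the letters fuser normally prints.

```shell
# Extract "PID COMMAND" pairs from fuser -v style output on stdin.
# A row is recognised by an all-digit field followed by an access field
# such as "....m"; header and "Cannot stat file" lines are skipped.
fuser_table_pids() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i ~ /^[0-9]+$/ && $(i + 1) ~ /^[.cefFrm]+$/) {
                print $i, $(i + 2)
                break
            }
        }
    }'
}

# Usage (fuser writes most of the verbose table to stderr, hence 2>&1):
#   fuser -vm /opt 2>&1 | fuser_table_pids
```

The resulting PIDs can then be stopped individually, ideally with a plain kill first and kill -9 only for processes that do not exit.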
3. Unmount the filesystem
If /opt is still in use, try:
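4. Run the XFS filesystem check
Use xfs_repair to repair /dev/mapper/rhel-opt (the filesystem must be unmounted first)
If xfs_repair refuses to run because the log still contains metadata changes, first mount and unmount the filesystem so that the log is replayed. Only if that mount fails, run xfs_repair -L /dev/mapper/rhel-opt
Note: the -L option zeroes the XFS log and may lose some data, so it is a last resort for a filesystem that cannot otherwise be mounted or repaired.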
[root@RHEL8-xxx ~]# xfs_repair -L /dev/mapper/rhel-opt
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
- scan filesystem freespace and inode maps...
sb_icount 11373792, counted 11373728
sb_ifree 201840, counted 306347
sb_fdblocks 13143922, counted 31306926
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
correcting nblocks for inode 6444415873, was 3233 - counted 3249
correcting nextents for inode 6444415873, was 81 - counted 82
- agno = 4
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 2
- agno = 4
- agno = 1
- agno = 3
Phase 5 - rebuild AG headers and trees...
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- traversing filesystem ...
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (83:1692886) is ahead of log (1:2).
Format log to cycle 86.
done
[root@RHEL8-xxx ~]#
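The ALERT in the output above is the cost of -L: whatever was still in the log is discarded. A more cautious order is to let a mount replay the log first and to fall back to -L only when that mount fails. The following is a sketch of that order, not a tested recovery procedure; run and repair_xfs are hypothetical helper names, and DRY_RUN=1 only prints the commands so the sequence can be reviewed before anything touches the disk.

```shell
# Echo commands instead of running them when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Replay the log via mount/umount, then repair without -L.
# Returns 2 if the mount fails, leaving the -L decision to the operator.
repair_xfs() {
    dev="$1"
    mnt="$2"
    # A successful mount replays the log; unmount again before repairing.
    if run mount "$dev" "$mnt"; then
        run umount "$mnt" || return 1
        run xfs_repair "$dev"
    else
        echo "mount failed: log cannot be replayed; xfs_repair -L $dev is the last resort" >&2
        return 2
    fi
}

# Usage:
#   DRY_RUN=1 repair_xfs /dev/mapper/rhel-opt /opt
```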
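5. Remount
After the repair completes, try mounting again:
# mount /dev/mapper/rhel-opt /opt
Then check whether everything is back to normal:
df -h /opt
dmesg | tail -n 20
Check the boot-time mount entry in /etc/fstab, then reboot and check the mount again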
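Before rebooting, it is worth confirming that /etc/fstab really contains an active (uncommented) entry for the mount point. A minimal sketch, assuming the standard whitespace-separated fstab layout; the function name fstab_entry_for is made up for this note:

```shell
# Print the active fstab entry for a mount point and return 0,
# or return 1 if no uncommented entry exists.
# The fstab path can be overridden as the second argument.
fstab_entry_for() {
    mp="$1"
    fstab="${2:-/etc/fstab}"
    awk -v mp="$mp" '
        $1 !~ /^#/ && $2 == mp { print; found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$fstab"
}

# Usage:
#   fstab_entry_for /opt || echo "no fstab entry for /opt"
```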
=============================================================
Reference: ChatGPT
Q: How do I remount a disk that has a problem on RHEL 8? Details: dmesg reports XFS (dm-0): Please unmount the filesystem and rectify the problem(s), and the affected device is /dev/mapper/rhel-opt; the output of # mount | grep /dev/mapper/rhel-opt is /dev/mapper/rhel-opt on /opt type xfs (rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota); the output of lvdisplay is --- Logical volume --- LV Path /dev/rhel/opt LV Name opt VG Name rhel LV UUID ……
Q: The lsof command does not work; it reports Input/output error
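1. Try fuser to release the mount
Run the following command to see whether any process is using /opt:
fuser -vm /opt
If fuser works, it lists the IDs of the processes using /opt. Then force those processes to terminate (-m is needed so that every process on the mounted filesystem is matched, not only those holding the /opt directory itself):
fuser -km /opt
If fuser also reports Input/output error, continue with the next step.
2. Use umount -l for a lazy unmount
If umount -f has no effect, try the -l option (lazy unmount). It detaches /opt from the directory tree immediately and finishes the unmount once no process is using the filesystem any more:
umount -l /opt
If the unmount succeeds and no process still holds the filesystem, continue with xfs_repair to repair it.
3. Force-stop processes that may be involved
Some processes may not appear to use /opt directly but still hold file handles on it. For example:
ps aux | grep /opt
Then terminate the relevant processes manually:
kill -9 <PID>
If ps also hits Input/output error, a reboot may be the only option.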
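4. Boot into single-user mode or rescue mode
If the filesystem still cannot be unmounted, reboot into single-user mode (rescue mode):
In the GRUB menu, select the RHEL kernel and press e to edit the entry.
Find the kernel line (it starts with linux or linux16) and append to the end of it:
systemd.unit=rescue.target
Press Ctrl + X to boot.
In rescue mode, run:
umount /opt
xfs_repair /dev/mapper/rhel-opt
When the repair has finished, run reboot.
If the repair still fails in rescue mode, try a Live CD or the initramfs repair mode.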
5. Check the LVM device
If xfs_repair still reports errors, the LVM device itself may be in a bad state:
lvchange -an /dev/mapper/rhel-opt
lvchange -ay /dev/mapper/rhel-opt
Then try umount and xfs_repair again.
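6. Hardware check
If xfs_repair reports device I/O errors (I/O error), the disk may be damaged. Run:
smartctl -a /dev/sdX # replace X with the actual device
If the disk has bad blocks, try:
badblocks -sv /dev/sdX
If the disk is badly damaged, replace the hardware or attempt data recovery with ddrescue.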
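To decide which physical disk to point smartctl at, the kernel log can be scanned for block-layer I/O errors. The sketch below assumes messages of the form "I/O error, dev sdX, sector N", which is how the kernel block layer commonly reports them; the exact prefix differs between kernel versions, and the function name io_error_devices is hypothetical. No such messages appear in the log excerpts in this note.

```shell
# Count block-layer I/O errors per device in a kernel log read from stdin.
# Output format is that of uniq -c: "<count> <device>".
io_error_devices() {
    grep 'I/O error, dev ' \
        | sed -E 's#.*I/O error, dev ([^, ]+).*#\1#' \
        | sort \
        | uniq -c
}

# Usage:
#   dmesg | io_error_devices
```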
If these steps still do not fix the problem, provide the output of dmesg | tail -n 50 and the complete xfs_repair error messages, and I can help analyze further.