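Contents
I. PG-related issues
1. xx objects unfound
- Problem description:
dmesg shows read/write errors on the disk. Some objects are damaged (reported as objects unfound), and the cluster is in the HEALTH_ERR state.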
root@node1101:~# ceph health detail
HEALTH_ERR noscrub,nodeep-scrub flag(s) set; 13/409798 objects unfound (0.003%); 17 stuck requests are blocked > 4096 sec. Implicated osds 38
OSDMAP_FLAGS noscrub,nodeep-scrub flag(s) set
OBJECT_UNFOUND 13/409798 objects unfound (0.003%)
pg 5.309 has 1 unfound objects
pg 5.2da has 1 unfound objects
pg 5.2c9 has 1 unfound objects
pg 5.1e2 has 1 unfound objects
pg 5.6a has 1 unfound objects
pg 5.120 has 1 unfound objects
pg 5.148 has 1 unfound objects
pg 5.14b has 1 unfound objects
pg 5.160 has 1 unfound objects
pg 5.35b has 1 unfound objects
pg 5.39c has 1 unfound objects
pg 5.3ad has 1 unfound objects
REQUEST_STUCK 17 stuck requests are blocked > 4096 sec. Implicated osds 38
17 ops are blocked > 67108.9 sec
osd.38 has stuck requests > 67108.9 sec
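- Resolution:
Forcibly mark the unfound objects in each affected PG as lost and delete them. Reference command: ceph pg {pgid} mark_unfound_lost delete
Note: to handle every PG with unfound objects in one pass, use the following command: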
for i in $(ceph health detail | awk '$1 == "pg" && /unfound/ {print $2}'); do ceph pg "$i" mark_unfound_lost delete; done
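Because mark_unfound_lost delete discards the affected objects, it can be worth checking which PGs the batch loop would touch before running it. The following is a minimal sketch rather than part of the original procedure. The helper names unfound_pgs and plan_unfound_cleanup are hypothetical, and the sketch assumes the health output uses the "pg <pgid> has N unfound objects" line format shown above. It only prints the commands; it does not execute them.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original article): a dry run for the batch cleanup.

# Read `ceph health detail` text on stdin and print the ids of PGs that
# report unfound objects (lines of the form "pg 5.309 has 1 unfound objects").
unfound_pgs() {
    awk '$1 == "pg" && /unfound objects/ { print $2 }'
}

# Print, without executing, one mark_unfound_lost command per affected PG.
plan_unfound_cleanup() {
    unfound_pgs | while read -r pgid; do
        echo "ceph pg $pgid mark_unfound_lost delete"
    done
}
```

On a cluster node this could be used as: ceph health detail | plan_unfound_cleanup. Once the printed list looks right, piping that output to sh would run the commands.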
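2. Reduced data availability: xx pgs inactive
- Problem description:
The disk shows read/write errors and the OSD cannot start. After the failed disk is forcibly replaced with a new one and added to the cluster, some PGs become inactive (unknown).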
root@node1106:~# ceph -s
cluster:
id: 7f1aa879-afbb-4b19-9bc3-8f55c8ecbbb4
health: HEALTH_WARN
4 clients failing to respond to capability release
3 MDSs report slow metadata IOs
1 MDSs report slow requests
3 MDSs behind on trimming
noscrub,node




