1. Identify the failed disks
dmesg
[6061566.878131] sd 0:2:2:0: [sdc]
[6061566.878141] sd 0:2:2:0: [sdc]
[6061566.878147] sd 0:2:2:0: [sdc]
[6061566.878152] sd 0:2:2:0: [sdc] CDB:
[6061566.878162] end_request: critical medium error, dev sdc, sector 4973176
[6061617.793479] sd 0:2:5:0: [sdf]
[6061617.793501] sd 0:2:5:0: [sdf]
[6061617.793513] sd 0:2:5:0: [sdf]
[6061617.793530] sd 0:2:5:0: [sdf] CDB:
[6061617.793549] end_request: critical medium error, dev sdf, sector 12773264
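On a busy host the kernel log is long, so it can help to reduce it to the names of the affected devices. This is a minimal sketch, assuming the log lines follow the `end_request: critical medium error, dev <name>, sector <n>` format shown above. The function name `medium_error_devs` is hypothetical.

```shell
# Hypothetical helper: read kernel log text on stdin and print each device
# that reported a medium error, followed by the number of such errors.
medium_error_devs() {
    awk '/critical medium error, dev / {
             # The device name is the field right after the word "dev".
             for (i = 1; i <= NF; i++)
                 if ($i == "dev") { d = $(i + 1); sub(/,$/, "", d); n[d]++ }
         }
         END { for (d in n) print d, n[d] }' | sort
}

# Usage: dmesg | medium_error_devs
```

Reading from stdin keeps the helper usable on a saved log file as well as on live `dmesg` output.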
Alternatively, check the disks with the MegaCli tool:
[root@hh-yun-ceph-cinder016-128056 ~]# MegaCli -PDList -aALL | less
Enclosure Device ID: 0
Slot Number: 3
Enclosure position: 0
Device Id: 2
Sequence Number: 2
Media Error Count: 227 <- physical media failure
Other Error Count: 2
Enclosure Device ID: 0
Slot Number: 6
Enclosure position: 0
Device Id: 5
Sequence Number: 2
Media Error Count: 573 <- physical media failure
Other Error Count: 0
Predictive Failure Count: 0
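With many drives behind the controller, paging through the full listing is slow. The sketch below filters the listing down to slots with a non-zero media error count. It assumes the `Slot Number:` and `Media Error Count:` field labels shown above, and the function name `slots_with_media_errors` is hypothetical.

```shell
# Hypothetical helper: read `MegaCli -PDList -aALL` output on stdin and print
# the slot number and error count of every drive with media errors.
slots_with_media_errors() {
    awk -F': *' '
        /^Slot Number:/       { slot = $2 }
        /^Media Error Count:/ {
            n = $2 + 0        # numeric value, ignoring any trailing text
            if (n > 0) print "slot " slot ": " n " media errors"
        }'
}

# Usage: MegaCli -PDList -aALL | slots_with_media_errors
```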
2. Find the corresponding OSD numbers
[root@hh-yun-ceph-cinder016-128056 ~]# mount | grep -E 'sdc|sdf'
/dev/sdc1 on /var/lib/ceph/osd/ceph-11 type xfs (rw,relatime,attr2,inode64,noquota)
/dev/sdf1 on /var/lib/ceph/osd/ceph-14 type xfs (rw,relatime,attr2,inode64,noquota)
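The mapping from disk to OSD ID can also be extracted directly, which is handy when the IDs feed into later commands. This is a rough sketch, assuming the default cluster name `ceph` and the `/var/lib/ceph/osd/ceph-<id>` mount layout shown above. The function name `osd_ids_for_disks` is hypothetical.

```shell
# Hypothetical helper: read `mount` output on stdin and print "device osd.N"
# for each OSD data directory mounted from one of the disks given as arguments.
osd_ids_for_disks() {
    pattern=$(printf '%s|' "$@")     # e.g. "sdc|sdf|"
    pattern=${pattern%|}             # drop the trailing "|"
    awk -v re="^/dev/(${pattern})[0-9]*$" '
        $1 ~ re && $3 ~ /\/ceph-[0-9]+$/ {
            id = $3; sub(/.*ceph-/, "", id)   # keep only the numeric OSD ID
            print $1, "osd." id
        }'
}

# Usage: mount | osd_ids_for_disks sdc sdf
```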
3. Stop the failed Ceph OSDs
[root@hh-yun-ceph-cinder016-128056 ~]# /etc/init.d/ceph stop osd.11
=== osd.11 ===
Stopping Ceph osd.11 on hh-yun-ceph-cinder016-128056...kill 173789...kill 173789...done
[root@hh-yun-ceph-cinder016-128056 ~]# /etc/init.d/ceph stop osd.14
=== osd.14 ===
Stopping Ceph osd.14 on hh-yun-ceph-cinder016-128056...kill 173789...kill 173789...done
4. Check the progress of Ceph data migration
[root@hh-yun-ceph-cinder016-128056 ~]# ceph -s
cluster dc4f91c1-8792-4948-b68f-2fcea75f53b9
health HEALTH_WARN 103 pgs backfill; 184 pgs backfilling; 288 pgs degraded; 11 pgs peering; 1 pgs recovery_wait; 287 pgs stuck degraded; 325 pgs stuck unclean; 294 pgs stuck undersized; 294 pgs undersized; 5 requests are blocked > 32 sec; recovery 57850/11243389 objects degraded (0.515%); 154869/11243389 objects misplaced (1.377%)
monmap e3: 5 mons at {hh-yun-ceph-cinder015-128055=240.30.128.55:6789/0,hh-yun-ceph-cinder017-128057=240.30.128.57:6789/0,hh-yun-ceph-cinder024-128074=240.30.128.74:6789/0,hh-yun-ceph-cinder025-128075=240.30.128.75:6789/0,hh-yun-ceph-cinder026-128076=240.30.128.76:6789/0}, election epoch 22, quorum 0,1,2,3,4 hh-yun-ceph-cinder015-128055,hh-yun-ceph-cinder017-128057,hh-yun-ceph-cinder024-128074,hh-yun-ceph-cinder025-128075,hh-yun-ceph-cinder026-128076
osdmap e1577: 70 osds: 68 up, 68 in
pgmap v7535397: 20544 pgs, 2 pools, 14286 GB data, 3637 kobjects
42