Detecting Ceph OSD slow ops

Original article, last modified 2023-02-28 16:55:03


License: CC 4.0 BY-SA


First published 2023-02-28 16:42:44


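This article describes how to detect and handle slow operations on OSDs (Object Storage Daemons) in a Ceph storage cluster. It covers events from the message layer, events from the OSD as it prepares an op, FileStore events, OSD events related to the local disk, and how to read and change OSD configuration.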

Purpose

Common methods for detecting Ceph slow ops problems.

Reference example

ceph -s
  cluster:
    id:     22908555-e596-4c2d-a1f6-34fcf4d3e935
    health: HEALTH_WARN
            Degraded data redundancy: 46384/12805029 objects degraded (0.362%), 145 pgs degraded, 122 pgs undersized
            309 slow ops, oldest one blocked for 252 sec, daemons [osd.0,osd.10,osd.101,osd.105,osd.106,osd.107,osd.110,osd.111,osd.112,osd.116]... have slow ops.

  services:
    mon: 3 daemons, quorum gd15-ceph-mon-dbbackup-003,gd15-ceph-mon-dbbackup-001,gd15-ceph-mon-dbbackup-002 (age 4d)
    mgr: gd15-ceph-mon-dbbackup-001(active, since 4d), standbys: gd15-ceph-mon-dbbackup-003
    mds: dba_fs:1 {0=gd15-ceph-mds-dbbackup-002=up:active} 2 up:standby
    osd: 152 osds: 152 up (since 28m), 152 in (since 51m); 122 remapped pgs

  data:
    pools:   3 pools, 4353 pgs
    objects: 4.27M objects, 16 TiB
    usage:   75 TiB used, 784 TiB / 860 TiB avail
    pgs:     46384/12805029 objects degraded (0.362%)
             1260/12805029 objects misplaced (0.010%)
             4205 active+clean
             119  active+recovery_wait+undersized+degraded+remapped
             24   active+recovery_wait+degraded
             2    active+recovering+undersized+remapped
             1    active+recovering+degraded
             1    active+recovery_wait
             1    active+recovering+undersized+degraded+remapped

  io:
    client:   7.5 GiB/s wr, 0 op/s rd, 2.04k op/s wr
    recovery: 10 MiB/s, 2 objects/s
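The health section above already names the number of slow ops, the age of the oldest one, and some of the affected daemons. As a rough sketch, the following Python pulls those three values out of the text. The regular expression is an assumption based only on the wording of this sample output; other Ceph releases may phrase the line differently, and the daemon list is truncated by `ceph -s` itself (note the `...`), so it may be incomplete.

```python
import re

# Pattern modelled on the "slow ops" line in the sample output above
# (an assumption: other releases may word this line differently).
SLOW_OPS_RE = re.compile(
    r"(?P<count>\d+) slow ops, oldest one blocked for (?P<age>\d+) sec, "
    r"daemons \[(?P<daemons>[^\]]*)\]"
)


def parse_slow_ops(status_text):
    """Return (count, oldest_age_sec, daemons), or None if no slow ops line is found."""
    match = SLOW_OPS_RE.search(status_text)
    if match is None:
        return None
    daemons = [d for d in match.group("daemons").split(",") if d]
    return int(match.group("count")), int(match.group("age")), daemons


# Shortened copy of the health line from the sample output.
sample = (
    "309 slow ops, oldest one blocked for 252 sec, daemons "
    "[osd.0,osd.10,osd.101]... have slow ops."
)
print(parse_slow_ops(sample))
```

The returned OSD names are the ones worth inspecting first with the admin socket commands in the next section.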

Inspecting slow ops on an OSD

ceph daemon /var/run/ceph/vip-ceph-osd.0.asok dump_ops_in_flight 
ceph daemon /var/run/ceph/vip-ceph-osd.0.asok dump_historic_ops
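Both commands print JSON, so the slowest ops can be ranked with a few lines of Python. This is a minimal sketch, assuming the dump has a top-level `ops` list whose entries carry `description` and `duration` fields; the exact layout varies between Ceph releases, so check it against real output first. The helper names (`admin_socket_argv`, `run_admin_socket`, `slowest_ops`) and the sample data are hypothetical.

```python
import json
import subprocess


def admin_socket_argv(asok_path, *command):
    """Build the argv for `ceph daemon <asok> <command...>`."""
    return ["ceph", "daemon", asok_path, *command]


def run_admin_socket(asok_path, *command):
    """Run the command on the OSD host and decode its JSON output."""
    result = subprocess.run(
        admin_socket_argv(asok_path, *command),
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def slowest_ops(dump, limit=5):
    """Return (duration, description) pairs for the longest ops in a dump."""
    ops = dump.get("ops", [])
    ranked = sorted(ops, key=lambda op: float(op.get("duration", 0)), reverse=True)
    return [
        (float(op.get("duration", 0)), op.get("description", ""))
        for op in ranked[:limit]
    ]


# Hypothetical, heavily shortened dump used only to show the call.
sample_dump = {
    "ops": [
        {"description": "osd_op(client.1 ...)", "duration": 0.8},
        {"description": "osd_op(client.2 ...)", "duration": 31.5},
    ]
}
for duration, description in slowest_ops(sample_dump):
    print(f"{duration:8.3f}s  {description}")

# On an OSD host, the same ranking could be fed from the live socket:
#   dump = run_admin_socket("/var/run/ceph/vip-ceph-osd.0.asok", "dump_historic_ops")
```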

Meaning of the events in the output

message layer

Event	Description
header_read	When the messenger first started reading the message off the wire.
throttled	When the messenger tried to acquire memory throttle space to read the message into memory.
all_read	When the messenger finished reading the message off the wire.
dispatched	When the messenger gave the message to the OSD.
initiated	This is identical to header_read. The existence of both is a historical oddity.

osd prepares

Event	Description
queued_for_pg	The op has been put into the queue for processing by its PG.
reached_pg	The PG has started doing the op.
waiting for *	The op is waiting for some other work to complete before it can proceed (e.g. a new OSDMap; for its object target to scrub; for the PG to finish peering; all as specified in the message).
started	The op has been accepted as something the OSD should do and is now being performed.
waiting for subops from	The op has been sent to replica OSDs.

filestore problem

Event	Description
commit_queued_for_journal_write	The op has been given to the FileStore.
write_thread_in_journal_buffer	The op is in the journal’s buffer and waiting to be persisted (as the next disk write).
journaled_completion_queued	The op was journaled to disk and its callback queued for invocation.

OSD events related to the local disk

Event	Description
op_commit	The op has been committed by the primary OSD.
op_applied	The op has been write()’en to the backing FS on the primary.
sub_op_applied	Same as op_applied, but for a replica’s “subop”.
sub_op_committed	Same as op_commit, but for a replica’s sub-op (only for EC pools).
sub_op_commit_rec/sub_op_apply_rec from <X>	The primary marks this when it hears about the above, but for a particular replica (i.e. <X>).
commit_sent	We sent a reply back to the client (or primary OSD, for sub ops).

Getting the OSD configuration

ceph daemon /var/run/ceph/vip-ceph-osd.0.asok config show
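`config show` prints every option as one large JSON object, so it usually helps to narrow it down to the options that matter for slow ops. A small sketch follows, assuming the output has already been decoded into a dict of option name to string value. The `filter_config` helper is hypothetical and the values in the sample are illustrative, not taken from this cluster.

```python
def filter_config(config, needle):
    """Return the options whose name contains `needle`, ordered by name."""
    return {name: config[name] for name in sorted(config) if needle in name}


# Hypothetical excerpt of decoded `config show` output (values are illustrative).
sample_config = {
    "osd_op_complaint_time": "30.000000",
    "osd_op_history_size": "20",
    "osd_op_history_duration": "600",
    "osd_max_backfills": "1",
}
for name, value in filter_config(sample_config, "osd_op_").items():
    print(f"{name} = {value}")
```

Options such as `osd_op_complaint_time` (the age after which an op is reported as slow) and the `osd_op_history_*` settings (how many ops `dump_historic_ops` keeps, and for how long) are the natural ones to check here.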

Changing a setting

ceph daemon /var/run/ceph/vip-ceph-osd.0.asok config set <name> <value>