Recovering an rbd from a failed ceph cluster

This article walks through recovering an rbd from a failed ceph cluster. We first create an rbd and record its metadata, then map it on a client and write test data. After unmounting and unmapping the rbd on the client, the recovery is performed on an osd server: a dedicated script reassembles the rbd from its rados objects, and the result is verified by mounting it.


Environment

centos 6.5 x86_64
ceph 0.87

Procedure

The following steps describe how to recover an rbd from a failed ceph cluster:

1. Create the rbd and record its metadata. Be sure to save the rbd information in advance, in particular the size and block_name_prefix; they are needed later to rebuild the rbd, and without them it cannot be recovered (see the sketch after the output below).
[root@ceph-osd-2 recovery]# rbd create -s 10240 bobtest
[root@ceph-osd-2 recovery]# rbd info bobtest
rbd image 'bobtest':
        size 10240 MB in 2560 objects
        order 22 (4096 kB objects)
        block_name_prefix: rb.0.bcb9.238e1f29
        format: 1     
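Since this information must survive the cluster failure, it is worth keeping a copy of the rbd info output somewhere off the cluster. A minimal sketch, assuming a hypothetical backup location /root/recovery/bobtest.info (any path outside the cluster will do):
[root@ceph-osd-2 recovery]# rbd info bobtest > /root/recovery/bobtest.info     # keep size and block_name_prefix for later
[root@ceph-osd-2 recovery]# grep -E 'size|block_name_prefix' /root/recovery/bobtest.info
        size 10240 MB in 2560 objects
        block_name_prefix: rb.0.bcb9.238e1f29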

2. Map the bobtest rbd on the client and write test data
[root@linux-nfs ~]# rbd -p rbd map bobtest
/dev/rbd0
[root@linux-nfs ~]# rbd showmapped
id pool image   snap device    
0  rbd  bobtest -    /dev/rbd0 
[root@linux-nfs ~]# mkfs.xfs /dev/rbd0
log stripe unit (4194304 bytes) is too large (maximum is 256KiB)
log stripe unit adjusted to 32KiB
meta-data=/dev/rbd0              isize=256    agcount=17, agsize=162816 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=2621440, imaxpct=25
         =                       sunit=1024   swidth=1024 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@linux-nfs ~]# mount /dev/rbd0 /mnt/
[root@linux-nfs ~]# touch /mnt/test
[root@linux-nfs ~]# echo recovery test > /mnt/test
[root@linux-nfs ~]# cat /mnt/test
recovery test

3. Unmount and unmap the bobtest rbd on the client
[root@linux-nfs ~]# umount /mnt/
[root@linux-nfs ~]# rbd unmap /dev/rbd0

4. Recover the bobtest rbd on an osd server (the ceph cluster must still have at least one accessible osd server)
Download the script used to rebuild the rbd:
#wget -O rbd_restore https://raw.githubusercontent.com/smmoore/ceph/master/rbd_restore.sh
#chmod a+x rbd_restore
The script is as follows:
#!/bin/sh
#
# AUTHORS
# Shawn Moore <smmoore@catawba.edu>
# Rodney Rymer <rrr@catawba.edu>
#
#
# REQUIREMENTS
# GNU Awk (gawk)
#
#
# NOTES
# This utility assumes one copy of all object files needed to construct the rbd
# are located in the present working directory at the time of execution.
# For example all the rb.0.1032.5e69c215.* files.
#
# When listing the "RBD_SIZE_IN_BYTES", be sure you list the full potential size,
# not just what it appears to be. If you do not know the true size of the rbd,
# you can input a size in bytes that you know is larger than the disk could be
# and it will be a large sparse file with un-partitioned space at the end of the
# disk.  In our tests, this doesn't occupy any more space/objects in the cluster
# but the rbd could be resized from within the rbd (VM) to grow.  Once you bring
# it up and are able to find the true size, you can resize with "rbd resize ..".
"recovery/rbd_restore" 73L, 2771C
#!/bin/sh
#
# AUTHORS
# Shawn Moore <smmoore@catawba.edu>
# Rodney Rymer <rrr@catawba.edu>
#
#
# REQUIREMENTS
# GNU Awk (gawk)
#
#
# NOTES
# This utility assumes one copy of all object files needed to construct the rbd
# are located in the present working direcory at the time of execution.
# For example all the rb.0.1032.5e69c215.* files.
#
# When listing the "RBD_SIZE_IN_BYTES", be sure you list the full potential size,
# not just what it appears to be. If you do not know the true size of the rbd,
# you can input a size in bytes that you know is larger than the disk could be
# and it will be a large sparse file with un-partioned space at the end of the
# disk.  In our tests, this doesn't occupy any more space/objects in the cluster
# but the rbd could be resized from within the rbd (VM) to grow.  Once you bring
# it up and are able to find the true size, you can resize with "rbd resize ..".
#
# To obtain needed utility input information if not already known run:
# rbd info RBD
#
# To find needed files we run the following command on all nodes that might have
# copies of the rbd objects:
# find /${CEPH} -type f -name rb.0.1032.5e69c215.*
# Then copy the files to a single location from all nodes.  If using btrfs be
# sure to pay attention to the btrfs snapshots that ceph takes on its own.
# You may want the "current" or one of the "snaps".
#
# We are actually taking our own btrfs snapshots cluster osd wide at the same
# time with parallel ssh and then using "btrfs subvolume find-new" command to
# merge them all together for disaster recovery and also outside of ceph rbd
# versioning.
#
# Hopefully once the btrfs send/recv functionality is stable we can switch to it.
#
#
# This utility works for us but may not for you.  Always test with non-critical
# data first.
#

# Rados object size
obj_size=4194304

# DD bs value
rebuild_block_size=512

rbd="${1}"
base="${2}"
rbd_size="${3}"
if [ "${1}" = "-h" -o "${1}" = "--help" -o "${rbd}" = "" -o "${base}" = "" -o "${rbd_size}" = "" ]; then
  echo "USAGE: $(echo ${0} | awk -F/ '{print $NF}') RESTORE_RBD BLOCK_PREFIX RBD_SIZE_IN_BYTES"
  exit 1
fi
base_files=$(ls -1 ${base}.* 2>/dev/null | wc -l | awk '{print $1}')
if [ ${base_files} -lt 1 ]; then
  echo "COULD NOT FIND FILES FOR ${base} IN $(pwd)"
  exit
fi

# Create full size sparse image.  Could use truncate, but wanted
# as few required files and dd was a must.
dd if=/dev/zero of=${rbd} bs=1 count=0 seek=${rbd_size} 2>/dev/null

for file_name in $(ls -1 ${base}.* 2>/dev/null); do
  seek_loc=$(echo ${file_name} | awk -F_ '{print $1}' | awk -v os=${obj_size} -v rs=${rebuild_block_size} -F. '{print os*strtonum("0x" $NF)/rs}')
  dd conv=notrunc if=${file_name} of=${rbd} seek=${seek_loc} bs=${rebuild_block_size} 2>/dev/null
done
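The core of the script is the seek calculation: each object file name ends in a hexadecimal index, and that index multiplied by the object size (4194304 bytes) gives the byte offset of that object inside the image; dividing by the dd block size (512) turns it into a seek count for dd. A hand check of the formula with gawk, using a hypothetical object file name for illustration:
# object index 0x3 -> byte offset 3 * 4194304 = 12582912 -> seek 12582912 / 512 = 24576
echo rb.0.bcb9.238e1f29.000000000003__head_12AB34CD | awk -F_ '{print $1}' | awk -v os=4194304 -v rs=512 -F. '{print os*strtonum("0x" $NF)/rs}'
24576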

Start recovering the rbd:
[root@ceph-osd-2 recovery]# for block in $(find / -type f -name  rb.0.bcb9.238e1f29.*); do cp $block . ; done
[root@ceph-osd-2 recovery]# ./rbd_restore bobtest  rb.0.bcb9.238e1f29 10737418240      # the last two arguments come from the recorded rbd info: the block_name_prefix and the rbd size in bytes (10240 MB = 10737418240 bytes)
[root@ceph-osd-2 recovery]# file bobtest
bobtest: SGI XFS filesystem data (blksz 4096, inosz 256, v2 dirs)
[root@ceph-osd-2 recovery]# du -h bobtest
15M     bobtest
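Searching the whole filesystem with find / works but can be slow; when the osd data directory is known, the search can be limited to it. A sketch assuming the default osd data path /var/lib/ceph/osd (adjust to your deployment):
[root@ceph-osd-2 recovery]# find /var/lib/ceph/osd -type f -name 'rb.0.bcb9.238e1f29.*' -exec cp {} . \;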

Mount bobtest via a loop device:
[root@ceph-osd-2 recovery]# losetup -f
/dev/loop1
[root@ceph-osd-2 recovery]# losetup /dev/loop1 bobtest
[root@ceph-osd-2 recovery]# mount /dev/loop1 /mnt/
[root@ceph-osd-2 recovery]# ls /mnt/
test
[root@ceph-osd-2 recovery]# cat /mnt/test 
recovery test
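Once the data has been verified, the loop device can be released and the recovered image file can be imported into a healthy cluster as a new rbd. A minimal sketch, assuming a reachable healthy cluster and an illustrative target name bobtest-restored:
[root@ceph-osd-2 recovery]# umount /mnt/
[root@ceph-osd-2 recovery]# losetup -d /dev/loop1
[root@ceph-osd-2 recovery]# rbd import bobtest bobtest-restored      # import the flat image file as a new rbd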

OK~~~ That completes the full process of recovering an rbd from a failed ceph cluster (provided at least one osd server is still accessible).