This manual walks through deploying, operating, and using Ceph.
Deployment: covers Ceph resource planning, component installation and configuration, status checks, and so on, producing a high-performance, highly reliable, multi-purpose storage cluster;
Operations: scaling out, decommissioning nodes, common issues and failures, troubleshooting, etc.;
Usage: detailed demonstrations of block devices, object storage, and the file system, as well as use as Kubernetes persistent storage (PV, PVC, StorageClass), etc.
I. Deployment
This section describes how to deploy a luminous-release Ceph cluster with the ceph-deploy tool.
The host plan is as follows:
| IP | Hostname | Roles |
|---|---|---|
| 172.27.132.65 | kube-node1 | mgr、mon、osd |
| 172.27.132.66 | kube-node2 | osd |
| 172.27.132.67 | kube-node3 | mds、osd |
1. Node initialization
Configure the package repository
sudo yum install -y epel-release
cat << EOM | sudo tee /etc/yum.repos.d/ceph.repo
[ceph-noarch]
name=Ceph noarch packages
baseurl=https://download.ceph.com/rpm-luminous/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
EOM
Install dependency packages
sudo yum install -y ntp ntpdate ntp-doc openssh-server
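To keep clocks synchronized after installing the packages, the time daemon also needs to be enabled; a minimal sketch, assuming the stock CentOS 7 ntpd unit (pool.ntp.org is only an example server):
sudo ntpdate -u pool.ntp.org   # one-off sync before starting the daemon
sudo systemctl enable ntpd
sudo systemctl start ntpd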
Create and configure the ceph account
On every Ceph node, create a dedicated account for running Ceph:
sudo useradd -d /home/ceph -m ceph
sudo passwd ceph # the password is set to ceph here
Grant the ceph user sudo privileges:
echo "ceph ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ceph
sudo chmod 0440 /etc/sudoers.d/ceph
Configure host aliases
Set up /etc/hosts on all nodes so that each Ceph node can be reached by hostname. Note that a node's own hostname must not resolve to 127.0.0.1 (a quick check is sketched after the listing):
$ grep node /etc/hosts
172.27.132.65 kube-node1 kube-node1
172.27.132.66 kube-node2 kube-node2
172.27.132.67 kube-node3 kube-node3
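A suggested verification (not part of the original steps), run on each node:
getent hosts "$(hostname)"   # should print the node's real IP (e.g. 172.27.132.65 on kube-node1), never 127.0.0.1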
Disable SELinux
Disable SELinux; otherwise, later Kubernetes volume mounts may fail with Permission denied:
$ sudo setenforce 0
$ grep SELINUX /etc/selinux/config
SELINUX=disabled
Edit the config file so the change survives a reboot.
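One possible one-liner for that edit (a sketch, assuming the stock /etc/selinux/config layout):
sudo sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config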
Other
Disable requiretty: edit /etc/sudoers and comment out the Defaults requiretty line, or set Defaults:ceph !requiretty.
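A possible one-liner for the first option (a sketch; editing with visudo is the safer route):
sudo sed -i '/^Defaults.*requiretty/s/^/#/' /etc/sudoers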
Initialize the ceph-deploy node
According to the plan, node kube-node1 (172.27.132.65) serves as the deploy node.
Configure the ceph account on kube-node1 for passwordless SSH to all nodes (including itself):
su -l ceph
ssh-keygen -t rsa
ssh-copy-id ceph@kube-node1
ssh-copy-id ceph@kube-node2
ssh-copy-id ceph@kube-node3
Configure kube-node1's ceph account to log in to the other nodes as the ceph user by default:
cat >>/home/ceph/.ssh/config <<EOF
Host kube-node1
Hostname kube-node1
User ceph
Host kube-node2
Hostname kube-node2
User ceph
Host kube-node3
Hostname kube-node3
User ceph
EOF
chmod 600 ~/.ssh/config
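An optional verification (not part of the original steps): each command should print the remote hostname without prompting for a password, which confirms both the key and the ~/.ssh/config user mapping:
for h in kube-node1 kube-node2 kube-node3; do ssh "$h" hostname; done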
Install the ceph-deploy tool:
sudo yum update
sudo yum install ceph-deploy
2. Deploy the monitor node
Create the Ceph cluster and deploy the monitor node
Unless otherwise noted, all operations in this document are performed on the deploy node in the ceph user's home directory (/home/ceph).
Create a working directory for ceph-deploy to hold the files generated during installation:
su -l ceph
mkdir my-cluster
cd my-cluster
Create the Ceph cluster
Create a cluster named ceph:
[ceph@kube-node1 my-cluster]$ ceph-deploy new kube-node1 # the argument is the initial monitor node (this actually only generates ceph.conf and ceph.mon.keyring in the current directory)
Output:
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.0): /bin/ceph-deploy new kube-node1
…
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO ] making sure passwordless SSH succeeds
[kube-node1][DEBUG ] connection detected need for sudo
[kube-node1][DEBUG ] connected to host: kube-node1
[kube-node1][DEBUG ] detect platform information from remote host
[kube-node1][DEBUG ] detect machine type
[kube-node1][DEBUG ] find the location of an executable
[kube-node1][INFO ] Running command: sudo /usr/sbin/ip link show
[kube-node1][INFO ] Running command: sudo /usr/sbin/ip addr show
[kube-node1][DEBUG ] IP addresses found: [u'172.30.53.0', u'172.30.53.1', u'172.27.132.65']
[ceph_deploy.new][DEBUG ] Resolving host kube-node1
[ceph_deploy.new][DEBUG ] Monitor kube-node1 at 172.27.132.65
[ceph_deploy.new][DEBUG ] Monitor initial members are ['kube-node1']
[ceph_deploy.new][DEBUG ] Monitor addrs are ['172.27.132.65']
[ceph_deploy.new][DEBUG ] Creating a random mon key…
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring…
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf…
When the command finishes, the cluster config file ceph.conf, a log file, and the ceph.mon.keyring file used to bootstrap the monitor node have been generated in the current working directory:
[ceph@kube-node1 my-cluster]$ ls .
ceph.conf ceph-deploy-ceph.log ceph.mon.keyring
Modify the default settings in ceph.conf; the final result is as follows:
[ceph@kube-node1 my-cluster]$ cat ceph.conf
[global]
fsid = 0dca8efc-5444-4fa0-88a8-2c0751b47d28
# initial monitor node
mon_initial_members = kube-node1
mon_host = 172.27.132.65
# cephx authentication and authorization
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
# replica count, should be <= the number of OSDs
osd pool default size = 3
# minimum replica count
osd pool default min size = 1
# default PG and PGP counts
osd pool default pg num = 128
osd pool default pgp num = 128
# enable only the layering feature, which the CentOS kernel supports
rbd_default_features = 1
osd crush chooseleaf type = 1
max mds = 5
mds max file size = 100000000000000
mds cache size = 1000000
# filesystem tuning
osd_mkfs_type = xfs
osd_mount_options_xfs = rw,noatime,inode64,logbsize=256k,delaylog
osd_mkfs_options_xfs = -f -i size=2048
# journal tuning
journal_max_write_entries = 1000
journal_queue_max_ops = 3000
journal_max_write_bytes = 1048576000
journal_queue_max_bytes = 1048576000
# op tracker
osd_enable_op_tracker = false
# OSD client
osd_client_message_size_cap = 0
osd_client_message_cap = 0
# objecter
objecter_inflight_ops = 102400
objecter_inflight_op_bytes = 1048576000
# throttles
ms_dispatch_throttle_bytes = 1048576000
# OSD threads
osd_op_threads = 32
osd_op_num_shards = 5
osd_op_num_threads_per_shard = 2
# networks, for hosts with multiple NICs
public network = 10.0.0.0/8     # client <-> cluster traffic
cluster network = 10.0.0.0/8    # OSD replication and heartbeat traffic
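The pg num value of 128 matches the common rule of thumb (an explanatory aside, not from the original text): PGs per pool ≈ (number of OSDs × 100) / replica count, rounded up to the next power of two; with 3 OSDs and size = 3 this gives 100, which rounds up to 128:
python -c 'import math; osds, size = 3, 3; print(int(2 ** math.ceil(math.log(osds * 100.0 / size, 2))))'   # prints 128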
Install the Ceph packages (ceph and ceph-radosgw) on all nodes.
The --release flag selects the luminous release (the default is jewel when it is omitted):
[ceph@kube-node1 my-cluster]$ ceph-deploy install --release luminous kube-node1 kube-node2 kube-node3
Deploy the monitor node
Initialize the initial monitor node that was specified by the ceph-deploy new kube-node1 command:
[ceph@kube-node1 my-cluster]$ ceph-deploy mon create-initial # create-initial/stat/remove
Output:
[kube-node1][INFO ] monitor: mon.kube-node1 is running
[kube-node1][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.kube-node1.asok mon_status
[ceph_deploy.mon][INFO ] processing monitor mon.kube-node1
…
[kube-node1][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.kube-node1.asok mon_status
[ceph_deploy.mon][INFO ] mon.kube-node1 monitor has reached quorum!
[ceph_deploy.mon][INFO ] all initial monitors are running and have formed quorum
[ceph_deploy.mon][INFO ] Running gatherkeys…
…
[kube-node1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.kube-node1.asok mon_status
[kube-node1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-kube-node1/keyring auth get client.admin
[kube-node1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-kube-node1/keyring auth get client.bootstrap-mds
[kube-node1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-kube-node1/keyring auth get client.bootstrap-mgr
[kube-node1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-kube-node1/keyring auth get-or-create client.bootstrap-mgr mon allow profile bootstrap-mgr
[kube-node1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-kube-node1/keyring auth get client.bootstrap-osd
[kube-node1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-kube-node1/keyring auth get client.bootstrap-rgw
[ceph_deploy.gatherkeys][INFO ] Storing ceph.client.admin.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mds.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mgr.keyring
[ceph_deploy.gatherkeys][INFO ] keyring 'ceph.mon.keyring' already exists
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-osd.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-rgw.keyring
[ceph_deploy.gatherkeys][INFO ] Destroy temp directory /tmp/tmpw0Skr7
When the command completes, the keyring files {cluster-name}.bootstrap-{type}.keyring used to bootstrap mds, osd, and rgw daemons are generated in the current directory, and the client.admin user and its keyring are created. All of these keys are also stored in the cluster, to be used when the corresponding daemons are deployed later:
[ceph@kube-node1 my-cluster]$ ls -l
total 476
-rw------- 1 ceph ceph 113 Jul 5 16:08 ceph.bootstrap-mds.keyring
-rw------- 1 ceph ceph 71 Jul 5 16:08 ceph.bootstrap-mgr.keyring
-rw------- 1 ceph ceph 113 Jul 5 16:08 ceph.bootstrap-osd.keyring
-rw------- 1 ceph ceph 113 Jul 5 16:08 ceph.bootstrap-rgw.keyring
-rw------- 1 ceph ceph 129 Jul 5 16:08 ceph.client.admin.keyring
-rw-rw-r-- 1 ceph ceph 201 Jul 5 14:15 ceph.conf
-rw-rw-r-- 1 ceph ceph 456148 Jul 5 16:08 ceph-deploy-ceph.log
-rw------- 1 ceph ceph 73 Jul 5 14:15 ceph.mon.keyring
[ceph@kube-node1 my-cluster]$ ls -l /var/lib/ceph/
total 0
drwxr-x--- 2 ceph ceph 26 Apr 24 00:59 bootstrap-mds
drwxr-x--- 2 ceph ceph 26 Apr 24 00:59 bootstrap-mgr
drwxr-x--- 2 ceph ceph 26 Apr 24 00:59 bootstrap-osd
drwxr-x--- 2 ceph ceph 6 Apr 24 00:59 bootstrap-rbd
drwxr-x--- 2 ceph ceph 26 Apr 24 00:59 bootstrap-rgw
drwxr-x--- 2 ceph ceph 6 Apr 24 00:59 mds
drwxr-x--- 3 ceph ceph 29 Apr 24 00:59 mgr
drwxr-x--- 3 ceph ceph 29 Apr 24 00:59 mon
drwxr-x--- 2 ceph ceph 6 Apr 24 00:59 osd
drwxr-xr-x 2 root root 6 Apr 24 00:59 radosgw
drwxr-x--- 2 ceph ceph 6 Apr 24 00:59 tmp
[ceph@kube-node1 my-cluster]$ ls /var/lib/ceph/*/*
/var/lib/ceph/bootstrap-mds/ceph.keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring /var/lib/ceph/bootstrap-osd/ceph.keyring /var/lib/ceph/bootstrap-rgw/ceph.keyring
/var/lib/ceph/mon/ceph-kube-node1:
done keyring kv_backend store.db systemd
Push the admin keyring and the ceph.conf cluster configuration file to all nodes (into /etc/ceph/), so that subsequent ceph commands do not need the monitor address or the path to ceph.client.admin.keyring:
[ceph@kube-node1 my-cluster]$ ceph-deploy admin kube-node1 kube-node2 kube-node3
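A quick way to confirm the push worked and that the bootstrap keys from the previous step are stored in the cluster (a suggested check, not in the original steps); ceph should now run on any node without extra flags:
[ceph@kube-node1 my-cluster]$ sudo ceph -s          # reaches the monitor using /etc/ceph/ceph.conf
[ceph@kube-node1 my-cluster]$ sudo ceph auth list   # should list client.admin and the client.bootstrap-{mds,mgr,osd,rgw} keys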
Deploy the manager node (a manager node is only required for luminous and later releases):
[ceph@kube-node1 my-cluster]$ ceph-deploy mgr create kube-node1
Output:
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
…
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts kube-node1:kube-node1
[kube-node1][DEBUG ] connection detected need for sudo
[kube-node1][DEBUG ] connected to host: kube-node1
[kube-node1][DEBUG ] detect platform information from remote host
[kube-node1][DEBUG ] detect machine type
[ceph_deploy.mgr][INFO ] Distro info: CentOS Linux 7.5.1804 Core
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to kube-node1
[kube-node1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[kube-node1][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.kube-node1 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-kube-node1/keyring
[kube-node1][INFO ] Running command: sudo systemctl enable ceph-mgr@kube-node1
[kube-node1][WARNIN] Created symlink from /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@kube-node1.service to /usr/lib/systemd/system/ceph-mgr@.service.
[kube-node1][INFO ] Running command: sudo systemctl start ceph-mgr@kube-node1
[kube-node1][INFO ] Running command: sudo systemctl enable ceph.target
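To confirm the mgr daemon came up (a suggested check, not in the original steps):
[ceph@kube-node1 my-cluster]$ sudo systemctl status ceph-mgr@kube-node1   # should report active (running)
[ceph@kube-node1 my-cluster]$ sudo ceph -s | grep mgr                     # should show mgr: kube-node1(active)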
Check the cluster status
Switch to the monitor node kube-node1 and change the permissions of the keyring file so that non-root users can read it:
$ ssh ceph@kube-node1
[ceph@kube-node1 ~]$ ls /etc/ceph/
ceph.client.admin.keyring ceph.conf rbdmap tmp018nTi
[ceph@kube-node1 ~]$ ls -l /etc/ceph/ceph.client.admin.keyring
-rw------- 1 root root 129 Mar 11 23:43 /etc/ceph/ceph.client.admin.keyring
[ceph@kube-node1 ~]$ sudo chmod +r /etc/ceph/ceph.client.admin.keyring
[ceph@kube-node1 ~]$ ls -l /etc/ceph/ceph.client.admin.keyring
-rw-r--r-- 1 root root 129 Mar 11 23:43 /etc/ceph/ceph.client.admin.keyring
View the current cluster status (HEALTH_ERR is expected at this point because no OSDs have been deployed yet):
[ceph@kube-node1 my-cluster]$ ceph -s
  cluster:
    id:     b7b9e370-ea9b-4cc0-8b09-17167c876c24
    health: HEALTH_ERR
            64 pgs are stuck inactive for more than 60 seconds
            64 pgs stuck inactive
            64 pgs stuck unclean
            no osds
  services:
    mon: 1 daemons, quorum kube-node1
    mgr: kube-node1(active)
    osd: 0 osds: 0 up, 0 in
  data:
    pools:   1 pools, 64 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:     100.000% pgs not active
             64 creating
View the monitor node information:
[ceph@kube-node1 my-cluster]$ ceph mon dump
dumped monmap epoch 2
epoch 2
fsid b7b9e370-ea9b-4cc0-8b09-17167c876c24
last_changed 2018-07-05 16:34:09.194222
created 2018-07-05 16:07:57.975307
0: 172.27.132.65:6789/0 mon.kube-node1
Scale out the monitor nodes
Install the luminous-release Ceph packages:
[ceph@kube-node1 my-cluster]$ ceph-deploy
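A minimal sketch of the usual remaining steps, assuming kube-node2 and kube-node3 are the monitors being added (the node choice and exact commands here are an assumption, not from the original text):
[ceph@kube-node1 my-cluster]$ ceph-deploy install --release luminous kube-node2 kube-node3   # already done earlier in this walkthrough
[ceph@kube-node1 my-cluster]$ ceph-deploy mon add kube-node2
[ceph@kube-node1 my-cluster]$ ceph-deploy mon add kube-node3
[ceph@kube-node1 my-cluster]$ sudo ceph quorum_status --format json-pretty   # verify the new monitors joined the quorum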
