Overview
This article describes how to use SaltStack on Ubuntu.
Environment
The test environment is Ubuntu Server 14.04.
Disabled: SELinux and iptables are disabled on all Ubuntu systems.
Five virtual machines running Ubuntu Server 14.04 x86_64:
192.168.1.119 ceph-node1
192.168.1.111 ceph-node2
192.168.1.112 ceph-node3
192.168.1.113 ceph-node4
192.168.1.114 ceph-node5
We assign the SaltStack roles as follows:
All nodes act as Minions; ceph-node1 also acts as the Master.
Hostnames
Set each machine's hostname according to the allocation above: edit /etc/hostname on every machine and update the 127.0.1.1 entry in /etc/hosts to point to that name. After configuration, this test environment looks like this:
ouser@ceph-node1:~$ sudo salt '*' cmd.run 'grep 127.0.1.1 /etc/hosts'
ceph-node2:
    127.0.1.1 ceph-node2
ceph-node4:
    127.0.1.1 ceph-node4
ceph-node1:
    127.0.1.1 ceph-node1
ceph-node5:
    127.0.1.1 ceph-node5
ceph-node3:
    127.0.1.1 ceph-node3
ouser@ceph-node1:~$ sudo salt '*' cmd.run 'cat /etc/hostname'
ceph-node1:
    ceph-node1
ceph-node5:
    ceph-node5
ceph-node4:
    ceph-node4
ceph-node3:
    ceph-node3
ceph-node2:
    ceph-node2
Installation
Run each installation command on the virtual machines that hold the corresponding role.
Master role
sudo apt-get install salt-master salt-minion
Minion role
sudo apt-get install salt-minion
Configuration
Only the Minions need configuration. Edit /etc/salt/minion on every Minion machine and set the master option:
master: 192.168.1.119
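Since Salt itself is not yet available to push this change, one way to set the option on each machine is a quick one-liner; this is a minimal sketch that assumes the stock /etc/salt/minion still has the master option commented out:

# Append the master address to the default minion config (assumes 'master:' is not already set)
echo 'master: 192.168.1.119' | sudo tee -a /etc/salt/minion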
Then restart the salt-minion service on all Minion machines:
sudo /etc/init.d/salt-minion restart
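To confirm the restart succeeded on a minion, you can check the service status and the minion log (default log path assumed):

# Service should be reported as running
sudo service salt-minion status
# Look for connection errors in the minion log
sudo tail /var/log/salt/minion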
Testing
Note: unless stated otherwise, all of the following commands are executed on the Master server.
Accepting Minion keys
After all Minions are configured and their salt-minion service has been restarted, run sudo salt-key -L on the Master to list the keys waiting to be accepted:
$ sudo salt-key -L
Accepted Keys:
Unaccepted Keys:
ceph-node1
ceph-node2
ceph-node3
ceph-node4
ceph-node5
Rejected Keys:
Run sudo salt-key -A to accept all of these keys:
$ sudo salt-key -A
The following keys are going to be accepted:
Unaccepted Keys:
ceph-node1
ceph-node2
ceph-node3
ceph-node4
ceph-node5
Proceed? [n/Y] Y
Key for minion ceph-node1 accepted.
Key for minion ceph-node2 accepted.
Key for minion ceph-node3 accepted.
Key for minion ceph-node4 accepted.
Key for minion ceph-node5 accepted.
Bulk connectivity test
$ sudo salt '*' test.ping
ceph-node2:
    True
ceph-node1:
    True
ceph-node5:
    True
ceph-node4:
    True
ceph-node3:
    True
Bulk command execution
$ sudo salt '*' cmd.run 'hostname -s'
ceph-node2:
    ceph-node2
ceph-node5:
    ceph-node5
ceph-node1:
    ceph-node1
ceph-node4:
    ceph-node4
ceph-node3:
    ceph-node3
Installing Ceph
Reference
http://ceph.com/docs/master/install/manual-deployment/

Overview
This article installs a Ceph environment manually, using SaltStack for batch management.
Make sure the SaltStack environment is installed as described in the guide above.
The machines are allocated as follows:
192.168.1.119 ceph-node1
192.168.1.111 ceph-node2
192.168.1.112 ceph-node3
192.168.1.113 ceph-node4
192.168.1.114 ceph-node5
SaltStack role assignment
All nodes act as Minions; ceph-node1 also acts as the Master.
Ceph role assignment
All nodes act as OSD nodes; ceph-node1 additionally runs the monitor.
Note: all salt commands are executed on the SaltStack Master server.
Preparation
Fix the locale warnings:
sudo salt '*' cmd.run 'locale-gen zh_CN.UTF-8'
Installation
Ceph Storage Cluster
$ sudo salt '*' cmd.run 'apt-get install -y ceph ceph-mds'
Deploy a Cluster Manually
Every Ceph cluster needs at least one monitor, and at least as many OSDs as there are copies of an object stored on the cluster.
Monitor Bootstrapping
As the first step, we will set up the monitor service on ceph-node1.
Bootstrapping a monitor requires:
Unique Identifier: the fsid
Cluster Name: the default name is ceph
Monitor Name: defaults to the hostname; use hostname -s to get the short name
Monitor Map: generated later from the hostname, IP address and fsid
Monitor Keyring: monitors need a secret key to communicate with each other
Administrator Keyring: using the ceph command requires a client.admin user

Procedure
We bootstrap the monitor on ceph-node1.
Log in to ceph-node1.
Make sure the /etc/ceph directory exists.
We use the default cluster name, ceph, so create the configuration file /etc/ceph/ceph.conf.
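If the directory or file is missing (the ceph packages normally create /etc/ceph), a minimal sketch to create them:

# Create the config directory and an empty cluster config file if they do not exist
sudo mkdir -p /etc/ceph
sudo touch /etc/ceph/ceph.conf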
Generate an fsid with the uuidgen command:
$ uuidgen
4e7d2940-7824-4b43-b85e-1078a1b54cb5
Set the fsid in ceph.conf:
fsid = 4e7d2940-7824-4b43-b85e-1078a1b54cb5
Configure the remaining ceph.conf settings:
mon initial members = ceph-node1
mon host = 192.168.1.119
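As a sketch, these initial options can be written to /etc/ceph/ceph.conf in one step (assuming the file is still empty; the complete file used in this test is shown further below):

# Write the minimal [global] section; the values are the ones generated above
cat <<'EOF' | sudo tee /etc/ceph/ceph.conf
[global]
fsid = 4e7d2940-7824-4b43-b85e-1078a1b54cb5
mon initial members = ceph-node1
mon host = 192.168.1.119
EOF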
Create a keyring for your cluster and generate a monitor secret key.
$ ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
Generate an administrator keyring, generate a client.admin user and add the user to the keyring.
$ sudo ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --set-uid=0 --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow'
Add the client.admin key to the ceph.mon.keyring.
$ sudo ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
Generate a monitor map using the hostname(s), host IP address(es) and the FSID. Save it as /tmp/monmap:
$ monmaptool --create --add ceph-node1 192.168.1.119 --fsid 4e7d2940-7824-4b43-b85e-1078a1b54cb5 /tmp/monmap
Create a default data directory (or directories) on the monitor host(s).
$ sudo mkdir /var/lib/ceph/mon/ceph-node1
Populate the monitor daemon(s) with the monitor map and keyring.
$ sudo ceph-mon --mkfs -i ceph-node1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
The final /etc/ceph/ceph.conf looks like this:
[global]
fsid = 4e7d2940-7824-4b43-b85e-1078a1b54cb5
mon initial members = ceph-node1
mon host = 192.168.1.119
public network = 192.168.1.0/24
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd journal size = 1024
filestore xattr use omap = true
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 333
osd pool default pgp num = 333
osd crush chooseleaf type = 1
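Since the other nodes will also need this configuration when their OSDs are set up, and the cluster is already managed with SaltStack, one way to distribute the file is salt-cp; a sketch run on the Master, assuming /etc/ceph already exists on every node:

# Push the finished ceph.conf from the Master to all minions
sudo salt-cp '*' /etc/ceph/ceph.conf /etc/ceph/ceph.conf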
Start the monitor:
sudo start ceph-mon id=ceph-node1
Verify that Ceph created the default pools.
sudo ceph osd lspools
You should see output like this:
0 data,1 metadata,2 rbd,
Verify that the monitor is running:
ouser@ceph-node1:~$ sudo ceph -s
    cluster 4e7d2940-7824-4b43-b85e-1078a1b54cb5
    health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
    monmap e1: 1 mons at {ceph-node1=192.168.1.119:6789/0}, election epoch 2, quorum 0 ceph-node1
    osdmap e1: 0 osds: 0 up, 0 in
    pgmap v2: 192 pgs, 3 pools, 0 bytes data, 0 objects
        0 kB used, 0 kB / 0 kB avail
        192 creating
Adding OSDs
Now that one monitor is set up, it is time to add OSDs. The cluster can only reach the correct active + clean state once enough OSDs have been added; the osd pool default size = 2 setting means at least two OSD nodes must join.
After bootstrapping the monitor, the cluster has a default CRUSH map, but that CRUSH map does not yet map any Ceph OSD Daemons to a Ceph Node.
Short Form
Ceph provides a ceph-disk utility that prepares a disk, partition, or directory for Ceph. ceph-disk automatically performs the Long Form steps below.
Run the following commands on ceph-node1 and ceph-node2 to create an OSD on each:
$ sudo ceph-disk prepare --cluster ceph --cluster-uuid 4e7d2940-7824-4b43-b85e-1078a1b54cb5 --fs-type ext4 /dev/hdd1
Activate the OSD:
$ sudo ceph-disk activate /dev/hdd1
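After the OSDs on both nodes are activated, you can check that they have joined the cluster and that it eventually reaches active + clean, for example:

# List the OSDs and their placement in the CRUSH map
sudo ceph osd tree
# Overall cluster status; health should eventually become HEALTH_OK
sudo ceph -s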
Long Form
Log in to the OSD node. Create the OSD manually and add it to the cluster and the CRUSH map.
Generate a UUID for the OSD:
$ uuidgen
b373f62e-ddf6-41d5-b8ee-f832318a31e1
Create the OSD. If no UUID is given, one is assigned automatically when the OSD starts. The command below prints the osd number, which is needed in later steps:
$ sudo ceph osd create b373f62e-ddf6-41d5-b8ee-f832318a31e1
1
Log in to the new OSD node and run:
$ ssh {new-osd-host}
$ sudo mkdir /var/lib/ceph/osd/ceph-{osd-number}
If the OSD is for a drive other than the OS drive, prepare it for use with Ceph, and mount it to the directory you just created:
$ ssh {new-osd-host}
$ sudo mkfs -t {fstype} /dev/{hdd}
$ sudo mount -o user_xattr /dev/{hdd} /var/lib/ceph/osd/ceph-{osd-number}
Initialize the OSD data directory.
$ ssh {new-osd-host}
$ sudo ceph-osd -i 1 --mkfs --mkkey --osd-uuid b373f62e-ddf6-41d5-b8ee-f832318a31e1
Register the OSD authentication key. The value of ceph for ceph-{osd-num} in the path is the $cluster-$id; if your cluster name differs from ceph, use your cluster name instead:
$ sudo ceph auth add osd.{osd-num} osd 'allow *' mon 'allow profile osd' -i /var/lib/ceph/osd/ceph-{osd-num}/keyring
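To confirm the key was registered, you can list the cluster's auth entries, e.g.:

# The new osd.{osd-num} entry should appear with the caps granted above
sudo ceph auth list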
Storage: Ceph
Build a three node ceph storage cluster
It is recommended that you look through the official installation documents for the most up-to-date information: http://ceph.com/docs/master/install/
Currently, it is not possible to build the cluster on the Proxmox host itself. For a production system you need at least 3 servers. For testing you can get by with fewer, although you may be unable to properly test all the features of the cluster.
Proxmox Supports CEPH >= 0.56
Prepare nodes
Install Ubuntu. It is recommended to use Ubuntu 12.04 LTS, the distribution used by Inktank for Ceph development (you need a recent filesystem version and glibc).
Create an SSH key on server1 and distribute it. Generate an SSH key:
ssh-keygen -t rsa
and copy it to the other servers
ssh-copy-id user@server2
ssh-copy-id user@server3
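To verify that key-based login works, a quick check using the same placeholder names as above:

# Should print the remote hostname without prompting for a password
ssh user@server2 hostname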
Configure ntp on all nodes to keep time updated:
sudo apt-get install ntp
Install Ceph-Deploy
Create entries for all other Ceph nodes in /etc/hosts.
Add the Ceph repositories:
wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | sudo apt-key add -
echo deb http://ceph.com/debian-dumpling/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list
Install packages
sudo apt-get update
sudo apt-get install ceph-deploy
Create cluster using Ceph-Deploy
Create your cluster:
ceph-deploy new server1
Install Ceph on all nodes
ceph-deploy install server1 server2 server3
You could also run:
ceph-deploy install server{1..3}
Add a Ceph monitor.
ceph-deploy mon create server{1..3}
(You must have an odd number of monitors. If you only have one it will be a single point of failure so consider using at least 3 for high availability.)
Gather keys:
ceph-deploy gatherkeys server1
Prepare OSDs on each server. For each data disk you need one OSD daemon. It is assumed that these disks are empty and contain no data; zap will delete all data on the disks. Verify the names of your data disks!
sudo fdisk -l
For servers that are not identical:
ceph-deploy osd --zap-disk create server1:sdb
ceph-deploy osd --zap-disk create server2:sdb
ceph-deploy osd --zap-disk create server3:sdc
For 3 identical servers, each with 3 data disks (sdb, sdc, sdd)
ceph-deploy osd --zap-disk create server{1..3}:sd{b..d}
By default the journal is placed on the same disk. To change this, specify the path to the journal:
ceph-deploy osd prepare {node-name}:{disk}[:{path/to/journal}]
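For example, to put the journal on a separate SSD partition (the device name /dev/sdg1 here is only a hypothetical example):

# Hypothetical example: OSD data on sdb, journal on SSD partition /dev/sdg1
ceph-deploy osd prepare server1:sdb:/dev/sdg1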
Check the health of the cluster
sudo ceph -s
Customize Ceph
Set your number of placement groups:
sudo ceph osd pool set rbd pg_num 512
The following formula is generally used:
Total PGs = (# of OSDs * 100) / Replicas
Take this result and round up to the nearest power of 2. For 9 OSDs you would do: 9 * 100 = 900. The default number of replicas is 2, so 900 / 2 = 450, which rounds up to the next power of 2, 512.
Create a new pool:
sudo ceph osd pool create {name_of_pool} {pg_num}
Example:
sudo ceph osd pool create pve_data 512
Change the number of replicas for a pool:
sudo ceph osd pool set {name_of_pool} size {number_of_replicas}
Example:
sudo ceph osd pool set pve_data size 3
Configure Proxmox to use the ceph cluster
GUI
You can use the Proxmox GUI to add the RBD storage.
Manual configuration
Edit your /etc/pve/storage.cfg and add the configuration:
rbd: mycephcluster
    monhost 192.168.0.1:6789;192.168.0.2:6789;192.168.0.3:6789
    pool rbd (optional, default = rbd)
    username admin (optional, default = admin)
    content images
Note: you must use IP addresses (not DNS FQDNs) for monhost.
Authentication
If you use cephx authentication, you need to copy the keyfile from Ceph to the Proxmox VE host.
Create the /etc/pve/priv/ceph directory
mkdir /etc/pve/priv/ceph
Copy the keyring
scp cephserver1:/etc/ceph/ceph.client.admin.keyring /etc/pve/priv/ceph/StorageID.keyring
The keyring must be named to match your Storage ID
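For the example storage defined above (Storage ID mycephcluster), that would be:

# The keyring file name must match the Storage ID "mycephcluster"
scp cephserver1:/etc/ceph/ceph.client.admin.keyring /etc/pve/priv/ceph/mycephcluster.keyring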
Copying the keyring generally requires root privileges. If you do not have the root account enabled on Ceph, you can "sudo scp" the keyring from the Ceph server to Proxmox.
Note that for early versions of Ceph (Argonaut), the keyring was named ceph.keyring rather than ceph.client.admin.keyring.