Pacemaker cluster management: it supports fencing and also monitors the managed services themselves.
Corosync provides the underlying heartbeat layer used to detect node membership.
I. Building the cluster architecture
1) Configure the yum repository
On server1 and server2:
yum install pacemaker -y
yum install crmsh-1.2.6-0.rc2.2.1.x86_64.rpm pssh-2.3.1-2.1.x86_64.rpm -y ##these two packages have to be downloaded and installed manually
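The repository configuration itself is not shown above. On RHEL 6 the pacemaker packages live in the HighAvailability directory of the installation media, so a repo stanza roughly like the following would be added on both nodes (the repo name and baseurl here are only assumptions for this lab environment):
vim /etc/yum.repos.d/rhel-source.repo
[HighAvailability]
name=HighAvailability
baseurl=http://172.25.23.250/rhel6.5/HighAvailability ##assumed address where the installation media is served
gpgcheck=0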
2) Configuration
server1:
cd /etc/corosync/
cp corosync.conf.example corosync.conf ##copy the example as the main configuration file
vim corosync.conf ##edit the configuration
        ringnumber: 0
        bindnetaddr: 172.25.23.0 ##network segment of the cluster nodes
        mcastaddr: 226.94.1.1 ##multicast address
        mcastport: 5428 ##multicast port
        ttl: 1
        ...
        logfile: /var/log/cluster/corosync.log ##where the log is written
        ...
service { ##added block: ties pacemaker to corosync so that starting corosync also starts pacemaker
        name: pacemaker ##the pacemaker service
        ver: 0 ##0 means corosync starts pacemaker automatically, 1 means pacemaker must be started separately
}
##the node configuration is identical, so send server1's configuration file to server2
scp /etc/corosync/corosync.conf root@172.25.23.2:/etc/corosync/
server2:
##check the configuration file
cd /etc/corosync/
vim corosync.conf
3) Start the corosync service
server1:
/etc/init.d/corosync start
server2:
/etc/init.d/corosync start
tail -f /var/log/cluster/corosync.log ##watch the log to confirm the service starts cleanly
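Besides the log, the ring status can be checked directly (assuming the default corosync tools are installed):
corosync-cfgtool -s ##shows the ring id and "no faults" when the ring is healthy
crm_mon -1 ##one-shot cluster status instead of the continuous monitor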
4) Test it
##In the crm_mon output, Online lists the nodes that are healthy and able to take over resources; the resources added later appear below the node list, and Started shows which node a resource is currently running on (server1 here)
By default pacemaker has no automatic failback.
crm can be used in two ways: interactively and non-interactively.
server1:
crm_mon ##live monitor
server2:
##interactive mode is used here
[root@server2 corosync]# crm ##enter the interactive shell; the Tab key completes commands
crm(live)# node ##enter node management mode
crm(live)node# standby server1 ##put server1 into standby
crm(live)node#
server1:
Node server1: standby ##server1 state
Online: [ server2 ]
##bring server1 back online
server2:
crm(live)node# online server1
server1:
Online: [ server1 server2 ]
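The same node operations can also be issued non-interactively from the shell, which is the second usage mode mentioned above:
crm node standby server1 ##equivalent to the interactive standby command
crm node online server1 ##bring it back online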
5) Use the interactive shell on a cluster node to handle the fence (stonith) setting
server2:
[root@server2 corosync]# crm
crm(live)# configure ##enter the configure section
crm(live)configure# show ##show displays the current cluster configuration
node server1 ##the cluster nodes
node server2
property $id="cib-bootstrap-options" \
	dc-version="1.1.10-14.el6-368c726" \
	cluster-infrastructure="classic openais (with plugin)" \
	expected-quorum-votes="2"
crm(live)configure# verify
error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
crm(live)configure# property stonith-enabled=false ##disable stonith for now, since no fence device has been defined yet
crm(live)configure# verify ##validate
crm(live)configure# commit ##commit
6) Add resources
On server2:
crm(live)configure# primitive vip ocf:heartbeat:IPaddr2 params ip=172.25.23.100 nic=eth0 cidr_netmask=24 ##create the vip resource, bound to eth0
crm(live)configure# verify
crm(live)configure# commit
##check on server1
server1:
Online: [ server1 server2 ]
vip (ocf::heartbeat:IPaddr2): Started server1
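To confirm the VIP is actually bound, check the interface on the node reported as Started (a quick sanity check):
ip addr show eth0 ##172.25.23.100/24 should appear as a secondary address on eth0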
7) Install the apache service
server1:
yum install httpd -y
vim /var/www/html/index.html
server2:
yum install httpd -y
vim /var/www/html/index.html
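The test pages simply contain the hostname, so the curl tests later show which node is serving, e.g.:
echo server1 > /var/www/html/index.html ##on server1; put "server2" in the file on server2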
server1:
[root@server1 corosync]# crm
crm(live)# configure
crm(live)configure# primitive apache ##pressing Tab here lists the available resource classes:
lsb: ocf: service: stonith:
crm(live)configure# primitive apache lsb:httpd op monitor interval=10s ##use the lsb httpd script, checked every 10s
crm(live)configure# group website vip apache ##group vip and apache so they stay on the same node
crm(live)configure# verify
crm(live)configure# commit ##commit
Test:
[root@foundation23 pacemaker]# curl 172.25.23.100
server1
[root@server1 corosync]# /etc/init.d/corosync stop ##stop corosync on server1 to simulate a node failure
Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]
Waiting for corosync services to unload:. [ OK ]
server2:
Online: [ server2 ]
OFFLINE: [ server1 ]
##When the service is stopped the node shows as OFFLINE; in a two-node cluster, when one node goes down the surviving node drops the resources and no longer takes them over, because a single node cannot form a quorum.
On the physical host:
[root@foundation23 pacemaker]# curl 172.25.23.100
(server not found)
On server1:
crm(live)configure# property no-quorum-policy=ignore ##ignore the quorum requirement so a single node can still run resources
crm(live)configure# verify
crm(live)configure# commit ##commit so it takes effect
crm(live)configure# exit
bye
On the physical host:
[root@foundation23 pacemaker]# curl 172.25.23.100
server2
II. Adding fencing
1) Start the fence service
On the physical host:
systemctl status fence_virtd.service
cd /etc/cluster/
rm -fr fence_xvm.key
vim /etc/fence_virt.conf
dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=128 count=1
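The relevant parts of the /etc/fence_virt.conf edited above look roughly like the following (a sketch; the bridge interface and multicast address depend on the host setup and are assumptions here):
fence_virtd {
        listener = "multicast";
        backend = "libvirt";
}
listeners {
        multicast {
                key_file = "/etc/cluster/fence_xvm.key";
                address = "225.0.0.12"; ##assumed default multicast address
                interface = "br0"; ##assumed host bridge the VMs are attached to
        }
}
backends {
        libvirt {
                uri = "qemu:///system";
        }
}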
server1:
mkdir /etc/cluster
server2:
mkdir /etc/cluster
On the physical host:
scp fence_xvm.key root@172.25.23.1:/etc/cluster/
scp fence_xvm.key root@172.25.23.2:/etc/cluster/
systemctl start fence_virtd.service
systemctl status fence_virtd.service
2) Add the fence device
server1:
stonith_admin -I ##list the available fence agents
yum provides */fence_xvm
yum install fence-virt-0.2.3-15.el6.x86_64 -y
crm_mon ##monitor the whole cluster
chkconfig corosync on ##enable corosync at boot
server2:
stonith_admin -I
yum provides */fence_xvm
yum install fence-virt-0.2.3-15.el6.x86_64 -y
stonith_admin -I
fence_xvm
fence_virt
fence_pcmk
fence_legacy
4 devices found
[root@server2 cluster]# crm
crm(live)# configure
crm(live)configure# property stonith-enabled=true ##re-enable stonith
crm(live)configure# primitive vmfence stonith:fence_xvm params pcmk_host_map="server1:server1;server2:server2" op monitor interval=1min ##maps hostname to virtual machine (domain) name; the monitor checks the fence resource every minute
crm(live)configure# verify ##validate
crm(live)configure# commit ##commit
crm(live)configure# exit
chkconfig corosync on ##enable corosync at boot
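Before relying on the fence device, one can check from a cluster node that fence_virtd on the host actually answers (assumes the key and the multicast path are working):
fence_xvm -o list ##should list the virtual machines known to fence_virtd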
3) Tests
Test 1:
Stop apache on server2; the cluster notices the failure through the resource monitor and restarts the service.
[root@server2 cluster]# /etc/init.d/httpd stop
Stopping httpd: [ OK ]
[root@server2 cluster]# /etc/init.d/httpd status
httpd (pid 2986) is running…
On the physical host:
[root@foundation23 cluster]# curl 172.25.23.100
server2
Test 2: crash the kernel on server2; fence reboots server2.
[root@server2 cluster]# echo c >/proc/sysrq-trigger
[root@foundation23 cluster]# curl 172.25.23.100
server1
By default pacemaker has no automatic failback.
III. NFS shared filesystem
1)
On server3:
yum install nfs-utils rpcbind -y
/etc/init.d/rpcbind start
mkdir -p /web/htdocs
ll -d /web/htdocs/
chmod o+w /web/htdocs/
[root@server3 ~]# ll -d /web/htdocs/
drwxr-xrwx 2 root root 4096 Apr 16 13:51 /web/htdocs/
vim /etc/exports
/web/htdocs 172.25.23.0/24(rw)
exportfs -r
showmount -e
clnt_create: RPC: Program not registered ##the nfs service has not been started yet
[root@server3 ~]# cd /web/htdocs/
[root@server3 htdocs]# vim index.html
/etc/init.d/nfs start
/etc/init.d/nfs status
/etc/init.d/rpcbind status
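Once nfs is running, the export can be verified from a cluster node (a quick check, assuming nothing blocks the RPC ports):
showmount -e 172.25.23.3 ##should list /web/htdocs 172.25.23.0/24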
2)
server2:
[root@server2 corosync]# mount -t nfs 172.25.23.3:/web/htdocs/ /mnt ##test-mount the export
[root@server2 corosync]# df ##confirm it is mounted
[root@server2 corosync]# umount /mnt/
[root@server2 corosync]# crm_mon ##monitor from server2
3)
server1:
mount -t nfs 172.25.23.3:/web/htdocs/ /mnt
df
umount /mnt/
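The webdata resource that appears in the resource list below is a Filesystem primitive for this NFS export. Its creation is not shown in these notes, but it would look roughly like this (a sketch; the mount point /var/www/html is an assumption so that apache serves the shared content):
crm(live)configure# primitive webdata ocf:heartbeat:Filesystem params device=172.25.23.3:/web/htdocs directory=/var/www/html fstype=nfs op monitor interval=30s
crm(live)configure# verify
crm(live)configure# commit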
[root@server1 corosync]# crm
crm(live)# configure
crm(live)configure# show
crm(live)configure# cd
crm(live)# resource
crm(live)resource# show
Resource Group: website
    vip (ocf::heartbeat:IPaddr2): Stopped
    apache (lsb:httpd): Stopped
vmfence (stonith:fence_xvm): Started
webdata (ocf::heartbeat:Filesystem): Started
crm(live)resource# stop website
crm(live)resource# show
Resource Group: website
    vip (ocf::heartbeat:IPaddr2): Stopped
    apache (lsb:httpd): Stopped
vmfence (stonith:fence_xvm): Started
webdata (ocf::heartbeat:Filesystem): Started
crm(live)resource# cd
crm(live)# configure
crm(live)configure# delete website
crm(live)configure# show
crm(live)configure# group webgroup vip webdata apache ##new group: vip, the NFS filesystem, then apache, started in this order
crm(live)configure# show
crm(live)configure# verify
crm(live)configure# commit
Test: on server1
crm(live)configure# cd
crm(live)# node
crm(live)node# show
server2: normal
server1: normal
standby: off
crm(live)node# standby server1 ##take the node offline; put whichever server apache is currently running on into standby (watch it from crm_mon on server2)
crm(live)node# show
server2: normal
server1: normal
standby: on
crm(live)node# online server1 ##bring it back online
After server1 comes back online, apache fails back to server1, which is the expected result here: pacemaker does not fail back by default, but the fence device and the services should preferably not run on the same host, so the cluster separates them automatically.
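If this placement should be enforced explicitly rather than left to the scheduler, a negative colocation constraint would keep the fence device off the node running the web group (a sketch, not part of the original configuration):
crm(live)configure# colocation fence_apart_from_web -inf: vmfence webgroup
crm(live)configure# commit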