Linux: MySQL-MMM

MMM official site: http://mysql-mmm.org

MMM (Multi-Master Replication Manager for MySQL) is a scalable suite of Perl scripts for monitoring, failover, and management of MySQL master-master replication setups (only one node accepts writes at any given time). MMM can also load-balance reads across the slaves, so it can be used to run virtual IPs on a group of replicating servers; in addition, it ships scripts for data backup and for resynchronizing nodes. MySQL itself provides no replication-failover solution; with MMM, server failover can be automated, giving MySQL high availability. Beyond floating IPs, when the current master goes down MMM automatically repoints the backend slaves to the new master for replication, with no manual change to the replication configuration. It is a comparatively mature solution.

Advantages:

High availability and good scalability, with automatic switchover on failure. For the master-master pair, only one database accepts writes at any time, which preserves data consistency. When the active master goes down, the other master takes over immediately and the slaves switch over automatically, with no manual intervention.

Disadvantages:

The monitor node is a single point of failure, though it can itself be made highly available with keepalived or heartbeat. At least three nodes are required, so there is a host-count requirement; read/write splitting must be implemented, which means writing a read/write-splitting layer in front. Under very read/write-heavy workloads it is not very stable and may suffer replication lag or failed switchovers. MMM is not well suited to environments that demand high data safety while being busy with both reads and writes.

Suitable scenarios: MMM fits deployments with heavy database traffic where read/write splitting is possible.

Main MMM functions:

Provided by three programs:

  • mmm_mond: the monitoring daemon, responsible for all monitoring work and for deciding on node removal (mmm_mond runs periodic heartbeat checks; on failure it floats the write IP over to the other master).
  • mmm_agentd: the agent daemon running on each MySQL server, exposing a simple set of remote services to the monitor node.
  • mmm_control: a command-line tool for managing the mmm_mond process.

For the monitoring to work, the relevant users must be granted in MySQL: an mmm_monitor user and an mmm_agent user; to use MMM's backup tools, an mmm_tools user is needed as well.

Environment setup:

Environment:

Hostname    IP              VIP (write)     VIP (read)
master01    192.168.1.20    192.168.1.50
master02    192.168.1.21                    192.168.1.51
slave01     192.168.1.22                    192.168.1.52
slave02     192.168.1.23                    192.168.1.53
monitor     192.168.1.24

PS: All preparation steps are done on master01, and the other four hosts are set up the same way. Since this is a test environment, the other four can simply be cloned from master01; do not do that in production!

1. Preparation

1.1 Install Perl dependencies

[root@master01 ~]# yum -y install perl-* libart_lgpl.x86_64 rrdtool.x86_64 rrdtool-perl.x86_64

1.2 Install the cpan tool

Reference for changing the default CPAN mirror: http://www.361way.com/change-cpan-default-mirror/5094.html

[root@master01 ~]# cpan
# just keep pressing Enter to accept the defaults
......

# the stock mirrors are overseas; switch to the Aliyun mirror in China
[root@master01 ~]# cpan

cpan shell -- CPAN exploration and modules installation (v1.9800)
Enter 'h' for help.

# view the configuration help
cpan[1]> o conf
$CPAN::Config options from /root/.cpan/CPAN/MyConfig.pm:
    commit             [Commit changes to disk]
    defaults           [Reload defaults from disk]
    help               [Short help about 'o conf' usage]
    init               [Interactive setting of all options]
......

# view the current URL list
cpan[2]> o conf urllist
    urllist           
	0 [http://mirror.intergrid.com.au/cpan/]
	1 [http://cpan.catalyst.net.nz/CPAN/]
Type 'o conf' to view all configuration items

# remove the current URLs
cpan[3]> o conf urllist pop http://mirror.intergrid.com.au/cpan/
Please use 'o conf commit' to make the config permanent!


cpan[4]> o conf urllist pop http://cpan.catalyst.net.nz/CPAN/
Please use 'o conf commit' to make the config permanent!

# add the Aliyun mirror
cpan[5]> o conf urllist push  http://mirrors.aliyun.com/CPAN/
Please use 'o conf commit' to make the config permanent!

# commit (save) the configuration
cpan[6]> o conf commit
commit: wrote '/root/.cpan/CPAN/MyConfig.pm'

cpan[7]> exit
Lockfile removed.

1.3 Install the required Perl modules

Modules can also be downloaded manually from: https://mirrors.aliyun.com/CPAN/?spm=a2c6h.13651104.0.0.7288cbf2FZqEzD

[root@master01 ~]# cpan -i Algorithm::Diff Class::Singleton DBI DBD::mysql Log::Dispatch Log::Log4perl Mail::Send Net::Ping Proc::Daemon Time::HiRes Params::Validate Net::ARP
# a few modules need to be downloaded and built manually, as follows
[root@master01 ~]#  wget https://cpan.metacpan.org/authors/id/C/CR/CRAZYDJ/Net-ARP-1.0.11.tgz
[root@master01 ~]# wget https://cpan.metacpan.org/authors/id/D/DV/DVEEDEN/DBD-mysql-4.050.tar.gz
[root@master01 ~]# wget https://cpan.metacpan.org/authors/id/R/RU/RURBAN/Net-Ping-2.74.tar.gz
[root@master01 ~]# tar zxf Net-Ping-2.74.tar.gz 
[root@master01 ~]# tar zxf Net-ARP-1.0.11.tgz 
[root@master01 ~]# tar zxf DBD-mysql-4.050.tar.gz
[root@master01 ~]# cd Net-Ping-2.74/
[root@master01 Net-Ping-2.74]# perl Makefile.PL 
[root@master01 Net-Ping-2.74]# make install
[root@master01 Net-Ping-2.74]# cd ../Net-ARP-1.0.11/
[root@master01 Net-ARP-1.0.11]# perl Makefile.PL 
[root@master01 Net-ARP-1.0.11]# make install
[root@master01 Net-ARP-1.0.11]# cd ../DBD-mysql-4.050/
[root@master01 DBD-mysql-4.050]# perl Makefile.PL 
[root@master01 DBD-mysql-4.050]# make install

# verify that all modules are now installed
[root@master01 DBD-mysql-4.050]# cpan -i Algorithm::Diff Class::Singleton DBI DBD::mysql Log::Dispatch Log::Log4perl Mail::Send Net::Ping Proc::Daemon Time::HiRes Params::Validate Net::ARP
cpan -i Algorithm::Diff Class::Singleton DBI DBD::mysql Log::Dispatch Log::Log4perl Mail::Send Net::Ping Proc::Daemon Time::HiRes Params::Validate Net::ARP
CPAN: Storable loaded ok (v2.45)
Reading '/root/.cpan/Metadata'
  Database was generated on Tue, 09 Mar 2021 15:17:03 GMT
CPAN: Module::CoreList loaded ok (v5.20210220)
Algorithm::Diff is up to date (1.201).
Class::Singleton is up to date (1.6).
DBI is up to date (1.643).
DBD::mysql is up to date (4.050).
Log::Dispatch is up to date (2.70).
Log::Log4perl is up to date (1.54).
Mail::Send is up to date (2.21).
Net::Ping is up to date (2.74).
Proc::Daemon is up to date (0.23).
Time::HiRes is up to date (1.9764).
Params::Validate is up to date (1.30).
Net::ARP is up to date (1.0.11).

1.4 NTP service configuration

[root@master01 DBD-mysql-4.050]# vim /etc/chrony.conf 
server cn.pool.ntp.org iburst
......
allow 192.168.1.0/24
......
[root@master01 DBD-mysql-4.050]# systemctl restart chronyd
[root@master01 DBD-mysql-4.050]# vim /etc/hosts
[root@master01 DBD-mysql-4.050]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.20	master01
192.168.1.21	master02
192.168.1.22	slave01
192.168.1.23	slave02
192.168.1.24	monitor
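Name resolution for the scp loop used later (section 3.3) depends on these /etc/hosts entries. A tiny sketch of how such a hostname-to-IP mapping can be checked against a local copy of the entries above (the lookup helper is our own illustration, not part of MMM):

```shell
# Write a local copy of the /etc/hosts entries and look names up in it
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
192.168.1.20 master01
192.168.1.21 master02
192.168.1.22 slave01
192.168.1.23 slave02
192.168.1.24 monitor
EOF

# print the IP for a given hostname (match column 2, print column 1)
lookup() { awk -v h="$2" '$2 == h { print $1 }' "$1"; }

lookup "$hosts_file" master02   # -> 192.168.1.21
```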

2. Database configuration

2.1 MySQL configuration

PS: Because these hosts are clones, the default auto.cnf file (which stores the MySQL server UUID) is identical on every host, so delete the auto-generated file as shown below. Note that server-id must not repeat.

# run on all four DB hosts
[root@master01 ~]# rm -rf /usr/local/mysql/data/auto.cnf

#master01(192.168.1.20)
[root@master01 ~]# vim /etc/my.cnf
[root@master01 ~]# cat /etc/my.cnf
[mysqld]
basedir=/usr/local/mysql
datadir=/usr/local/mysql/data
pid-file=/usr/local/mysql/data/mysqld.pid
log-error=/usr/local/mysql/data/mysql.err
socket=/var/lib/mysql/mysql.sock
server-id = 1
binlog_format = mixed
log-bin = mysql-bin
relay-log = relay-bin
relay-log-index = slave-relay-bin.index
log-slave-updates = 1
auto-increment-increment = 2
auto-increment-offset = 1
[root@master01 ~]# systemctl restart mysqld.service 

#master02(192.168.1.21)
[root@master02 ~]# vim /etc/my.cnf
[root@master02 ~]# cat /etc/my.cnf
[mysqld]
basedir=/usr/local/mysql
datadir=/usr/local/mysql/data
pid-file=/usr/local/mysql/data/mysqld.pid
log-error=/usr/local/mysql/data/mysql.err
socket=/var/lib/mysql/mysql.sock
server-id = 2
binlog_format = mixed
log-bin = mysql-bin
relay-log = relay-bin
relay-log-index = slave-relay-bin.index
log-slave-updates = 1
auto-increment-increment = 2
auto-increment-offset = 2
[root@master02 ~]# systemctl restart mysqld.service 
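The auto-increment-increment / auto-increment-offset pair above is what keeps the two write-capable masters from generating colliding AUTO_INCREMENT keys: with increment 2, master01 (offset 1) produces 1, 3, 5, … and master02 (offset 2) produces 2, 4, 6, …. A quick sketch of the arithmetic:

```shell
# Simulate the first few AUTO_INCREMENT values each master would generate
increment=2
seq_for_offset() {
  local offset=$1 vals="" i
  for i in 0 1 2; do
    vals="${vals:+$vals }$(( offset + i * increment ))"
  done
  echo "$vals"
}

seq_for_offset 1   # master01 -> 1 3 5
seq_for_offset 2   # master02 -> 2 4 6
```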

#slave01(192.168.1.22)
[root@slave01 ~]# vim /etc/my.cnf
[root@slave01 ~]# tail -4 /etc/my.cnf
server-id = 3
relay-log = relay-bin
relay-log-index = slave-relay-bin.index
read_only = 1
[root@slave01 ~]# systemctl restart mysqld.service 

#slave02(192.168.1.23)
[root@slave02 ~]# vim /etc/my.cnf
[root@slave02 ~]# tail -4 /etc/my.cnf
server-id = 4
relay-log = relay-bin
relay-log-index = slave-relay-bin.index
read_only = 1
[root@slave02 ~]# systemctl restart mysqld.service 

2.2 Configure firewall rules

# run on all four DB hosts
[root@master01 ~]# firewall-cmd --permanent --add-port=3306/tcp
success
[root@master01 ~]# firewall-cmd --reload
success

2.3 Configure master-master and slave replication

PS: master01 and master02 are masters of each other; slave01 and slave02 are slaves of master01!!!

# create the dedicated replication user
#master01(192.168.1.20)
[root@master01 ~]# mysql -uroot -p123456
MySQL [(none)]> grant replication slave on *.* to test@'192.168.1.%' identified by '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)

MySQL [(none)]> flush privileges;
Query OK, 0 rows affected (0.67 sec)

MySQL [(none)]> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      609 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)



#master02(192.168.1.21)
[root@master02 ~]# mysql -uroot -p123456

MySQL [(none)]> grant replication slave on *.* to test@'192.168.1.%' identified by '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)

MySQL [(none)]> flush privileges;
Query OK, 0 rows affected (0.00 sec)

MySQL [(none)]> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      609 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)


#master01(192.168.1.20)
MySQL [(none)]> change master to  master_host='192.168.1.21',master_user='test',master_password='123456',master_log_file='mysql-bin.000001',master_log_pos=609;
Query OK, 0 rows affected, 2 warnings (0.12 sec)

MySQL [(none)]> start slave;
Query OK, 0 rows affected (0.00 sec)


MySQL [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.21
                  Master_User: test
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 609
               Relay_Log_File: relay-bin.000002
                Relay_Log_Pos: 320
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes


#master02(192.168.1.21)
MySQL [(none)]> change master to  master_host='192.168.1.20',master_user='test',master_password='123456',master_log_file='mysql-bin.000001',master_log_pos=609;
Query OK, 0 rows affected, 2 warnings (0.00 sec)

MySQL [(none)]> start slave;
Query OK, 0 rows affected (0.01 sec)

MySQL [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.20
                  Master_User: test
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 609
               Relay_Log_File: relay-bin.000002
                Relay_Log_Pos: 320
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

#slave01(192.168.1.22)
MySQL [(none)]> change master to  master_host='192.168.1.20',master_user='test',master_password='123456',master_log_file='mysql-bin.000001',master_log_pos=609;
Query OK, 0 rows affected, 2 warnings (0.02 sec)

MySQL [(none)]> start slave;
Query OK, 0 rows affected (0.00 sec)

MySQL [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.20
                  Master_User: test
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 609
               Relay_Log_File: relay-bin.000002
                Relay_Log_Pos: 320
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
            
            
#slave02(192.168.1.23)
MySQL [(none)]> change master to  master_host='192.168.1.20',master_user='test',master_password='123456',master_log_file='mysql-bin.000001',master_log_pos=609;
Query OK, 0 rows affected, 2 warnings (0.00 sec)

MySQL [(none)]> start slave;
Query OK, 0 rows affected (0.00 sec)

MySQL [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.20
                  Master_User: test
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 609
               Relay_Log_File: relay-bin.000002
                Relay_Log_Pos: 320
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

Verify replication

#master01(192.168.1.20)
MySQL [(none)]> create database test;
Query OK, 1 row affected (0.00 sec)

MySQL [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
5 rows in set (0.10 sec)

#master02(192.168.1.21)
MySQL [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
5 rows in set (0.00 sec)

#slave01(192.168.1.22)
MySQL [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
5 rows in set (0.00 sec)

#slave02(192.168.1.23)
MySQL [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
5 rows in set (0.00 sec)

3. MySQL-MMM configuration

3.1 User configuration

PS: Master-master and slave replication is already in place, so creating the users on master01 is enough; the other three DB hosts receive them via replication!!!

  • mmm_monitor user: used by the MMM monitor for health checks against the MySQL server processes
  • mmm_agent user: used by the MMM agent to toggle read-only mode, change the replication master, and so on
# agent account
MySQL [(none)]> grant super,replication client,process on *.* to 'mmm_agent'@'192.168.1.%' identified by '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)

# monitor account
MySQL [(none)]> grant replication client on *.* to 'mmm_monitor'@'192.168.1.%' identified by '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)


# on any other DB host, verify that the accounts exist
MySQL [(none)]> select user,host from mysql.user where user in ('mmm_monitor','mmm_agent');
+-------------+-------------+
| user        | host        |
+-------------+-------------+
| mmm_agent   | 192.168.1.% |
| mmm_monitor | 192.168.1.% |
+-------------+-------------+
2 rows in set (0.00 sec)

MySQL [(none)]> show grants for 'mmm_agent'@'192.168.1.%';
+------------------------------------------------------------------------------+
| Grants for mmm_agent@192.168.1.%                                             |
+------------------------------------------------------------------------------+
| GRANT PROCESS, SUPER, REPLICATION CLIENT ON *.* TO 'mmm_agent'@'192.168.1.%' |
+------------------------------------------------------------------------------+
1 row in set (0.00 sec)

MySQL [(none)]> show grants for 'mmm_monitor'@'192.168.1.%';
+----------------------------------------------------------------+
| Grants for mmm_monitor@192.168.1.%                             |
+----------------------------------------------------------------+
| GRANT REPLICATION CLIENT ON *.* TO 'mmm_monitor'@'192.168.1.%' |
+----------------------------------------------------------------+
1 row in set (0.00 sec)

3.2 Install the monitor software

#monitor(192.168.1.24)
[root@monitor ~]# cd /tmp/
[root@monitor tmp]# wget http://pkgs.fedoraproject.org/repo/pkgs/mysql-mmm/mysql-mmm-2.2.1.tar.gz/f5f8b48bdf89251d3183328f0249461e/mysql-mmm-2.2.1.tar.gz
[root@monitor tmp]# tar -zxf mysql-mmm-2.2.1.tar.gz
[root@monitor tmp]# cd mysql-mmm-2.2.1/
[root@monitor mysql-mmm-2.2.1]# make install

The DB servers all get the agent installed too; master01 is shown as the example.

#master01(192.168.1.20)
[root@master01 ~]# cd /tmp/
[root@master01 tmp]# wget http://pkgs.fedoraproject.org/repo/pkgs/mysql-mmm/mysql-mmm-2.2.1.tar.gz/f5f8b48bdf89251d3183328f0249461e/mysql-mmm-2.2.1.tar.gz
[root@master01 tmp]# tar zxf mysql-mmm-2.2.1.tar.gz 
[root@master01 tmp]# cd mysql-mmm-2.2.1/
[root@master01 mysql-mmm-2.2.1]# make install

# master02, slave01, and slave02: same as above

3.3 Edit the configuration files

#monitor(192.168.1.24)
[root@monitor ~]# vim /etc/mysql-mmm/mmm_common.conf 
[root@monitor ~]# cat /etc/mysql-mmm/mmm_common.conf 
active_master_role	writer		# the active master role; all DB servers should enable the read_only parameter, and the monitoring agent automatically turns read_only off on the writer.


<host default>
	cluster_interface		ens33		# network interface used by the cluster

	pid_path				/var/run/mmm_agentd.pid		# pid file path
	bin_path				/usr/lib/mysql-mmm/			# path to the MMM executables

    replication_user        test		# replication user
    replication_password    123456		# replication user's password

	agent_user				mmm_agent	# agent user
	agent_password			123456		# agent user's password
</host>

<host master01>			# host name of the first master
	ip						192.168.1.20		# master01's IP
	mode					master				# role attribute; master means this host is a master
	peer					master02			# host name of the peer server, i.e. master02
</host>

<host master02>  # same idea as master01
	ip						192.168.1.21
	mode					master
	peer					master01
</host>

<host slave01>		# host name of a slave; with more slaves, repeat the same stanza
	ip						192.168.1.22		# the slave's IP
	mode					slave				# the slave role attribute means this host is a slave
</host>	

<host slave02>		# same idea as slave01
	ip						192.168.1.23
	mode					slave
</host>

<role writer>		# writer role configuration
	hosts					master01,master02		# hosts that may take the writer role. If you never want writes to switch, list only one master here; that also avoids write switchovers caused by network delay, but once that master fails the MMM cluster has no writer and only serves reads.
	ips						192.168.1.50			# the virtual IP offered externally for writes
	mode					exclusive				# exclusive means only one master at a time, i.e. only one write IP
</role>	

<role reader>		# reader role configuration
	hosts					master02,slave01,slave02		# hosts that serve reads externally; the master could be added here as well
	ips						192.168.1.51,192.168.1.52,192.168.1.53	# the virtual IPs offered externally for reads. These IPs and hosts are not mapped one-to-one, and the number of ips need not equal the number of hosts; if they differ, one of the hosts is assigned two IPs.
	mode					balanced 		# balanced means load-balanced
</role>
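As the reader-role comment notes, the reader VIPs and hosts are not mapped one-to-one. Purely as an illustration of the balanced idea (our own round-robin sketch, not MMM's actual assignment algorithm), spreading the three reader VIPs over the three reader hosts could look like:

```shell
# Round-robin three reader VIPs over three reader hosts (illustration only)
assign_readers() {
  set -- master02 slave01 slave02   # reader hosts
  local n=$# i=0 ip idx host
  for ip in 192.168.1.51 192.168.1.52 192.168.1.53; do
    idx=$(( i % n + 1 ))
    eval "host=\${$idx}"          # pick host i mod n from the positional list
    echo "$ip -> $host"
    i=$(( i + 1 ))
  done
}
assign_readers
```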


# push this file to the DB servers
# for the loop below to work, the server IPs and hostnames must be present in /etc/hosts!!!!
[root@monitor ~]# for  i in master01 master02 slave01 slave02 ; do scp /etc/mysql-mmm/mmm_common.conf  $i:/etc/mysql-mmm/mmm_common.conf ; done

3.4 Agent configuration

#master01(192.168.1.20)
[root@master01 ~]# vim /etc/mysql-mmm/mmm_agent.conf 
[root@master01 ~]# cat /etc/mysql-mmm/mmm_agent.conf 
include mmm_common.conf
this master01
[root@master01 ~]# vim /etc/init.d/mysql-mmm-agent 
source /root/.bash_profile

# register as a system service and enable it at boot
[root@master01 ~]# chkconfig --add mysql-mmm-agent
[root@master01 ~]# chkconfig mysql-mmm-agent on
[root@master01 ~]# /etc/init.d/mysql-mmm-agent start
Daemon bin: '/usr/sbin/mmm_agentd'
Daemon pid: '/var/run/mmm_agentd.pid'
Starting MMM Agent daemon... Ok
[root@master01 ~]# netstat -antp | grep mmm_agentd
tcp        0      0 192.168.1.20:9989       0.0.0.0:*               LISTEN      59346/mmm_agentd 

# configure the firewall
[root@master01 ~]# firewall-cmd --permanent --add-port=9989/tcp
success
[root@master01 ~]# firewall-cmd --reload
success


#master02(192.168.1.21)
[root@master02 mysql-mmm-2.2.1]#  vim /etc/mysql-mmm/mmm_agent.conf
[root@master02 mysql-mmm-2.2.1]# cat /etc/mysql-mmm/mmm_agent.conf
include mmm_common.conf
this master02
[root@master02 mysql-mmm-2.2.1]# vim /etc/init.d/mysql-mmm-agent
source /root/.bash_profile
[root@master02 mysql-mmm-2.2.1]# chkconfig --add mysql-mmm-agent
[root@master02 mysql-mmm-2.2.1]# chkconfig mysql-mmm-agent on
[root@master02 mysql-mmm-2.2.1]# /etc/init.d/mysql-mmm-agent start
Daemon bin: '/usr/sbin/mmm_agentd'
Daemon pid: '/var/run/mmm_agentd.pid'
Starting MMM Agent daemon... Ok
[root@master02 mysql-mmm-2.2.1]# netstat -antp | grep mmm_agentd
tcp        0      0 192.168.1.21:9989       0.0.0.0:*               LISTEN      59785/mmm_agentd   
[root@master02 mysql-mmm-2.2.1]# firewall-cmd --permanent --add-port=9989/tcp
success
[root@master02 mysql-mmm-2.2.1]# firewall-cmd --reload
success


#slave01(192.168.1.22)
[root@slave01 mysql-mmm-2.2.1]#  vim /etc/mysql-mmm/mmm_agent.conf
[root@slave01 mysql-mmm-2.2.1]# cat /etc/mysql-mmm/mmm_agent.conf
include mmm_common.conf
this slave01
[root@slave01 mysql-mmm-2.2.1]# vim /etc/init.d/mysql-mmm-agent
source /root/.bash_profile
[root@slave01 mysql-mmm-2.2.1]# chkconfig --add mysql-mmm-agent
[root@slave01 mysql-mmm-2.2.1]# chkconfig mysql-mmm-agent on
[root@slave01 mysql-mmm-2.2.1]# /etc/init.d/mysql-mmm-agent start
Daemon bin: '/usr/sbin/mmm_agentd'
Daemon pid: '/var/run/mmm_agentd.pid'
Starting MMM Agent daemon... Ok
[root@slave01 mysql-mmm-2.2.1]# netstat -antp | grep mmm_agentd
tcp        0      0 192.168.1.22:9989       0.0.0.0:*               LISTEN      59820/mmm_agentd 
[root@slave01 mysql-mmm-2.2.1]# firewall-cmd --permanent --add-port=9989/tcp
success
[root@slave01 mysql-mmm-2.2.1]# firewall-cmd --reload
success

#slave02(192.168.1.23)
[root@slave02 mysql-mmm-2.2.1]#  vim /etc/mysql-mmm/mmm_agent.conf
[root@slave02 mysql-mmm-2.2.1]# cat /etc/mysql-mmm/mmm_agent.conf
include mmm_common.conf
this slave02
[root@slave02 mysql-mmm-2.2.1]# vim /etc/init.d/mysql-mmm-agent
source /root/.bash_profile
[root@slave02 mysql-mmm-2.2.1]# chkconfig --add mysql-mmm-agent
[root@slave02 mysql-mmm-2.2.1]# chkconfig mysql-mmm-agent on
[root@slave02 mysql-mmm-2.2.1]# /etc/init.d/mysql-mmm-agent start
Daemon bin: '/usr/sbin/mmm_agentd'
Daemon pid: '/var/run/mmm_agentd.pid'
Starting MMM Agent daemon... Ok
[root@slave02 mysql-mmm-2.2.1]# netstat -antp | grep mmm_agentd
tcp        0      0 192.168.1.23:9989       0.0.0.0:*               LISTEN      59761/mmm_agentd 
[root@slave02 mysql-mmm-2.2.1]# firewall-cmd --permanent --add-port=9989/tcp
success
[root@slave02 mysql-mmm-2.2.1]# firewall-cmd --reload
success
  • source /root/.bash_profile is added so that the mysql-mmm-agent service can start at boot. The only difference between automatic and manual startup is that manual startup activates a console; when started as a service, startup can fail because of missing environment variables.

3.5 Monitor configuration

#monitor(192.168.1.24)
[root@monitor ~]# vim /etc/mysql-mmm/mmm_mon.conf 
[root@monitor ~]# cat /etc/mysql-mmm/mmm_mon.conf 
include mmm_common.conf

<monitor>
	ip						127.0.0.1		# for safety, listen only on the local host; mmm_mond listens on port 9988 by default
	pid_path				/var/run/mmm_mond.pid
	bin_path				/usr/lib/mysql-mmm/
	status_path				/var/lib/misc/mmm_mond.status
	ping_ips				192.168.1.20,192.168.1.21,192.168.1.22,192.168.1.23		# list of IPs used to test network availability; if any one of them answers ping, the network is considered up. Do not include the local address here (in practice, list the four DB servers).
	auto_set_online 		0		# seconds after which a recovered host is automatically set online (the default is 60s; with 0 here, hosts must be brought online manually)
</monitor>

<check default>
	check_period 5
	trap_period 10
	timeout 2
	restart_after 10000
	max_backlog 86400
</check>

# Notes on the block above:
# check_period: how often checks run; default 5s
# trap_period: a node must fail checks continuously for trap_period seconds before it is treated as failed; default 10s
# timeout: per-check timeout; default 2s
# restart_after: restart the checker process after this many checks; default 10000
# max_backlog: maximum value for the rep_backlog check; default 60

<host default>
	monitor_user			mmm_monitor		# user for monitoring the DB servers
	monitor_password		123456			# that user's password
</host>

debug 0				# debug: 0 is normal mode, 1 is debug mode

[root@monitor ~]# vim /etc/init.d/mysql-mmm-monitor
source /root/.bash_profile
[root@monitor ~]# chkconfig --add mysql-mmm-monitor
[root@monitor ~]# chkconfig mysql-mmm-monitor on
[root@monitor ~]# /etc/init.d/mysql-mmm-monitor start
Daemon bin: '/usr/sbin/mmm_mond'
Daemon pid: '/var/run/mmm_mond.pid'
Starting MMM Monitor daemon: Ok
[root@monitor ~]# netstat -anpt | grep 9988
tcp        0      0 127.0.0.1:9988          0.0.0.0:*               LISTEN      59826/mmm_mond 
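With the values shown in the check default block (check_period 5, trap_period 10), a rough worst-case time for a dead mysqld to reach HARD_OFFLINE can be estimated. This is our own back-of-the-envelope arithmetic, not a figure from the MMM documentation:

```shell
check_period=5    # seconds between checks
trap_period=10    # seconds of sustained failure before the node counts as failed

# Up to one check interval to first notice the failure,
# plus trap_period of confirmed failure (rough estimate only)
worst_case=$(( check_period + trap_period ))
echo "approximate worst-case detection time: ${worst_case}s"
```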

PS: Whether on the DB side or on the monitor, any change to the configuration files requires restarting the agent and monitor processes. MMM startup order: start the monitor first, then the agents.

4. Verifying the result

4.1 Check the cluster status

[root@monitor DBD-mysql-4.050]# mmm_control show
  master01(192.168.1.20) master/ONLINE. Roles: writer(192.168.1.50)
  master02(192.168.1.21) master/ONLINE. Roles: reader(192.168.1.52)
  slave01(192.168.1.22) slave/ONLINE. Roles: reader(192.168.1.51)
  slave02(192.168.1.23) slave/ONLINE. Roles: reader(192.168.1.53)

The output above shows that the write VIP is on master01 and all slave nodes treat master01 as their master.

If a server's state is not ONLINE, bring it online with the following command:

[root@monitor DBD-mysql-4.050]# mmm_control set_online master01
OK: State of 'master01' changed to ONLINE. Now you can wait some time and check its new roles!

4.2 Check that the VIP is active

#master01(192.168.1.20)
[root@master01 ~]# ip a show dev ens33
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:95:6e:64 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.20/24 brd 192.168.1.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet 192.168.1.50/32 scope global ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::dde:d77f:a5c2:1ade/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

.....

5. High-availability testing

  • MMM high-availability test: reads and writes go through the VIP addresses; when a failure occurs, the VIP floats to another node, which continues serving. First check the overall cluster status and confirm it is normal.

5.1 On a slave, check where MySQL replication points

MySQL [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.20
                  Master_User: test
                  Master_Port: 3306
                Connect_Retry: 60

5.2 Stop the mysqld service on master01

[root@master01 ~]# systemctl stop mysqld.service 

Watch the monitor log; the entries for master01 look like this:

[root@monitor ~]# tail -f /var/log/mysql-mmm/mmm_mond.log
2021/03/10 18:22:31  INFO Orphaned role 'reader(192.168.1.52)' has been assigned to 'master02'
2021/03/10 18:23:19 FATAL Admin changed state of 'slave01' from AWAITING_RECOVERY to ONLINE
2021/03/10 18:23:19  INFO Moving role 'reader(192.168.1.51)' from host 'master02' to host 'slave01'
2021/03/10 18:37:22  WARN Check 'rep_threads' on 'master01' is in unknown state! Message: UNKNOWN: Connect error (host = 192.168.1.20:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.1.20' (111)
2021/03/10 18:37:22  WARN Check 'rep_backlog' on 'master01' is in unknown state! Message: UNKNOWN: Connect error (host = 192.168.1.20:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.1.20' (111)
2021/03/10 18:37:32 ERROR Check 'mysql' on 'master01' has failed for 10 seconds! Message: ERROR: Connect error (host = 192.168.1.20:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.1.20' (111)
2021/03/10 18:37:33 FATAL State of host 'master01' changed from ONLINE to HARD_OFFLINE (ping: OK, mysql: not OK)
2021/03/10 18:37:33  INFO Removing all roles from host 'master01':
2021/03/10 18:37:33  INFO     Removed role 'writer(192.168.1.50)' from host 'master01'
2021/03/10 18:37:33  INFO Orphaned role 'writer(192.168.1.50)' has been assigned to 'master02'

5.3 Check the latest cluster status

[root@monitor ~]# mmm_control show
  master01(192.168.1.20) master/HARD_OFFLINE. Roles: 
  master02(192.168.1.21) master/ONLINE. Roles: reader(192.168.1.52), writer(192.168.1.50)
  slave01(192.168.1.22) slave/ONLINE. Roles: reader(192.168.1.51)
  slave02(192.168.1.23) slave/ONLINE. Roles: reader(192.168.1.53)

The output shows that master01's state changed from ONLINE to HARD_OFFLINE and the write VIP moved to master02.

On a slave, check where MySQL replication points now:

MySQL [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.21
                  Master_User: test
                  Master_Port: 3306
                Connect_Retry: 60

5.4 Check the cluster status of all DB servers

[root@monitor ~]# mmm_control checks all
master01  ping         [last change: 2021/03/10 18:21:56]  OK
master01  mysql        [last change: 2021/03/10 18:37:33]  ERROR: Connect error (host = 192.168.1.20:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.1.20' (111)
.......

The above shows that master01 still answers ping, so only the MySQL service died.

5.5 Start the MySQL service on master01 again

[root@master01 ~]# systemctl start  mysqld.service 

Watch the monitor log; the entries for master01 look like this:

[root@monitor ~]# tail -f /var/log/mysql-mmm/mmm_mond.log
2021/03/10 18:37:22  WARN Check 'rep_backlog' on 'master01' is in unknown state! Message: UNKNOWN: Connect error (host = 192.168.1.20:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.1.20' (111)
2021/03/10 18:37:32 ERROR Check 'mysql' on 'master01' has failed for 10 seconds! Message: ERROR: Connect error (host = 192.168.1.20:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.1.20' (111)
2021/03/10 18:37:33 FATAL State of host 'master01' changed from ONLINE to HARD_OFFLINE (ping: OK, mysql: not OK)
2021/03/10 18:37:33  INFO Removing all roles from host 'master01':
2021/03/10 18:37:33  INFO     Removed role 'writer(192.168.1.50)' from host 'master01'
2021/03/10 18:37:33  INFO Orphaned role 'writer(192.168.1.50)' has been assigned to 'master02'
2021/03/10 18:40:42  INFO Check 'rep_threads' on 'master01' is ok!
2021/03/10 18:40:42  INFO Check 'mysql' on 'master01' is ok!
2021/03/10 18:40:42  INFO Check 'rep_backlog' on 'master01' is ok!
2021/03/10 18:40:45 FATAL State of host 'master01' changed from HARD_OFFLINE to AWAITING_RECOVERY

The log shows that master01's state changed from HARD_OFFLINE to AWAITING_RECOVERY. Bring the server online with the following command.

5.6 Manually bring the server online

[root@monitor ~]# mmm_control  set_online master01
OK: State of 'master01' changed to ONLINE. Now you can wait some time and check its new roles!

Check the latest cluster status:

[root@monitor ~]# mmm_control  show
  master01(192.168.1.20) master/ONLINE. Roles: 
  master02(192.168.1.21) master/ONLINE. Roles: reader(192.168.1.52), writer(192.168.1.50)
  slave01(192.168.1.22) slave/ONLINE. Roles: reader(192.168.1.51)
  slave02(192.168.1.23) slave/ONLINE. Roles: reader(192.168.1.53)

PS: Notice that the recovered master does not take the writer role back; it only takes over once the current master fails again.

Summary

  • If the standby master master02 goes down, the cluster state is unaffected; master02 is simply removed from the reader pool.

  • If the primary master01 goes down, the standby master02 takes over the writer role, and slave01 and slave02 repoint to the new master master02 for replication; they automatically CHANGE MASTER to master02.

  • If master01 fails while master02's replication has fallen behind it, master02 becomes writable anyway, and data consistency cannot be guaranteed. If master02, slave01, and slave02 are all behind master01 when it fails, slave01 and slave02 wait until they catch up before repointing to the new master, and even then consistency cannot be guaranteed.

  • With an MMM architecture, the two master nodes should have identical hardware; additionally enable semi-synchronous replication for better safety, or use MariaDB/MySQL 5.7 multi-threaded slave replication to improve replication performance.

  • Log files: logs are usually the key to diagnosing problems, so make good use of them. DB side: /var/log/mysql-mmm/mmm_agentd.log; monitor side: /var/log/mysql-mmm/mmm_mond.log

Command files:

  • mmm_agentd: startup program for the DB agent daemon
  • mmm_mond: startup program for the monitor daemon
  • mmm_backup: backup tool
  • mmm_restore: restore tool
  • mmm_control: the monitoring/control command. The DB servers only have the mmm_agentd program; everything else lives on the monitor server.

mmm_control usage

  • mmm_control is used to check cluster status, switch the writer, set hosts online/offline, and so on.
  • Valid commands are:
  • help: show the help message
  • ping: ping the monitor to check whether the cluster is reachable
  • show: show the cluster's online status
  • checks [<host>|all [<check>|all]]: show the status of the monitoring checks
  • set_online <host>: set a host online
  • set_offline <host>: set a host offline
  • mode: print the current mode; set_active: switch into active mode
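In scripts it can be handy to pull the current writer host out of mmm_control show output. A small parsing sketch (the sample text mirrors the output in section 4.1; the writer_host helper is our own, not an MMM tool):

```shell
# Sample `mmm_control show` output (as in section 4.1)
sample_show='  master01(192.168.1.20) master/ONLINE. Roles: writer(192.168.1.50)
  master02(192.168.1.21) master/ONLINE. Roles: reader(192.168.1.52)
  slave01(192.168.1.22) slave/ONLINE. Roles: reader(192.168.1.51)'

# print the hostname of the line that carries the writer role
writer_host() { echo "$1" | awk '/writer\(/ { sub(/\(.*/, "", $1); print $1 }'; }

writer_host "$sample_show"   # -> master01
```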

Other notes: if you do not want the writer to switch from the master to the backup (replication lag can also trigger a write-VIP switch), then when configuring /etc/mysql-mmm/mmm_common.conf, drop the backup host from the writer role:

<role writer>		# writer role configuration
	hosts					master01		# list only one host here
	ips						192.168.1.50	# the virtual IP offered externally for writes
	mode					exclusive		# exclusive: only one master, i.e. a single write IP
</role>

With this, if master01 fails the writer role does not switch to master02 and the slaves do not repoint to a new master; the MMM cluster then no longer provides write service externally.

Summary:

  • The read/write VIPs are controlled by the monitor program. If the monitor is not running, the DB servers are not assigned VIPs; but if VIPs have already been assigned, stopping the monitor does not immediately remove them, and external programs can keep connecting (as long as the network is not restarted). The upside is that this lowers the reliability requirements on the monitor; the downside is that if a DB server then fails, no switchover happens: the original VIP assignment stays as it was, and the failed host's VIP becomes unreachable.
  • The agent program is controlled by the monitor and performs writer switches, slave repointing, and so on. If the monitor process is down, the agent is of little use; it cannot handle failures on its own.
  • The monitor program watches the state of the DB servers, including whether MySQL and the server are running, whether the replication threads are healthy, and master-slave lag; it also drives the agents during failure handling.
  • The monitor checks the DB servers every few seconds. If a server has recovered from a failure, the monitor automatically sets it online after 60s (the default; other values can be configured), governed by the auto_set_online parameter in the monitor config. A cluster server passes through three states: HARD_OFFLINE → AWAITING_RECOVERY → ONLINE.
  • By default the monitor has mmm_agent set read_only to OFF on the writer DB server and to ON on the other DB servers; to be strict, you can add read_only=1 to every server's my.cnf and let the monitor control writer versus reader. The root user and the replication user are not affected by the read_only parameter.
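Following the last point, a minimal my.cnf fragment for enforcing the read-only default on every DB node might look like this (our own illustration; read_only is a standard MySQL option, and the MMM agent is what flips it off on the current writer):

```ini
[mysqld]
# start read-only by default; the MMM agent turns read_only OFF only on the
# host currently holding the writer role (root and the replication user are
# not affected by read_only)
read_only = 1
```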

Take a given host offline:

[root@monitor ~]# mmm_control set_offline slave02

Bring a given host online:

[root@monitor ~]# mmm_control set_online slave02