027:高级复制

MySQL复制技术详解

一.半同步和无损复制

主从复制基本上都是 异步复制, Master并不关Slave节点有没有获取到数据 ,所以复制效率很高,但是数据有可能会丢失。

  • 从 MySQL5.5 开始,MySQL推出了semi-sync replication (半同步复制)

    • 至少有一个Slave节点收到binlog后再返回( IO线程接收到即可 )
    • 减少数据丢失风险
    • 不能完全避免数据丢失
    • 超时后,切换回异步复制
  • 从 MySQL5.7.2 开始,MySQL推出了lossless semi-sync replication (无损复制)

    • 二进制日志(binlog)先写远程( IO线程接收到即可 )
    • 可保证数据完全不丢失

1166598-20180226143638027-20234604.png

1.1. loss less / semi-sync replication插件安装

  • 1、手工安装
mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
  • 2、写入配置文件
[mysqld]
plugin_dir=/usr/local/mysql/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"

上述操作 仅仅是加载了插件 ,还 未启动 对应的功能,需要配置额外的参数:

[mysqld]
# 等同于 rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_master_enabled = 1
# 等同于 rpl_semi_sync_slave_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1
# 超时5秒后,则切换回异步方式
loose_rpl_semi_sync_master_timeout = 5000

使用 loose_ 前缀表示如果没有加载 semi_sync 的插件,则 忽略该参数
当Slave在Timeout后,又追上了Master了( IO线程 ),则会 自动切换回半同步复制

注意:半同步复制 / 无损复制主从 上都要 安装插件和开启功能

注意:要保证任意时刻发生一台机器宕机都不丢失数据的前提是 master sync_binlog设置为1,slave sync_relay_log设置为1。

1.2. semi-sync replication

semi-sync replication 称为 半同步复制 ,在一个事务 提交(commit) 的过程时,在 InnoDB 层的 commit log 步骤后,Master节点需要收到 至少一个 Slave节点回复的 ACK (表示 收到了binlog )后,才能继续下一个事务;
如果在一定时间内(Timeout)内 没有收到ACK ,则 切换为异步模式 ,具体流程如下:

1166598-20180226171507946-512408627.png

对应的配置参数如下:

[mysqld]
# 开启主的半同步复制
rpl_semi_sync_master_enabled=1
# 开启从的半同步复制
rpl_semi_sync_slave_enabled=1
# 超时1秒,切回异步
rpl_semi_sync_master_timeout=1000
# 至少收到 1 个 slave发回的ack
rpl_semi_sync_master_wait_for_slave_count=1

1.3. loss less semi-sync replication

loss less semi-sync replication 称为 无损复制 ,在一个事务提交(commit) 的过程时,在 MySQL 层的write binlog 步骤后,Master节点需要收到 至少一个 Slave节点回复的 ACK (表示 收到了binlog )后,才能继续下一个事务;
如果在一定时间内(Timeout)内 没有收到ACK ,则 切换为异步模式 ,具体流程如下:

1166598-20180226171630358-2129298265.png

对应的配置参数如下:

[mysqld]
# 开启主的半同步复制
rpl_semi_sync_master_enabled=1
# 开启从的半同步复制
rpl_semi_sync_slave_enabled=1
# 超时1秒,切回异步
rpl_semi_sync_master_timeout=1000
[mysqld57]
# 控制 半同步复制 还是 无损复制 的参数
# - AFTER_SYNC 表示的是无损复制;(5.7 默认)
# - AFTER_COMMIT 表示的是半同步复制;
rpl_semi_sync_master_wait_point=AFTER_SYNC
# 至少收到 1 个 slave发回的ack
rpl_semi_sync_master_wait_for_slave_count=1

1.4. 半同步复制与无损复制的对比

  • ACK的时间点不同

    • 半同步复制InnoDB层Commit Log后等待ACK,主从切换会有数据丢失风险。
    • 无损复制MySQL Server层Write binlog后等待ACK,主从切换会有数据变多风险。
  • 主从数据一致性

    • 半同步复制意味着在Master节点上,这个刚刚提交的事务对数据库的修改,对其他事务是可见的。因此,如果在等待Slave ACK的时候crash了,那么会对其他事务出现幻读数据丢失
    • 无损复制在write binlog完成后,就传输binlog,但还没有去写commit log,意味着当前这个事务对数据库的修改,其他事务也是不可见的。因此,不会出现幻读,数据丢失风险。
    • 因此5.7.2引入了无损复制(after_sync)模式,带来的主要收益是解决 after_commit导致的master crash数据丢失问题,因此在引入after_sync模式后,所有提交的数据已被复制,故障切换时数据一致性将得到提升。

参考- MySQL半同步复制数据一致性分析

1.5. 演示无损/半同步复制

  • master server
mysql root@localhost:employees> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
Query OK, 0 rows affected
Time: 0.063s

mysql root@localhost:employees>  INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
Query OK, 0 rows affected
Time: 0.005s

mysql root@localhost:employees> set global rpl_semi_sync_master_enabled = 1;
Query OK, 0 rows affected
Time: 0.020s

mysql root@localhost:employees> set global rpl_semi_sync_master_timeout = 5000;
Query OK, 0 rows affected
Time: 0.001s

mysql root@localhost:employees> show global status like "%rpl%";
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 0     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 0     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |   -- status ok
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 0     |
| Rpl_semi_sync_master_tx_wait_time          | 0     |
| Rpl_semi_sync_master_tx_waits              | 0     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 0     |
| Rpl_semi_sync_slave_status                 | OFF   |
+--------------------------------------------+-------+
15 rows in set
Time: 0.013s
mysql root@localhost:employees>
  • slave server
(root@localhost) 09:59:14 [employees]> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
Query OK, 0 rows affected (0.05 sec)

(root@localhost) 18:07:15 [employees]> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
Query OK, 0 rows affected (0.00 sec)

(root@localhost) 18:07:29 [employees]> set global rpl_semi_sync_slave_enabled = 1;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) 18:14:59 [employees]> show global variables like '%semi%';
+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | OFF        |
| rpl_semi_sync_master_timeout              | 10000      |
| rpl_semi_sync_master_trace_level          | 32         |
| rpl_semi_sync_master_wait_for_slave_count | 1          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
| rpl_semi_sync_slave_enabled               | ON         |
| rpl_semi_sync_slave_trace_level           | 32         |
+-------------------------------------------+------------+
8 rows in set (0.02 sec)

半复制切换异步同步过程的状态

  • master server
(root@localhost) 10:11:10 [tablespace]>  show global status like "%rpl%";
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |  -- 半同步复制的client数量
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |  
| Rpl_semi_sync_master_net_waits             | 1     |  -- master总的等待slave的次数
| Rpl_semi_sync_master_no_times              | 0     |  -- 切成异步的次数(no = number of)
| Rpl_semi_sync_master_no_tx                 | 0     |  -- 切成异步后提交的事物数
| Rpl_semi_sync_master_status                | ON    |  -- 半同步复制的状态
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 2209  |  -- master等待事物的平均时间
| Rpl_semi_sync_master_tx_wait_time          | 2209  |  -- master等待事物的总的时间
| Rpl_semi_sync_master_tx_waits              | 1     |  -- master等待事物的次数
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 1     |
| Rpl_semi_sync_slave_status                 | OFF   |
+--------------------------------------------+-------+
15 rows in set (0.03 sec)
  • slave server
(root@localhost) 13:47:48 [tablespace]> stop slave io_thread;  --停掉IO线程
Query OK, 0 rows affected (0.05 sec)
  • master server
(root@localhost) 14:48:11 [tablespace]>  show global status like "%rpl%";
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 1     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 2209  |
| Rpl_semi_sync_master_tx_wait_time          | 2209  |
| Rpl_semi_sync_master_tx_waits              | 1     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 1     |
| Rpl_semi_sync_slave_status                 | OFF   |
+--------------------------------------------+-------+
15 rows in set (0.00 sec)

(root@localhost) 14:48:14 [tablespace]> insert into qqq values(99);  --插入sql
Query OK, 1 row affected (5.04 sec)    -- 等待5秒后,切成异步

(root@localhost) 14:49:28 [tablespace]>  show global status like "%rpl%";
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 0     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 1     | 
| Rpl_semi_sync_master_no_times              | 1     |  --切成异步的次数
| Rpl_semi_sync_master_no_tx                 | 1     |  --切成异步后的事物数
| Rpl_semi_sync_master_status                | OFF   |  --status 为off
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 2209  |
| Rpl_semi_sync_master_tx_wait_time          | 2209  |
| Rpl_semi_sync_master_tx_waits              | 1     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 1     |
| Rpl_semi_sync_slave_status                 | OFF   |
+--------------------------------------------+-------+
15 rows in set (0.00 sec)

(root@localhost) 14:49:57 [tablespace]>

1.6. 两种复制方式的性能

1166598-20180227090833331-32637415.png

备注:上图是Facebook的测试性能图;其中Y轴是QPS,X轴是并发数

  • 蓝色的 Normal Slave异步复制
    • 性能很好,但是随着并发数的增长,性能有所下降
  • 绿色的 Enhanced mysqlbinlog无损复制
    • 随着并发数的增长,性能几乎是线性增长的,在高并发下,性能会优于异步复制
  • 紫色的 Normal Semi Slave 是 半同步复制
    • 性能较低

无损复制性能优于半同步复制的原因

  1. 等待ACK回包问题上,其实两种复制的开销是一样的,没有区别,都是网络的等待开销。
  2. 无损复制由于在 write binlog (commit 的第二步)后,需要等待ACK,后续的事务无法提交,这样就 堆积N多个需要落盘的事务(半同步复制由于已经提交了事务,没有堆积事务的效果),通过 组提交 机制一次 fsync的多个事务(半同步复制也有组提交,只是一次 fsync 的事务数没那么多), 相当于提高了I/O性能 ;所以线程(事务)越多,效果越明显,以至于有上图中超过异步复制的效果。(无损复制的组提交比例比原版的高3~4倍)

产生上述测试效果的前提:测试用例是 IO Bound 的(比如数据量有 100G,而 buffer pool 只有 10G),且并发数足够多。

下面这两个参数不要去设置,设置了反而性能差

(root@localhost) 10:10:47 [tablespace]> show variables like "%binlog_group%";
+-----------------------------------------+-------+
| Variable_name                           | Value |
+-----------------------------------------+-------+
| binlog_group_commit_sync_delay          | 0     |
| binlog_group_commit_sync_no_delay_count | 0     |   -- 等待一组里面有多少事务我才提交
+-----------------------------------------+-------+
2 rows in set (0.01 sec)

(root@localhost) 10:10:50 [tablespace]> show variables like "%binlog_max%";
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| binlog_max_flush_queue_time | 0     |               -- 等待多少时间后才进行组提交
+-----------------------------+-------+
1 row in set (0.00 sec)

(root@localhost) 10:11:10 [tablespace]>

1.7. rpl_semi_sync_master_wait_for_slave_count

该参数控制Master在收到 多少个 Slave的ACK 后,才可以继续commit。配置多个ACK和配置一个ACK的效果是类似的,因为他们是 并行执行 的(理论上来说不会有两倍的等待时间), 取决于最慢的那个 。


二. 并行复制(Multi-Threaded Slave)

2.1. MTS介绍

在官方文档中,并行复制的叫法为 Multi-Threaded Slave (MTS)

  • MySQL的并行复制基于组提交:

一个组提交中的事务都是可以并行执行的 ,因为既然处于组提交中,这意味着事务之间没有冲突(不会去更新同一行数据),否则不可能在同一个组里面。
Slave上开启并行复制,需要在配置文件中增加以下参数:

[mysqld]
slave-parallel-type=LOGICAL_CLOCK   
slave_preserve_commit_order=1
slave-parallel-workers=4
  • slave-parallel-type 参数

    • DATABASE 基于库级别的并行复制,如果只有一个库,就还是串行(为了兼容5.6)。
    • LOGICAL_CLOCK 基于逻辑时钟,主上怎么并行执行,从上也是怎么并行回放的。
  • slave-parallel-workers 并行复制的线程数,一般设置为一个组内提交的事务数,线上设置为32足够了

  • slave_preserve_commit_order Slave上commit的顺序保持一致,必须为1,否则可能会有GAP锁产生

2.2. 动态调整复制线程数

配置并行复制后,Slave节点可以看到4个 Coordinator 线程

mysql root@localhost:(none)> show processlist;
+----+-------------+-----------+--------+---------+--------+--------------------------------------------------------+------------------+
| Id | User        | Host      | db     | Command | Time   | State                                                  | Info             |
+----+-------------+-----------+--------+---------+--------+--------------------------------------------------------+------------------+
| 1  | system user |           | <null> | Connect | 148155 | Waiting for master to send event                       | <null>           |
| 2  | system user |           | <null> | Connect | 57792  | Slave has read all relay log; waiting for more updates | <null>           |
| 3  | system user |           | <null> | Connect | 57791  | Waiting for an event from Coordinator                  | <null>           |
| 4  | system user |           | <null> | Connect | 148155 | Waiting for an event from Coordinator                  | <null>           |
| 7  | system user |           | <null> | Connect | 148155 | Waiting for an event from Coordinator                  | <null>           |
| 8  | system user |           | <null> | Connect | 148155 | Waiting for an event from Coordinator                  | <null>           |
| 12 | root        | localhost | <null> | Query   | 0      | starting                                               | show processlist |
+----+-------------+-----------+--------+---------+--------+--------------------------------------------------------+------------------+
7 rows in set
Time: 0.019s
mysql root@localhost:(none)>

-- 动态调整方式如下:

mysql root@localhost:(none)> set global slave_parallel_workers=8;
Query OK, 0 rows affected
Time: 0.003s
mysql root@localhost:(none)> stop slave;  一定要重启一下slave才能有效
Query OK, 0 rows affected
Time: 0.038s
mysql root@localhost:(none)> start slave;
Query OK, 0 rows affected
Time: 0.080s
mysql root@localhost:(none)> show processlist;
+----+-------------+-----------+--------+---------+------+--------------------------------------------------------+------------------+
| Id | User        | Host      | db     | Command | Time | State                                                  | Info             |
+----+-------------+-----------+--------+---------+------+--------------------------------------------------------+------------------+
| 12 | root        | localhost | <null> | Query   | 0    | starting                                               | show processlist |
| 13 | system user |           | <null> | Connect | 3    | Waiting for master to send event                       | <null>           |
| 14 | system user |           | <null> | Connect | 3    | Slave has read all relay log; waiting for more updates | <null>           |
| 15 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
| 16 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
| 17 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
| 18 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
| 19 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
| 20 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
| 21 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
| 22 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
+----+-------------+-----------+--------+---------+------+--------------------------------------------------------+------------------+
11 rows in set
Time: 0.019s
mysql root@localhost:(none)>

特别注意:

这里的 并行复制 指的是 SQL Thread (回放线程),而非IO Thread (IO线程)
Waiting for master to send event 这个 Stateshow processlist 中只有一个,即只有一个 IO Thread

线上环境可以配置成两台Slave做无损复制(保证数据不丢),其他的Slave做异步复制(配置为只读,用于负载均衡),都指向同一台Master。


三. GTID

3.1. GTID的介绍

  • 1.Global Transaction Id entifier -- 全局事物ID

  • 2.GTID = Server_UUID + Transaction_ID

    • Server_UUID 是全局唯一的
    • Transaction_ID 是自增的
  • 3.GTID 的作用是替代 Filename + Position

(root@localhost) 14:54:25 [tablespace]> show variables like "server_uuid";
+---------------+--------------------------------------+
| Variable_name | Value                                |
+---------------+--------------------------------------+
| server_uuid   | 9dc847d8-bf72-11e7-9ec4-000c2998e4f1 |
+---------------+--------------------------------------+
1 row in set (0.01 sec)

在MySQL中看到的 UUID ,实际是保存在 $DATADIR/auto.cnf 中的,且该文件是服务器初始化的时候自动生成的。

[root@node1 mysqldata]# cat /r2/mysqldata/auto.cnf
[auto]
server-uuid=9dc847d8-bf72-11e7-9ec4-000c2998e4f1
[root@node1 mysqldata]#

通过 冷备 做备份,拷贝 $DATADIR 时,记得要把备份中的 auto.cnf 给删除、

3.2. GTID的意义

1166598-20180227173818190-1127938351.png

  • 未使用GTID

    • Master宕机 后,一个 Slave选举提升New Master,如果需要重建复制关系,就需要把另外两个SlaveCHANGE MASTER 指向 New Master
    • 那问题来了,原来Slave是指向 MasterFilename_M + Position_M 的位置,现在要指向 New Master 上新的 Filename_N + Position_N 的位置,由于基于二进制日志复制 ,定位两个位置步骤繁琐; 此时两个Slave要继续重建复制关系(CHANGE MASTER)会比较麻烦。
  • 使用GTID

    • 和上面一样的场景,选举机制提升为New Master的GTID 执行到最新事务, 两个Slave需要重新指向 New Master ,由于 使用了GTID ,目前 Slave-A 获取到的事务对应GTID为GTID_ASlave-B 获取到的事务对应GTID为GTID_B;
    • 此时 New Master 上GTID是存在 GTID_A 和 GTID_B,那两个Slave就可以直接使用 GTID_A 和 GTID_B 这两个GTID,通过指向 New Master接着重建复制;

3.3. GTID的配置

[mysqld]
log_bin = binlog
log_slave_updates = 1
gtid_mode = ON 
enforce-gtid-consistency = true
  • 注意:

    • 1.MySQL5.6 必须开启参数 log_slave_updates (5.6版本的限制)

    • 2.MySQL5.6 升级到gtid模式需要停机重启

    • 3.MySQL5.7 版本开始可以不开启 log_slave_updates

    • 4.MySQL5.7.6 版本开始可以在线升级gtid模式

3.4. 基于GTID的复制

1166598-20180228092515483-1745824375.png

3.4.1. 配置基于GTID的复制

配置信息主库(master)从库(slave)
主机node1.gczheng.comnode3.gczheng.com
IP192.168.88.88192.168.88.100
Port33063306
MySQL版本MySQL5.7.18MySQL5.7.18
Server_ID888888100

数据库的备份和还原步骤略过,这里使用 mydumper备份,master上创建账号repl@'192.168.88.100',密码和权限同repl@'192.168.88.99'

  • 1.master节点的配置文件加入
[mysqld]
gtid_mode = ON 
enforce-gtid-consistency = true
  • 2.slave 节点加入
[mysqld]
gtid_mode = ON 
enforce-gtid-consistency = true
  • 3.查看metadata信息
[root@node3 alldb]# cat metadata
Started dump at: 2018-02-15 18:53:58                    --dump时间
SHOW MASTER STATUS:
    Log: binlog.000014                                  --Filename
    Pos: 1361                                           --Pos
    GTID:9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-52      --GTID时间

注意:metadata中的 GTID:9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-52 表示为:这个GTID对应的 事物 已在备份中了,Slave还原备份后,进行复制时要跳过这个GTID即对应的事物。

3.4.2. 跳过GTIDs

  • Slave节点

如果备份使用 mysqldump 进行备份, 请跳过该步骤 ,因为在备份中已经存在下面这个语句:

--
-- GTID state at the beginning of the backup 
--
SET @@GLOBAL.GTID_PURGED='9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-52';

因为我们使用 mydumper 进行备份, myloader还原并不会帮我们执行上述语句,需要手工执行该语句,让 Slave 知道这个GTID需要跳过:

mysql> SET @@GLOBAL.GTID_PURGED='9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-52';
ERROR 1840 (HY000): @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.
-- 这里报出一个错误,意思为如果要设置 @@GLOBAL.GTID_PURGED ,则 @@GLOBAL.GTID_EXECUTED 必须为空。

需要使用 reset master 命令,将 @@GLOBAL.GTID_EXECUTED清空。

(root@localhost) 09:22:38 [tablespace]> reset master;
Query OK, 0 rows affected (0.07 sec)

(root@localhost) 09:23:18 [tablespace]> SET @@GLOBAL.GTID_PURGED='9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-52';
Query OK, 0 rows affected (0.02 sec)

至此,我们已经跳过了备份中存在的GTIDs所对应的事物

3.4.3. CHANGE MASTER

  • Slave节点
(root@localhost) 10:33:35 [tablespace]> change master to master_host="192.168.88.88", master_port=3306, master_user='repl',master_password='123456', master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.04 sec)

(root@localhost) 10:33:46 [tablespace]> start slave;
Query OK, 0 rows affected (0.03 sec)

(root@localhost) 10:58:50 [tablespace]> show slave status  \G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.88.88
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000014
          Read_Master_Log_Pos: 1893
               Relay_Log_File: node3-relay-bin.000007
                Relay_Log_Pos: 445
        Relay_Master_Log_File: binlog.000014
             Slave_IO_Running: Yes                          --IO线程ok
            Slave_SQL_Running: Yes                          --SQL线程ok
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1893
              Relay_Log_Space: 2598
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 8888
                  Master_UUID: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1
             Master_Info_File: /r2/mysqldata/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:29-54   --获取到GTID
            Executed_Gtid_Set: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:29-54   --执行到GTID
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)

ERROR:
No query specified

3.4.4. 复制完成

  • master节点
(root@localhost) 10:57:09 [tablespace]> show slave hosts;
+-----------+------+------+-----------+--------------------------------------+
| Server_id | Host | Port | Master_id | Slave_UUID                           |
+-----------+------+------+-----------+--------------------------------------+
|     88100 |      | 3306 |      8888 | 5d4e3453-0fc5-11e8-b387-000c29c9f498 |  --新建slave
|      8899 |      | 3306 |      8888 | 1951c7ee-c1fb-11e7-ac44-005056305232 |
+-----------+------+------+-----------+--------------------------------------+
2 rows in set (0.00 sec)

(root@localhost) 10:58:34 [tablespace]> insert into qqq values(101);  --插入a的值101
Query OK, 1 row affected (0.02 sec)

(root@localhost) 10:59:07 [tablespace]> select * from  qqq;
+------+
| a    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|   10 |
|   99 |
|  101 |
+------+
7 rows in set (0.00 sec)
  • slave节点
(root@localhost) 11:02:32 [tablespace]> select * from  qqq;
+------+
| a    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|   10 |
|   99 |
|  101 |  --slave同步到值101
+------+
7 rows in set (0.00 sec)

(root@localhost) 11:03:08 [tablespace]>

3.5. GTID 与 Filename-Pos的对应

在 binlog 中,多了一个 GTID 的 event ,如下所示

(root@localhost) 11:43:24 [tablespace]> show binlog events;
+---------------+-----+----------------+-----------+-------------+--------------------------------------------------------------------+
| Log_name      | Pos | Event_type     | Server_id | End_log_pos | Info                                                               |
+---------------+-----+----------------+-----------+-------------+--------------------------------------------------------------------+
| binlog.000001 |   4 | Format_desc    |     88100 |         123 | Server ver: 5.7.18-log, Binlog ver: 4                              |
| binlog.000001 | 123 | Previous_gtids |     88100 |         154 |                                                                    |
| binlog.000001 | 154 | Gtid           |      8888 |         219 | SET @@SESSION.GTID_NEXT= '9dc847d8-bf72-11e7-9ec4-000c2998e4f1:55' | -- 一个GTID 的 event
| binlog.000001 | 219 | Query          |      8888 |         282 | BEGIN                                                              |
| binlog.000001 | 282 | Table_map      |      8888 |         334 | table_id: 227 (tablespace.qqq)                                     |
| binlog.000001 | 334 | Write_rows     |      8888 |         374 | table_id: 227 flags: STMT_END_F                                    |
| binlog.000001 | 374 | Xid            |      8888 |         405 | COMMIT /* xid=1407 */                                              |
+---------------+-----+----------------+-----------+-------------+--------------------------------------------------------------------+
7 rows in set (0.00 sec)

(root@localhost) 11:43:39 [tablespace]>

通过 扫描 binlog中的GTID值,就可以知道 GTIDFilename-Pos对应的关系 ,但是如果binlog非常大,扫描的量也是会很大的,所以在binlog开头部分有一个 Previous_gtids 的event,如下所示:

(root@localhost) 11:44:57 [tablespace]> show binlog events in 'binlog.000001' \G;
*************************** 1. row ***************************
   Log_name: binlog.000001
        Pos: 4
 Event_type: Format_desc
  Server_id: 88100
End_log_pos: 123
       Info: Server ver: 5.7.18-log, Binlog ver: 4
*************************** 2. row ***************************
   Log_name: binlog.000001
        Pos: 123
 Event_type: Previous_gtids      -- 表示在次之前,GTID运行到的范围是哪里
  Server_id: 88100
End_log_pos: 154
       Info:
*************************** 3. row ***************************
   Log_name: binlog.000001
        Pos: 154
 Event_type: Gtid
  Server_id: 8888
End_log_pos: 219
       Info: SET @@SESSION.GTID_NEXT= '9dc847d8-bf72-11e7-9ec4-000c2998e4f1:55'
*************************** 4. row ***************************
   Log_name: binlog.000001
        Pos: 219
 Event_type: Query
  Server_id: 8888
End_log_pos: 282
       Info: BEGIN
*************************** 5. row ***************************
   Log_name: binlog.000001
        Pos: 282
 Event_type: Table_map
  Server_id: 8888
End_log_pos: 334
       Info: table_id: 227 (tablespace.qqq)
*************************** 6. row ***************************
   Log_name: binlog.000001
        Pos: 334
 Event_type: Write_rows
  Server_id: 8888
End_log_pos: 374
       Info: table_id: 227 flags: STMT_END_F
*************************** 7. row ***************************
   Log_name: binlog.000001
        Pos: 374
 Event_type: Xid
  Server_id: 8888
End_log_pos: 405
       Info: COMMIT /* xid=1407 */
7 rows in set (0.01 sec)

ERROR:
No query specified

(root@localhost) 11:45:19 [tablespace]>

如果我要的GTIDPrevious_gtids 的大,就扫描当前文件,反之则扫描之前的文件,依次类推。因为binlog在rotate(rotate events)的时候,是知道当前最大的GTID的,可以将该值写入到下一个新的binlog的开头,即 Previous_gtid

3.6. GTID复制出错的处理

演示人为的在从机上误操作导致的复制失败,如何恢复?

3.6.1. 演示

  • 1.在从机上插入一条记录(模拟误操作)

    • slave节点

现在 Slave 上插入一条记录(现实中如果配置了readonly,在app中是无法插入的,app不会给root权限)

mysql root@localhost:(none)> select * from ttt.t1;
+---+
| a |  --a是主键
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set
Time: 0.012s
mysql root@localhost:(none)> insert into ttt.t1 values(null);
Query OK, 1 row affected
Time: 0.003s
mysql root@localhost:(none)> select * from ttt.t1;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
| 4 |
+---+
4 rows in set
Time: 0.014s
mysql root@localhost:(none)>
  • 2.在主机上插入同样的记录

    • master节点
mysql root@localhost:ttt> select * from t1;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set
Time: 0.009s
mysql root@localhost:ttt> insert into t1 values(null);
Query OK, 1 row affected
Time: 0.007s
mysql root@localhost:ttt>
  • 3.查看 Slave上的状态

    • slave节点
mysql root@localhost:(none)> show slave status \G;
***************************[ 1. row ]***************************
Slave_IO_State                | Waiting for master to send event
Master_Host                   | 192.168.88.88
Master_User                   | repl
Master_Port                   | 3306
Connect_Retry                 | 60
Master_Log_File               | binlog.000014
Read_Master_Log_Pos           | 6195
Relay_Log_File                | node2-relay-bin.000002
Relay_Log_Pos                 | 568
Relay_Master_Log_File         | binlog.000014
Slave_IO_Running              | Yes
Slave_SQL_Running             | No
Replicate_Do_DB               |
Replicate_Ignore_DB           |
Replicate_Do_Table            |
Replicate_Ignore_Table        |
Replicate_Wild_Do_Table       |
Replicate_Wild_Ignore_Table   |
Last_Errno                    | 1062
Last_Error                    | Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '9dc847d8-bf72-11e7-9ec4-000c2998e4f1:74' at master log binlog.000014, end_log_pos 6164. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Skip_Counter                  | 0
Exec_Master_Log_Pos           | 5944
Relay_Log_Space               | 1026
Until_Condition               | None
Until_Log_File                |
Until_Log_Pos                 | 0
Master_SSL_Allowed            | No
Master_SSL_CA_File            |
Master_SSL_CA_Path            |
Master_SSL_Cert               |
Master_SSL_Cipher             |
Master_SSL_Key                |
Seconds_Behind_Master         | <null>
Master_SSL_Verify_Server_Cert | No
Last_IO_Errno                 | 0
Last_IO_Error                 |
Last_SQL_Errno                | 1062 --错误代码1062,主键重复
Last_SQL_Error                | Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '9dc8
47d8-bf72-11e7-9ec4-000c2998e4f1:74' at master log binlog.000014, end_log_pos 6164. See error log and/or performance_schema.replication_applier_status_by_worker table for m
ore details about this failure or others, if any.
Replicate_Ignore_Server_Ids   |
Master_Server_Id              | 8888
Master_UUID                   | 9dc847d8-bf72-11e7-9ec4-000c2998e4f1
Master_Info_File              | mysql.slave_master_info
SQL_Delay                     | 0
SQL_Remaining_Delay           | <null>
Slave_SQL_Running_State       |
Master_Retry_Count            | 86400
Master_Bind                   |
Last_IO_Error_Timestamp       |
Last_SQL_Error_Timestamp      | 180219 16:07:45
Master_SSL_Crl                |
Master_SSL_Crlpath            |
Retrieved_Gtid_Set            | 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:73-74
Executed_Gtid_Set             | 1951c7ee-c1fb-11e7-ac44-005056305232:1,
9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-73  --回放到这个事务时报错
Auto_Position                 | 0
Replicate_Rewrite_DB          |
Channel_Name                  |
Master_TLS_Version            |

其实从数据一致性看,目前主从数据是一致的,只是复制过来的日志在回放时,发现已有了该部分数据(Error:1062),从而引发了复制异常(SQL回放线程停止)。 我们只需要告诉MySQL, 跳过 这部分一样的GTID,继续复制,即可。

3.6.2. 处理复制错误

这里的 跳过 的方法很巧妙,步骤如下

  • 1.将Slave上的 gtid_next 指向 执行失败 的那个 gtid

    • 这里执行失败的 gtid 报错信息中已经给出: '9dc847d8-bf72-11e7-9ec4-000c2998e4f1:73-74'
    • 如果不看报错信息,可以看 Retrieved_Gtid_Set 和 Executed_Gtid_Set 的对比结果
  • 2.执行一个空的事物,即begin;commit;

    • 这样就把 失败的gtid 对应到了一个 空的事务 上,这个步骤即为 “跳过” 的意思
  • 3.将 gtid_next 设置(还原)为 automatic

操作:
- slave节点

mysql root@localhost:(none)> select @@gtid_next;    -- 当前为默认值,AUTOMATIC
+-------------+
| @@gtid_next |
+-------------+
| AUTOMATIC   |
+-------------+
1 row in set
Time: 0.010s

-- 步骤1 : 设置 gtid_next 为回放失败的gtid
mysql root@localhost:(none)> set  gtid_next="9dc847d8-bf72-11e7-9ec4-000c2998e4f1:74";
Query OK, 0 rows affected
Time: 0.001s

-- 步骤2 : 执行一个空的事物,让回放失败的gtid对应到这个空的事物
mysql root@localhost:(none)> begin;
Query OK, 0 rows affected
Time: 0.000s
mysql root@localhost:(none)> commit;
Query OK, 0 rows affected
Time: 0.000s

-- -- 步骤3 : 还原gtid_next为automatic
mysql root@localhost:(none)> set  gtid_next="automatic";
Query OK, 0 rows affected
Time: 0.000s

-- -- 步骤4 : 重新启动slave
mysql root@localhost:(none)> stop slave;
Query OK, 0 rows affected
Time: 0.001s
mysql root@localhost:(none)> start slave;
Query OK, 0 rows affected
Time: 0.046s
mysql root@localhost:(none)> show slave status \G;
***************************[ 1. row ]***************************
Slave_IO_State                | Waiting for master to send event
Master_Host                   | 192.168.88.88
Master_User                   | repl
Master_Port                   | 3306
Connect_Retry                 | 60
Master_Log_File               | binlog.000014
Read_Master_Log_Pos           | 6195
Relay_Log_File                | node2-relay-bin.000005
Relay_Log_Pos                 | 357
Relay_Master_Log_File         | binlog.000014
Slave_IO_Running              | Yes
Slave_SQL_Running             | Yes
Replicate_Do_DB               |
Replicate_Ignore_DB           |
Replicate_Do_Table            |
Replicate_Ignore_Table        |
Replicate_Wild_Do_Table       |
Replicate_Wild_Ignore_Table   |
Last_Errno                    | 0
Last_Error                    |
Skip_Counter                  | 0
1 row in set
Time: 0.010s
mysql root@localhost:(none)>

skip_gtid_errors.sh 脚本

#!/bin/bash
# line V1.0
# mail:gczheng@139.com
# data:2018-08-30
# script_name:skip_gtid_errors.sh  

user=root
pass='iforgot'
host='localhost'

for((i=1;i<1000;i++))
do
#sql21=`mysql -u${user} -p${pass} -h${host} -e "show slave status\G;" 2>/dev/null|grep -v Last_Error | grep 'executing transaction'| awk '{print $21}'`

Errno=`mysql -u${user} -p${pass} -h${host} -e "show slave status\G;" 2>/dev/null|grep 'Last_SQL_Errno' | awk -F ":" '{print $2}'`

Seconds=`mysql -u${user} -p${pass} -h${host} -e "show slave status\G;" 2>/dev/null|grep 'Seconds_Behind_Master' | awk -F ":" '{print $2}'`

Last_seen_transaction=`mysql -u${user} -p${pass} -h${host} -Ne "select LAST_SEEN_TRANSACTION from performance_schema.replication_applier_status_by_worker" 2>/dev/null`

if [ $Errno -eq 1061 ] || [$Errno -eq 1062 ] || [$Errno -eq 1217 ] || [$Errno -eq 1050 ] 
   then
        echo "----------------------"
    echo $(date +'%Y-%m-%d %H:%M:%S')  && echo "Last_seen_transaction:${Last_seen_transaction}" &&  echo -e "Last_Errno:\033[31m ${Errno}\033[0m"
        mysql -u${user} -p${pass} -h${host} -e "STOP SLAVE;SET @@SESSION.GTID_NEXT='${Last_seen_transaction}';BEGIN; COMMIT;SET @@SESSION.GTID_NEXT = AUTOMATIC;START SLAVE;" 2>/dev/null 
    else
        sleep 3
    echo "----------------------" 
    echo  $(date +'%Y-%m-%d %H:%M:%S') && echo  "Seconds_Behind_Master: ${Seconds} " &&  echo -e "Last_Errno:\033[32m ${Errno}\033[0m"
fi
done

1166598-20180830105643699-355159235.png

3.6.3. 测试复制

  • 1.Master端插入一个测试数据

    • **master节点*
mysql root@localhost:ttt> insert into t1 values(null);
Query OK, 1 row affected
Time: 0.007s
mysql root@localhost:ttt>
  • 2.查看Slave2上的数据是否同步

    • **slave节点*
mysql root@localhost:(none)> select * from ttt.t1;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |   -- 立即同步过来的数据
+---+
5 rows in set
Time: 0.011s
mysql root@localhost:(none)>

至此,GTID复制出错的处理就完成了。

注意:这里仅仅是 跳过错误 ,和原来的 sql_slave_skip_counter (该功能在GTID下失效)功能类似,无法保证主从数据是一致的(需要人工介入进行确认,比如仅仅主键一样,其他列不一样)

如果出现了很多的GTID的错误,可能是从机上有大量的操作,建议 重新搭建主从复制
但还是要在源头上避免此类情况的发生,确保在从机上开启 read_only=1 ,并且避免人工的误操作。 最后强调: GTID基于事务 的复制,一致性要求很高, 强烈建议 在 Slave 上开启 read_only=1

3.7. 其他注意事项

  1. reset slave all
    使用该命令时 不会清空数据 ,仅仅是清空 show slave status\G里面的信息,所以在使用该命令之前,请 先记录show slave status\G的信息。
  2. GTID
    在开启GTID后,不能在一个事物中使用创建临时表的语句,需要使得autocommit=1; 才可以。
    在开启GTID后,不能使用 create table select ...的语法来创建表了,因为这其实是多个事物了,GTID没法对应

四. 级联复制

4.1. 级联复制的介绍

1166598-20180228135032290-1447539789.png

slave Bmaster A 上复制, slave Csalve B 上复制,此时 slave B 上就要开启 log_slave_updates 产生binlog ,没有 binlog 则无法传递给
salve c

master 配置log_bin参数是当变更时产生binlog, 复制关系中slave获取master的binlog写入relay-log中,slave自身不产生binlog

4.2. 级联复制的场景

4.2.1. 跨机房的复制

1166598-20180228121046261-750665349.png

在跨机房搭建复制时,如果 master 挂了,深圳slave 提升为 New Master ,此时 北京slave 是不需要去做 CHANGE MASTER 操作的。

缺点是复制的延迟会更大(跨机房的延迟本来就很难避免)。

4.2.2. 库的拆分

1166598-20180228142610593-1776077534.png

生产库 压力很大时或者需要做统计分析时,需要把 逻辑库拆分出去,就可以使用 级联复制 ,让 逻辑库 形成单独的库。

4.3. 级联复制测试

1166598-20180228144053428-1001737392.png

配置信息主库(master)从库(slave A)从库(slave B)从库(slave C)
主机node1.gczheng.comnode2.gczheng.comnode3.gczheng.comproxy.gczheng.com
IP192.168.88.88192.168.88.99192.168.88.100192.168.88.77
Port3306330633063306
MySQL版本MySQL5.7.18MySQL5.7.18MySQL5.7.18MySQL5.7.18
Server_ID88888899881008877
  • Master --> Slave A --> Slave C 是级联复制,Master到Slave A 和 Master到Slave B 之前已部署完,直接做 Slave A到Slave C 的级联复制

  • 1、在Master端新建一个DB名为 mts库 ,用于在 Slave3 上测试参数 replicate_do_db

    • master节点
(root@localhost) 14:54:28 [tablespace]> create database mts;
Query OK, 1 row affected (0.03 sec)

(root@localhost) 14:54:33 [tablespace]> use mts;
Database changed

(root@localhost) 14:55:34 [mts]> create table t(a int);
Query OK, 0 rows affected (0.07 sec)

(root@localhost) 14:55:38 [mts]> insert into t values(1),(2),(3);
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

(root@localhost) 14:56:10 [mts]> select * from mts.t;
+------+
| a    |
+------+
|    1 |
|    2 |
|    3 |
+------+
3 rows in set (0.00 sec)

(root@localhost) 14:56:22 [mts]>
  • 2、在Slave C上增加如下参数
[mysqld]
report-host=192.168.88.77 
# 只复制mts的库 
replicate_do_db=mts
  • 3、确认在slave A 已添加log_slave_updates,并建立通用账号'repl'@'192.168.88.%'
[mysqld]
log_slave_updates=1
  • 4、change master

    • Slave C 节点
(root@localhost) 16:19:58 [mts]> change master to master_host="192.168.88.99", master_port=3306, master_user='repl', master_password='123456', master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.02 sec)

(root@localhost) 16:20:31 [mts]> start slave;
Query OK, 0 rows affected (0.03 sec)

(root@localhost) 16:20:38 [mts]> show slave status \G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.88.99
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000035
          Read_Master_Log_Pos: 2707
               Relay_Log_File: proxy-relay-bin.000002
                Relay_Log_Pos: 405
        Relay_Master_Log_File: binlog.000035
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: mts
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 2707
              Relay_Log_Space: 612
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 8899
                  Master_UUID: 1951c7ee-c1fb-11e7-ac44-005056305232
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set: 1951c7ee-c1fb-11e7-ac44-005056305232:1-17,
9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-29:47-58
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)

ERROR:
No query specified

(root@localhost) 16:20:44 [mts]>

复制关系:Master-->Slave A-->Slave C ,且使用了基于GTID的复制


五.多源复制

5.1. 多源复制的介绍

多源复制官方文档

多源复制 是指多主一从的复制,从 MySQL5.7.6 才有的功能,如下图所示:

1166598-20180301111457423-764450486.png

在语法层面上,只是在原来的change master 的基础上,增加了 for channel 'channel_name'

5.2. 多源复制的演示

主机信息:

配置信息主库(master A)主库(master B)从库(slave )
主机node1.gczheng.comnode2.gczheng.comnode3.gczheng.com
IP192.168.88.88192.168.88.99192.168.88.100
Port330633063306
MySQL版本MySQL5.7.18MySQL5.7.18MySQL5.7.18
Server_ID8888889988100

master A 和 master B 统一创建同步user `repl@'192.168.88.%'' password为123456

    1. master准备数据
    • master A 节点
mysql root@localhost:(none)> create database dbn1;
Query OK, 1 row affected
Time: 0.011s
】
mysql root@localhost:(none)> use dbn1
You are now connected to database "dbn1" as user "root"
Time: 0.007s

mysql root@localhost:dbn1> create table tn1(a int auto_increment primary key);
Query OK, 0 rows affected
Time: 0.062s

mysql root@localhost:dbn1> insert into dbn1.tn1 values(null);
Query OK, 1 row affected
Time: 0.068s

mysql root@localhost:dbn1> insert into dbn1.tn1 values(null);
Query OK, 1 row affected
Time: 0.014s

mysql root@localhost:dbn1> select * from dbn1.tn1;
+---+
| a |
+---+
| 1 |
| 2 |
+---+
2 rows in set
Time: 0.010s

mysql root@localhost:dbn1> show master status \G;
***************************[ 1. row ]***************************
File              | binlog.000014
Position          | 3953
Binlog_Do_DB      |
Binlog_Ignore_DB  |
Executed_Gtid_Set | 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-64 --执行到64
1 row in set
Time: 0.007s
mysql root@localhost:dbn1>
- **master B 节点**
mysql gcdb@localhost:(none)> create database dbn2;
Query OK, 1 row affected
Time: 0.010s

mysql gcdb@localhost:(none)> use dbn2;
You are now connected to database "dbn2" as user "gcdb"
Time: 0.015s

mysql gcdb@localhost:dbn2> create table tn2(a int auto_increment primary key);
Query OK, 0 rows affected
Time: 0.061s

mysql gcdb@localhost:dbn2> insert into dbn2.tn2 values(null);
Query OK, 1 row affected
Time: 0.015s

mysql gcdb@localhost:dbn2> insert into dbn2.tn2 values(null);
Query OK, 1 row affected
Time: 0.003s

mysql gcdb@localhost:dbn2> insert into dbn2.tn2 values(null);
Query OK, 1 row affected
Time: 0.003s

mysql gcdb@localhost:dbn2> insert into dbn2.tn2 values(null);
Query OK, 1 row affected
Time: 0.002s

mysql gcdb@localhost:dbn2> select * from  dbn2.tn2;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
| 4 |
+---+
4 rows in set
Time: 0.012s

mysql gcdb@localhost:dbn2> show master status \G;
***************************[ 1. row ]***************************
File              | binlog.000001
Position          | 1687
Binlog_Do_DB      |
Binlog_Ignore_DB  |
Executed_Gtid_Set | 1951c7ee-c1fb-11e7-ac44-005056305232:1-7    --执行到7
1 row in set
Time: 0.012s
mysql gcdb@localhost:dbn2>
  • 2.mysqldump 备份导出导入

    • Slave 节点
[root@node3 ~]# mysqldump -ugcdb -piforgot -h192.168.88.99 -P 3306 -B dbn2 >/tmp/dbn2.sql
[root@node3 ~]# mysqldump -ugcdb -piforgot -h192.168.88.88 -P 3306 -B dbn1 >/tmp/dbn1.sql

[root@node3 tmp]# less dbn2.sql

-----------省略其他输出------------

--在恢复备份的时候,会自动执行该语句,就不需要我们手工跳过了,如果是mydumper之类的,需要手工跳过

SET @@GLOBAL.GTID_PURGED='1951c7ee-c1fb-11e7-ac44-005056305232:1-7';

------------省略其他输出------------

[root@node3 tmp]# mysql -uroot -piforgot < /tmp/dbn2.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1840 (HY000) at line 24: @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.

--还是之前的问题,需要在Slave上,先操作一下 reset master ,以清空 @@GLOBAL.GTID_EXECUTED 

[root@node3 tmp]# mysql -uroot -piforgot < /tmp/dbn2.sql
mysql: [Warning] Using a password on the command line interface can be insecure.

[root@node3 tmp]# mysql -uroot -piforgot < /tmp/dbn1.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1840 (HY000) at line 24: @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.

[root@node3 tmp]# mysql -uroot -piforgot < /tmp/dbn1.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
[root@node3 tmp]#
  • 3 change master

先在Slave上设置需要复制的库,如果不设置的话,默认会同步系统的库(mysql),这样可能会复制出错(因为master amaster b上的mysql库中可能有相同的记录)。

  • Slave 节点
[mysqld]

# 只复制dbn1 和 dbn2,这个步骤很重要,且有多个时,必须分多行写

replicate_do_db=dbn1 
replicate_do_db=dbn2

#mysql5.7多源复制必须添加的参数(不加报错),5.7版本之前不用加

master_info_repository=TABLE
relay_log_info_repository=TABLE
(root@localhost) 13:57:46 [(none)]> change master to master_host='192.168.88.88', master_port=3306, master_user='repl', master_password='123456', master_auto_position=1 for channel 'ch1';
Query OK, 0 rows affected, 2 warnings (0.01 sec)

(root@localhost) 13:58:12 [(none)]> change master to master_host='192.168.88.99', master_port=3306, master_user='repl', master_password='123456', master_auto_position=1 for channel 'ch2';
Query OK, 0 rows affected, 2 warnings (0.01 sec)

(root@localhost) 13:59:19 [(none)]> start slave for channel 'ch1';
Query OK, 0 rows affected (0.00 sec)

(root@localhost) 13:59:59 [(none)]> start slave for channel 'ch2';
Query OK, 0 rows affected (0.00 sec)

(root@localhost) 14:39:13 [(none)]> show slave status for channel 'ch1' \G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.88.88
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000014
          Read_Master_Log_Pos: 4349
               Relay_Log_File: node3-relay-bin-ch1.000002
                Relay_Log_Pos: 405
        Relay_Master_Log_File: binlog.000014
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: dbn1,dbn2
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 4349
              Relay_Log_Space: 616
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 8888
                  Master_UUID: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-66
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: ch1
           Master_TLS_Version:
1 row in set (0.00 sec)

ERROR:
No query specified

(root@localhost) 14:41:49 [(none)]> show slave status for channel 'ch2' \G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.88.99
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000001
          Read_Master_Log_Pos: 2083
               Relay_Log_File: node3-relay-bin-ch2.000002
                Relay_Log_Pos: 801
        Relay_Master_Log_File: binlog.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: dbn1,dbn2
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 2083
              Relay_Log_Space: 1012
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 8899
                  Master_UUID: 1951c7ee-c1fb-11e7-ac44-005056305232
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 1951c7ee-c1fb-11e7-ac44-005056305232:8-9
            Executed_Gtid_Set: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-66
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: ch2
           Master_TLS_Version:
1 row in set (0.00 sec)

ERROR:
No query specified

(root@localhost) 14:41:54 [(none)]>
  • 验证

    • master A 节点
mysql root@localhost:(none)> insert into dbn1.tn1 values(null);
Query OK, 1 row affected
Time: 0.013s
mysql root@localhost:(none)> insert into dbn1.tn1 values(null);
Query OK, 1 row affected
Time: 0.012s
mysql root@localhost:(none)>
  • master B 节点
mysql root@localhost:(none)> insert into dbn2.tn2 values(null);
Query OK, 1 row affected
Time: 0.003s
mysql root@localhost:(none)> insert into dbn2.tn2 values(null);
Query OK, 1 row affected
Time: 0.003s
mysql root@localhost:(none)>
  • slave A 节点
(root@localhost) 14:41:54 [(none)]> select * from dbn1.tn1;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
| 4 |
+---+
4 rows in set (0.00 sec)

(root@localhost) 14:45:14 [(none)]> select * from dbn2.tn2;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
+---+
6 rows in set (0.00 sec)

(root@localhost) 14:45:17 [(none)]>

至此,多源复制的主从搭建就完成了。

5.3.使用场景

如果 Master B上也有一个 dbn1 的库,会有问题么?

  • 如果不做额外的配置,是 会有错误 的;

  • 如果配置了 slave_skip_errors = ddl_exist_errors ,且 没有重复数据 话,复制关系还是正常的。

    这种操作可以起到 数据聚合的效果。将分库分表后的数据聚合在一起,以供其他应用进行分析(前提是数据不能有重复)。
    但是最合适的场景还是将不同的库进行复制。

  • 中间件的unique key

中间件可以保证分区键是唯一的(比如order_id),但是对于其他唯一索引来说,需要业务层去保证

转载于:https://www.cnblogs.com/gczheng/p/8480210.html

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值