027：高级复制

MySQL复制技术详解

最新推荐文章于 2025-08-30 14:55:44 发布

转载最新推荐文章于 2025-08-30 14:55:44 发布 · 126 阅读

0 ·

CC 4.0 BY-SA版权

原文链接：http://www.cnblogs.com/gczheng/p/8480210.html

一.半同步和无损复制
二. 并行复制（Multi-Threaded Slave）
- 2.1. MTS介绍
- 2.2. 动态调整复制线程数
三. GTID
四. 级联复制
五.多源复制

一.半同步和无损复制

主从复制基本上都是 异步复制, Master并不关Slave节点有没有获取到数据，所以复制效率很高，但是数据有可能会丢失。

从 MySQL5.5 开始，MySQL推出了semi-sync replication （半同步复制）
- 至少有一个Slave节点收到binlog后再返回（ IO线程接收到即可）
- 减少数据丢失风险
- 不能完全避免数据丢失
- 超时后，切换回异步复制
从 MySQL5.7.2 开始，MySQL推出了lossless semi-sync replication （无损复制）
- 二进制日志（binlog）先写远程（ IO线程接收到即可）
- 可保证数据完全不丢失

1.1. loss less / semi-sync replication插件安装

1、手工安装

mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';

2、写入配置文件

[mysqld]
plugin_dir=/usr/local/mysql/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"

上述操作仅仅是加载了插件，还 未启动 对应的功能，需要配置额外的参数：

[mysqld]
# 等同于 rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_master_enabled = 1
# 等同于 rpl_semi_sync_slave_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1
# 超时5秒后，则切换回异步方式
loose_rpl_semi_sync_master_timeout = 5000

使用 loose_ 前缀表示如果没有加载 semi_sync 的插件，则 忽略该参数
当Slave在Timeout后，又追上了Master了（ IO线程），则会 自动切换回半同步复制

注意：半同步复制 / 无损复制 在 主从 上都要 安装插件和开启功能。

注意：要保证任意时刻发生一台机器宕机都不丢失数据的前提是 master sync_binlog设置为1,slave sync_relay_log设置为1。

1.2. semi-sync replication

semi-sync replication 称为 半同步复制 ，在一个事务 提交（commit） 的过程时，在 InnoDB 层的 commit log 步骤后，Master节点需要收到 至少一个 Slave节点回复的 ACK （表示收到了binlog ）后，才能继续下一个事务；
如果在一定时间内（Timeout）内 没有收到ACK ，则 切换为异步模式 ，具体流程如下：

对应的配置参数如下：

[mysqld]
# 开启主的半同步复制
rpl_semi_sync_master_enabled=1
# 开启从的半同步复制
rpl_semi_sync_slave_enabled=1
# 超时1秒，切回异步
rpl_semi_sync_master_timeout=1000
# 至少收到 1 个 slave发回的ack
rpl_semi_sync_master_wait_for_slave_count=1

1.3. loss less semi-sync replication

loss less semi-sync replication 称为 无损复制 ，在一个事务提交（commit）的过程时，在 MySQL 层的write binlog 步骤后，Master节点需要收到 至少一个 Slave节点回复的 ACK （表示收到了binlog ）后，才能继续下一个事务；
如果在一定时间内（Timeout）内 没有收到ACK ，则 切换为异步模式 ，具体流程如下：

对应的配置参数如下：

[mysqld]
# 开启主的半同步复制
rpl_semi_sync_master_enabled=1
# 开启从的半同步复制
rpl_semi_sync_slave_enabled=1
# 超时1秒，切回异步
rpl_semi_sync_master_timeout=1000
[mysqld57]
# 控制 半同步复制 还是 无损复制 的参数
# - AFTER_SYNC 表示的是无损复制；（5.7 默认）
# - AFTER_COMMIT 表示的是半同步复制；
rpl_semi_sync_master_wait_point=AFTER_SYNC
# 至少收到 1 个 slave发回的ack
rpl_semi_sync_master_wait_for_slave_count=1

1.4. 半同步复制与无损复制的对比

ACK的时间点不同
- 半同步复制在InnoDB层的Commit Log后等待ACK，主从切换会有数据丢失风险。
- 无损复制在MySQL Server层的Write binlog后等待ACK，主从切换会有数据变多风险。
主从数据一致性
- 半同步复制意味着在Master节点上，这个刚刚提交的事务对数据库的修改，对其他事务是可见的。因此，如果在等待Slave ACK的时候crash了，那么会对其他事务出现幻读，数据丢失。
- 无损复制在write binlog完成后，就传输binlog，但还没有去写commit log，意味着当前这个事务对数据库的修改，其他事务也是不可见的。因此，不会出现幻读，数据丢失风险。
- 因此5.7.2引入了无损复制（after_sync）模式，带来的主要收益是解决 after_commit导致的master crash后数据丢失问题，因此在引入after_sync模式后，所有提交的数据已被复制，故障切换时数据一致性将得到提升。

参考- MySQL半同步复制数据一致性分析

1.5. 演示无损/半同步复制

master server

mysql root@localhost:employees> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
Query OK, 0 rows affected
Time: 0.063s

mysql root@localhost:employees>  INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
Query OK, 0 rows affected
Time: 0.005s

mysql root@localhost:employees> set global rpl_semi_sync_master_enabled = 1;
Query OK, 0 rows affected
Time: 0.020s

mysql root@localhost:employees> set global rpl_semi_sync_master_timeout = 5000;
Query OK, 0 rows affected
Time: 0.001s

mysql root@localhost:employees> show global status like "%rpl%";
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 0     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 0     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |   -- status ok
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 0     |
| Rpl_semi_sync_master_tx_wait_time          | 0     |
| Rpl_semi_sync_master_tx_waits              | 0     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 0     |
| Rpl_semi_sync_slave_status                 | OFF   |
+--------------------------------------------+-------+
15 rows in set
Time: 0.013s
mysql root@localhost:employees>

slave server

(root@localhost) 09:59:14 [employees]> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
Query OK, 0 rows affected (0.05 sec)

(root@localhost) 18:07:15 [employees]> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
Query OK, 0 rows affected (0.00 sec)

(root@localhost) 18:07:29 [employees]> set global rpl_semi_sync_slave_enabled = 1;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) 18:14:59 [employees]> show global variables like '%semi%';
+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | OFF        |
| rpl_semi_sync_master_timeout              | 10000      |
| rpl_semi_sync_master_trace_level          | 32         |
| rpl_semi_sync_master_wait_for_slave_count | 1          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
| rpl_semi_sync_slave_enabled               | ON         |
| rpl_semi_sync_slave_trace_level           | 32         |
+-------------------------------------------+------------+
8 rows in set (0.02 sec)

半复制切换异步同步过程的状态

master server

(root@localhost) 10:11:10 [tablespace]>  show global status like "%rpl%";
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |  -- 半同步复制的client数量
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |  
| Rpl_semi_sync_master_net_waits             | 1     |  -- master总的等待slave的次数
| Rpl_semi_sync_master_no_times              | 0     |  -- 切成异步的次数（no = number of）
| Rpl_semi_sync_master_no_tx                 | 0     |  -- 切成异步后提交的事物数
| Rpl_semi_sync_master_status                | ON    |  -- 半同步复制的状态
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 2209  |  -- master等待事物的平均时间
| Rpl_semi_sync_master_tx_wait_time          | 2209  |  -- master等待事物的总的时间
| Rpl_semi_sync_master_tx_waits              | 1     |  -- master等待事物的次数
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 1     |
| Rpl_semi_sync_slave_status                 | OFF   |
+--------------------------------------------+-------+
15 rows in set (0.03 sec)

slave server

(root@localhost) 13:47:48 [tablespace]> stop slave io_thread;  --停掉IO线程
Query OK, 0 rows affected (0.05 sec)

master server

(root@localhost) 14:48:11 [tablespace]>  show global status like "%rpl%";
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 1     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 2209  |
| Rpl_semi_sync_master_tx_wait_time          | 2209  |
| Rpl_semi_sync_master_tx_waits              | 1     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 1     |
| Rpl_semi_sync_slave_status                 | OFF   |
+--------------------------------------------+-------+
15 rows in set (0.00 sec)

(root@localhost) 14:48:14 [tablespace]> insert into qqq values(99);  --插入sql
Query OK, 1 row affected (5.04 sec)    -- 等待5秒后，切成异步

(root@localhost) 14:49:28 [tablespace]>  show global status like "%rpl%";
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 0     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 1     | 
| Rpl_semi_sync_master_no_times              | 1     |  --切成异步的次数
| Rpl_semi_sync_master_no_tx                 | 1     |  --切成异步后的事物数
| Rpl_semi_sync_master_status                | OFF   |  --status 为off
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 2209  |
| Rpl_semi_sync_master_tx_wait_time          | 2209  |
| Rpl_semi_sync_master_tx_waits              | 1     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 1     |
| Rpl_semi_sync_slave_status                 | OFF   |
+--------------------------------------------+-------+
15 rows in set (0.00 sec)

(root@localhost) 14:49:57 [tablespace]>

1.6. 两种复制方式的性能

备注：上图是Facebook的测试性能图；其中Y轴是QPS，X轴是并发数

蓝色的 Normal Slave 是 异步复制
- 性能很好，但是随着并发数的增长，性能有所下降
绿色的 Enhanced mysqlbinlog 是 无损复制
- 随着并发数的增长，性能几乎是线性增长的，在高并发下，性能会优于异步复制
紫色的 Normal Semi Slave 是半同步复制
- 性能较低

无损复制性能优于半同步复制的原因

等待ACK回包问题上，其实两种复制的开销是一样的，没有区别，都是网络的等待开销。
无损复制由于在 write binlog （commit 的第二步）后，需要等待ACK，后续的事务无法提交，这样就 堆积 了N多个需要落盘的事务（半同步复制由于已经提交了事务，没有堆积事务的效果），通过 组提交 机制一次 fsync的多个事务（半同步复制也有组提交，只是一次 fsync 的事务数没那么多），相当于提高了I/O性能 ;所以线程（事务）越多，效果越明显，以至于有上图中超过异步复制的效果。（无损复制的组提交比例比原版的高3~4倍）

产生上述测试效果的前提：测试用例是 IO Bound 的（比如数据量有 100G，而 buffer pool 只有 10G），且并发数足够多。

下面这两个参数不要去设置，设置了反而性能差

(root@localhost) 10:10:47 [tablespace]> show variables like "%binlog_group%";
+-----------------------------------------+-------+
| Variable_name                           | Value |
+-----------------------------------------+-------+
| binlog_group_commit_sync_delay          | 0     |
| binlog_group_commit_sync_no_delay_count | 0     |   -- 等待一组里面有多少事务我才提交
+-----------------------------------------+-------+
2 rows in set (0.01 sec)

(root@localhost) 10:10:50 [tablespace]> show variables like "%binlog_max%";
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| binlog_max_flush_queue_time | 0     |               -- 等待多少时间后才进行组提交
+-----------------------------+-------+
1 row in set (0.00 sec)

(root@localhost) 10:11:10 [tablespace]>

1.7. rpl_semi_sync_master_wait_for_slave_count

该参数控制Master在收到 多少个 Slave的ACK 后，才可以继续commit。配置多个ACK和配置一个ACK的效果是类似的，因为他们是 并行执行 的（理论上来说不会有两倍的等待时间），取决于最慢的那个。

二. 并行复制（Multi-Threaded Slave）

2.1. MTS介绍

在官方文档中，并行复制的叫法为 Multi-Threaded Slave (MTS)

MySQL的并行复制基于组提交：

一个组提交中的事务都是可以并行执行的，因为既然处于组提交中，这意味着事务之间没有冲突（不会去更新同一行数据），否则不可能在同一个组里面。
Slave上开启并行复制，需要在配置文件中增加以下参数：

[mysqld]
slave-parallel-type=LOGICAL_CLOCK   
slave_preserve_commit_order=1
slave-parallel-workers=4

slave-parallel-type 参数
- DATABASE 基于库级别的并行复制，如果只有一个库，就还是串行（为了兼容5.6）。
- LOGICAL_CLOCK 基于逻辑时钟，主上怎么并行执行，从上也是怎么并行回放的。
slave-parallel-workers 并行复制的线程数，一般设置为一个组内提交的事务数，线上设置为32足够了
slave_preserve_commit_order Slave上commit的顺序保持一致，必须为1，否则可能会有GAP锁产生

2.2. 动态调整复制线程数

配置并行复制后，Slave节点可以看到4个 Coordinator 线程

mysql root@localhost:(none)> show processlist;
+----+-------------+-----------+--------+---------+--------+--------------------------------------------------------+------------------+
| Id | User        | Host      | db     | Command | Time   | State                                                  | Info             |
+----+-------------+-----------+--------+---------+--------+--------------------------------------------------------+------------------+
| 1  | system user |           | <null> | Connect | 148155 | Waiting for master to send event                       | <null>           |
| 2  | system user |           | <null> | Connect | 57792  | Slave has read all relay log; waiting for more updates | <null>           |
| 3  | system user |           | <null> | Connect | 57791  | Waiting for an event from Coordinator                  | <null>           |
| 4  | system user |           | <null> | Connect | 148155 | Waiting for an event from Coordinator                  | <null>           |
| 7  | system user |           | <null> | Connect | 148155 | Waiting for an event from Coordinator                  | <null>           |
| 8  | system user |           | <null> | Connect | 148155 | Waiting for an event from Coordinator                  | <null>           |
| 12 | root        | localhost | <null> | Query   | 0      | starting                                               | show processlist |
+----+-------------+-----------+--------+---------+--------+--------------------------------------------------------+------------------+
7 rows in set
Time: 0.019s
mysql root@localhost:(none)>

-- 动态调整方式如下：

mysql root@localhost:(none)> set global slave_parallel_workers=8;
Query OK, 0 rows affected
Time: 0.003s
mysql root@localhost:(none)> stop slave;  一定要重启一下slave才能有效
Query OK, 0 rows affected
Time: 0.038s
mysql root@localhost:(none)> start slave;
Query OK, 0 rows affected
Time: 0.080s
mysql root@localhost:(none)> show processlist;
+----+-------------+-----------+--------+---------+------+--------------------------------------------------------+------------------+
| Id | User        | Host      | db     | Command | Time | State                                                  | Info             |
+----+-------------+-----------+--------+---------+------+--------------------------------------------------------+------------------+
| 12 | root        | localhost | <null> | Query   | 0    | starting                                               | show processlist |
| 13 | system user |           | <null> | Connect | 3    | Waiting for master to send event                       | <null>           |
| 14 | system user |           | <null> | Connect | 3    | Slave has read all relay log; waiting for more updates | <null>           |
| 15 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
| 16 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
| 17 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
| 18 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
| 19 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
| 20 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
| 21 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
| 22 | system user |           | <null> | Connect | 3    | Waiting for an event from Coordinator                  | <null>           |
+----+-------------+-----------+--------+---------+------+--------------------------------------------------------+------------------+
11 rows in set
Time: 0.019s
mysql root@localhost:(none)>

特别注意：

这里的 并行复制 指的是 SQL Thread （回放线程），而非IO Thread （IO线程）
Waiting for master to send event 这个 State 在 show processlist 中只有一个，即只有一个 IO Thread

线上环境可以配置成两台Slave做无损复制（保证数据不丢），其他的Slave做异步复制（配置为只读，用于负载均衡），都指向同一台Master。

三. GTID

3.1. GTID的介绍

1.Global Transaction Id entifier -- 全局事物ID
2.GTID = Server_UUID + Transaction_ID
- Server_UUID 是全局唯一的
- Transaction_ID 是自增的
3.GTID 的作用是替代 Filename + Position

(root@localhost) 14:54:25 [tablespace]> show variables like "server_uuid";
+---------------+--------------------------------------+
| Variable_name | Value                                |
+---------------+--------------------------------------+
| server_uuid   | 9dc847d8-bf72-11e7-9ec4-000c2998e4f1 |
+---------------+--------------------------------------+
1 row in set (0.01 sec)

在MySQL中看到的 UUID ，实际是保存在 $DATADIR/auto.cnf 中的，且该文件是服务器初始化的时候自动生成的。

[root@node1 mysqldata]# cat /r2/mysqldata/auto.cnf
[auto]
server-uuid=9dc847d8-bf72-11e7-9ec4-000c2998e4f1
[root@node1 mysqldata]#

通过 冷备 做备份，拷贝 $DATADIR 时，记得要把备份中的 auto.cnf 给删除、

3.2. GTID的意义

未使用GTID
- 当 Master宕机 后，一个 Slave 被选举提升为New Master，如果需要重建复制关系，就需要把另外两个Slave的 CHANGE MASTER 指向 New Master ；
- 那问题来了，原来Slave是指向 Master 的 Filename_M + Position_M 的位置，现在要指向 New Master 上新的 Filename_N + Position_N 的位置，由于基于二进制日志复制 ，定位两个位置步骤繁琐; 此时两个Slave要继续重建复制关系（CHANGE MASTER）会比较麻烦。
使用GTID
- 和上面一样的场景，选举机制提升为New Master的GTID 执行到最新事务， 两个Slave需要重新指向 New Master ，由于 使用了GTID ，目前 Slave-A 获取到的事务对应GTID为GTID_A ， Slave-B 获取到的事务对应GTID为GTID_B;
- 此时 New Master 上GTID是存在 GTID_A 和 GTID_B，那两个Slave就可以直接使用 GTID_A 和 GTID_B 这两个GTID，通过指向 New Master接着重建复制；

3.3. GTID的配置

[mysqld]
log_bin = binlog
log_slave_updates = 1
gtid_mode = ON 
enforce-gtid-consistency = true

注意：
- 1.MySQL5.6 必须开启参数 log_slave_updates （5.6版本的限制）
- 2.MySQL5.6 升级到gtid模式需要停机重启
- 3.MySQL5.7 版本开始可以不开启 log_slave_updates
- 4.MySQL5.7.6 版本开始可以在线升级成gtid模式

3.4. 基于GTID的复制

3.4.1. 配置基于GTID的复制

配置信息	主库（master）	从库(slave)
主机	node1.gczheng.com	node3.gczheng.com
IP	192.168.88.88	192.168.88.100
Port	3306	3306
MySQL版本	MySQL5.7.18	MySQL5.7.18
Server_ID	8888	88100

数据库的备份和还原步骤略过，这里使用 mydumper备份，master上创建账号repl@'192.168.88.100',密码和权限同repl@'192.168.88.99'。

1.master节点的配置文件加入

[mysqld]
gtid_mode = ON 
enforce-gtid-consistency = true

2.slave 节点加入

[mysqld]
gtid_mode = ON 
enforce-gtid-consistency = true

3.查看metadata信息

[root@node3 alldb]# cat metadata
Started dump at: 2018-02-15 18:53:58                    --dump时间
SHOW MASTER STATUS:
    Log: binlog.000014                                  --Filename
    Pos: 1361                                           --Pos
    GTID:9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-52      --GTID时间

注意：metadata中的 GTID:9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-52 表示为：这个GTID对应的 事物 已在备份中了，Slave还原备份后，进行复制时要跳过这个GTID即对应的事物。

3.4.2. 跳过GTIDs

Slave节点

如果备份使用 mysqldump 进行备份，请跳过该步骤，因为在备份中已经存在下面这个语句：

--
-- GTID state at the beginning of the backup 
--
SET @@GLOBAL.GTID_PURGED='9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-52';

因为我们使用 mydumper 进行备份， myloader还原并不会帮我们执行上述语句，需要手工执行该语句，让 Slave 知道这个GTID需要跳过：

mysql> SET @@GLOBAL.GTID_PURGED='9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-52';
ERROR 1840 (HY000): @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.
-- 这里报出一个错误，意思为如果要设置 @@GLOBAL.GTID_PURGED ，则 @@GLOBAL.GTID_EXECUTED 必须为空。

需要使用 reset master 命令，将 @@GLOBAL.GTID_EXECUTED清空。

(root@localhost) 09:22:38 [tablespace]> reset master;
Query OK, 0 rows affected (0.07 sec)

(root@localhost) 09:23:18 [tablespace]> SET @@GLOBAL.GTID_PURGED='9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-52';
Query OK, 0 rows affected (0.02 sec)

至此，我们已经跳过了备份中存在的GTIDs所对应的事物

3.4.3. CHANGE MASTER

Slave节点

(root@localhost) 10:33:35 [tablespace]> change master to master_host="192.168.88.88", master_port=3306, master_user='repl',master_password='123456', master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.04 sec)

(root@localhost) 10:33:46 [tablespace]> start slave;
Query OK, 0 rows affected (0.03 sec)

(root@localhost) 10:58:50 [tablespace]> show slave status  \G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.88.88
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000014
          Read_Master_Log_Pos: 1893
               Relay_Log_File: node3-relay-bin.000007
                Relay_Log_Pos: 445
        Relay_Master_Log_File: binlog.000014
             Slave_IO_Running: Yes                          --IO线程ok
            Slave_SQL_Running: Yes                          --SQL线程ok
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1893
              Relay_Log_Space: 2598
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 8888
                  Master_UUID: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1
             Master_Info_File: /r2/mysqldata/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:29-54   --获取到GTID
            Executed_Gtid_Set: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:29-54   --执行到GTID
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)

ERROR:
No query specified

3.4.4. 复制完成

master节点

(root@localhost) 10:57:09 [tablespace]> show slave hosts;
+-----------+------+------+-----------+--------------------------------------+
| Server_id | Host | Port | Master_id | Slave_UUID                           |
+-----------+------+------+-----------+--------------------------------------+
|     88100 |      | 3306 |      8888 | 5d4e3453-0fc5-11e8-b387-000c29c9f498 |  --新建slave
|      8899 |      | 3306 |      8888 | 1951c7ee-c1fb-11e7-ac44-005056305232 |
+-----------+------+------+-----------+--------------------------------------+
2 rows in set (0.00 sec)

(root@localhost) 10:58:34 [tablespace]> insert into qqq values(101);  --插入a的值101
Query OK, 1 row affected (0.02 sec)

(root@localhost) 10:59:07 [tablespace]> select * from  qqq;
+------+
| a    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|   10 |
|   99 |
|  101 |
+------+
7 rows in set (0.00 sec)

slave节点

(root@localhost) 11:02:32 [tablespace]> select * from  qqq;
+------+
| a    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|   10 |
|   99 |
|  101 |  --slave同步到值101
+------+
7 rows in set (0.00 sec)

(root@localhost) 11:03:08 [tablespace]>

3.5. GTID 与 Filename-Pos的对应

在 binlog 中，多了一个 GTID 的 event ，如下所示

(root@localhost) 11:43:24 [tablespace]> show binlog events;
+---------------+-----+----------------+-----------+-------------+--------------------------------------------------------------------+
| Log_name      | Pos | Event_type     | Server_id | End_log_pos | Info                                                               |
+---------------+-----+----------------+-----------+-------------+--------------------------------------------------------------------+
| binlog.000001 |   4 | Format_desc    |     88100 |         123 | Server ver: 5.7.18-log, Binlog ver: 4                              |
| binlog.000001 | 123 | Previous_gtids |     88100 |         154 |                                                                    |
| binlog.000001 | 154 | Gtid           |      8888 |         219 | SET @@SESSION.GTID_NEXT= '9dc847d8-bf72-11e7-9ec4-000c2998e4f1:55' | -- 一个GTID 的 event
| binlog.000001 | 219 | Query          |      8888 |         282 | BEGIN                                                              |
| binlog.000001 | 282 | Table_map      |      8888 |         334 | table_id: 227 (tablespace.qqq)                                     |
| binlog.000001 | 334 | Write_rows     |      8888 |         374 | table_id: 227 flags: STMT_END_F                                    |
| binlog.000001 | 374 | Xid            |      8888 |         405 | COMMIT /* xid=1407 */                                              |
+---------------+-----+----------------+-----------+-------------+--------------------------------------------------------------------+
7 rows in set (0.00 sec)

(root@localhost) 11:43:39 [tablespace]>

通过扫描 binlog中的GTID值，就可以知道 GTID与Filename-Pos对应的关系，但是如果binlog非常大，扫描的量也是会很大的，所以在binlog开头部分有一个 Previous_gtids 的event，如下所示：

(root@localhost) 11:44:57 [tablespace]> show binlog events in 'binlog.000001' \G;
*************************** 1. row ***************************
   Log_name: binlog.000001
        Pos: 4
 Event_type: Format_desc
  Server_id: 88100
End_log_pos: 123
       Info: Server ver: 5.7.18-log, Binlog ver: 4
*************************** 2. row ***************************
   Log_name: binlog.000001
        Pos: 123
 Event_type: Previous_gtids      -- 表示在次之前，GTID运行到的范围是哪里
  Server_id: 88100
End_log_pos: 154
       Info:
*************************** 3. row ***************************
   Log_name: binlog.000001
        Pos: 154
 Event_type: Gtid
  Server_id: 8888
End_log_pos: 219
       Info: SET @@SESSION.GTID_NEXT= '9dc847d8-bf72-11e7-9ec4-000c2998e4f1:55'
*************************** 4. row ***************************
   Log_name: binlog.000001
        Pos: 219
 Event_type: Query
  Server_id: 8888
End_log_pos: 282
       Info: BEGIN
*************************** 5. row ***************************
   Log_name: binlog.000001
        Pos: 282
 Event_type: Table_map
  Server_id: 8888
End_log_pos: 334
       Info: table_id: 227 (tablespace.qqq)
*************************** 6. row ***************************
   Log_name: binlog.000001
        Pos: 334
 Event_type: Write_rows
  Server_id: 8888
End_log_pos: 374
       Info: table_id: 227 flags: STMT_END_F
*************************** 7. row ***************************
   Log_name: binlog.000001
        Pos: 374
 Event_type: Xid
  Server_id: 8888
End_log_pos: 405
       Info: COMMIT /* xid=1407 */
7 rows in set (0.01 sec)

ERROR:
No query specified

(root@localhost) 11:45:19 [tablespace]>

如果我要的GTID比Previous_gtids 的大，就扫描当前文件，反之则扫描之前的文件，依次类推。因为binlog在rotate（rotate events）的时候，是知道当前最大的GTID的，可以将该值写入到下一个新的binlog的开头，即 Previous_gtid

3.6. GTID复制出错的处理

演示人为的在从机上误操作导致的复制失败，如何恢复？

3.6.1. 演示

1.在从机上插入一条记录（模拟误操作）
- slave节点

现在 Slave 上插入一条记录（现实中如果配置了readonly，在app中是无法插入的，app不会给root权限）

mysql root@localhost:(none)> select * from ttt.t1;
+---+
| a |  --a是主键
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set
Time: 0.012s
mysql root@localhost:(none)> insert into ttt.t1 values(null);
Query OK, 1 row affected
Time: 0.003s
mysql root@localhost:(none)> select * from ttt.t1;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
| 4 |
+---+
4 rows in set
Time: 0.014s
mysql root@localhost:(none)>

2.在主机上插入同样的记录
- master节点

mysql root@localhost:ttt> select * from t1;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+
3 rows in set
Time: 0.009s
mysql root@localhost:ttt> insert into t1 values(null);
Query OK, 1 row affected
Time: 0.007s
mysql root@localhost:ttt>

3.查看 Slave上的状态
- slave节点

mysql root@localhost:(none)> show slave status \G;
***************************[ 1. row ]***************************
Slave_IO_State                | Waiting for master to send event
Master_Host                   | 192.168.88.88
Master_User                   | repl
Master_Port                   | 3306
Connect_Retry                 | 60
Master_Log_File               | binlog.000014
Read_Master_Log_Pos           | 6195
Relay_Log_File                | node2-relay-bin.000002
Relay_Log_Pos                 | 568
Relay_Master_Log_File         | binlog.000014
Slave_IO_Running              | Yes
Slave_SQL_Running             | No
Replicate_Do_DB               |
Replicate_Ignore_DB           |
Replicate_Do_Table            |
Replicate_Ignore_Table        |
Replicate_Wild_Do_Table       |
Replicate_Wild_Ignore_Table   |
Last_Errno                    | 1062
Last_Error                    | Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '9dc847d8-bf72-11e7-9ec4-000c2998e4f1:74' at master log binlog.000014, end_log_pos 6164. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Skip_Counter                  | 0
Exec_Master_Log_Pos           | 5944
Relay_Log_Space               | 1026
Until_Condition               | None
Until_Log_File                |
Until_Log_Pos                 | 0
Master_SSL_Allowed            | No
Master_SSL_CA_File            |
Master_SSL_CA_Path            |
Master_SSL_Cert               |
Master_SSL_Cipher             |
Master_SSL_Key                |
Seconds_Behind_Master         | <null>
Master_SSL_Verify_Server_Cert | No
Last_IO_Errno                 | 0
Last_IO_Error                 |
Last_SQL_Errno                | 1062 --错误代码1062，主键重复
Last_SQL_Error                | Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '9dc8
47d8-bf72-11e7-9ec4-000c2998e4f1:74' at master log binlog.000014, end_log_pos 6164. See error log and/or performance_schema.replication_applier_status_by_worker table for m
ore details about this failure or others, if any.
Replicate_Ignore_Server_Ids   |
Master_Server_Id              | 8888
Master_UUID                   | 9dc847d8-bf72-11e7-9ec4-000c2998e4f1
Master_Info_File              | mysql.slave_master_info
SQL_Delay                     | 0
SQL_Remaining_Delay           | <null>
Slave_SQL_Running_State       |
Master_Retry_Count            | 86400
Master_Bind                   |
Last_IO_Error_Timestamp       |
Last_SQL_Error_Timestamp      | 180219 16:07:45
Master_SSL_Crl                |
Master_SSL_Crlpath            |
Retrieved_Gtid_Set            | 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:73-74
Executed_Gtid_Set             | 1951c7ee-c1fb-11e7-ac44-005056305232:1,
9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-73  --回放到这个事务时报错
Auto_Position                 | 0
Replicate_Rewrite_DB          |
Channel_Name                  |
Master_TLS_Version            |

其实从数据一致性看，目前主从数据是一致的，只是复制过来的日志在回放时，发现已有了该部分数据（Error:1062），从而引发了复制异常（SQL回放线程停止）。我们只需要告诉MySQL， 跳过 这部分一样的GTID，继续复制，即可。

3.6.2. 处理复制错误

这里的 跳过 的方法很巧妙，步骤如下

1.将Slave上的 gtid_next 指向 执行失败 的那个 gtid
- 这里执行失败的 gtid 报错信息中已经给出： '9dc847d8-bf72-11e7-9ec4-000c2998e4f1:73-74'
- 如果不看报错信息，可以看 Retrieved_Gtid_Set 和 Executed_Gtid_Set 的对比结果
2.执行一个空的事物，即begin;commit;
- 这样就把 失败的gtid 对应到了一个 空的事务 上，这个步骤即为 “跳过” 的意思
3.将 gtid_next 设置（还原）为 automatic

操作：
- slave节点

mysql root@localhost:(none)> select @@gtid_next;    -- 当前为默认值，AUTOMATIC
+-------------+
| @@gtid_next |
+-------------+
| AUTOMATIC   |
+-------------+
1 row in set
Time: 0.010s

-- 步骤1 ： 设置 gtid_next 为回放失败的gtid
mysql root@localhost:(none)> set  gtid_next="9dc847d8-bf72-11e7-9ec4-000c2998e4f1:74";
Query OK, 0 rows affected
Time: 0.001s

-- 步骤2 : 执行一个空的事物，让回放失败的gtid对应到这个空的事物
mysql root@localhost:(none)> begin;
Query OK, 0 rows affected
Time: 0.000s
mysql root@localhost:(none)> commit;
Query OK, 0 rows affected
Time: 0.000s

-- -- 步骤3 : 还原gtid_next为automatic
mysql root@localhost:(none)> set  gtid_next="automatic";
Query OK, 0 rows affected
Time: 0.000s

-- -- 步骤4 : 重新启动slave
mysql root@localhost:(none)> stop slave;
Query OK, 0 rows affected
Time: 0.001s
mysql root@localhost:(none)> start slave;
Query OK, 0 rows affected
Time: 0.046s
mysql root@localhost:(none)> show slave status \G;
***************************[ 1. row ]***************************
Slave_IO_State                | Waiting for master to send event
Master_Host                   | 192.168.88.88
Master_User                   | repl
Master_Port                   | 3306
Connect_Retry                 | 60
Master_Log_File               | binlog.000014
Read_Master_Log_Pos           | 6195
Relay_Log_File                | node2-relay-bin.000005
Relay_Log_Pos                 | 357
Relay_Master_Log_File         | binlog.000014
Slave_IO_Running              | Yes
Slave_SQL_Running             | Yes
Replicate_Do_DB               |
Replicate_Ignore_DB           |
Replicate_Do_Table            |
Replicate_Ignore_Table        |
Replicate_Wild_Do_Table       |
Replicate_Wild_Ignore_Table   |
Last_Errno                    | 0
Last_Error                    |
Skip_Counter                  | 0
1 row in set
Time: 0.010s
mysql root@localhost:(none)>

skip_gtid_errors.sh 脚本

#!/bin/bash
# line V1.0
# mail:gczheng@139.com
# data:2018-08-30
# script_name:skip_gtid_errors.sh  

user=root
pass='iforgot'
host='localhost'

for((i=1;i<1000;i++))
do
#sql21=`mysql -u${user} -p${pass} -h${host} -e "show slave status\G;" 2>/dev/null|grep -v Last_Error | grep 'executing transaction'| awk '{print $21}'`

Errno=`mysql -u${user} -p${pass} -h${host} -e "show slave status\G;" 2>/dev/null|grep 'Last_SQL_Errno' | awk -F ":" '{print $2}'`

Seconds=`mysql -u${user} -p${pass} -h${host} -e "show slave status\G;" 2>/dev/null|grep 'Seconds_Behind_Master' | awk -F ":" '{print $2}'`

Last_seen_transaction=`mysql -u${user} -p${pass} -h${host} -Ne "select LAST_SEEN_TRANSACTION from performance_schema.replication_applier_status_by_worker" 2>/dev/null`

if [ $Errno -eq 1061 ] || [$Errno -eq 1062 ] || [$Errno -eq 1217 ] || [$Errno -eq 1050 ] 
   then
        echo "----------------------"
    echo $(date +'%Y-%m-%d %H:%M:%S')  && echo "Last_seen_transaction：${Last_seen_transaction}" &&  echo -e "Last_Errno：\033[31m ${Errno}\033[0m"
        mysql -u${user} -p${pass} -h${host} -e "STOP SLAVE;SET @@SESSION.GTID_NEXT='${Last_seen_transaction}';BEGIN; COMMIT;SET @@SESSION.GTID_NEXT = AUTOMATIC;START SLAVE;" 2>/dev/null 
    else
        sleep 3
    echo "----------------------" 
    echo  $(date +'%Y-%m-%d %H:%M:%S') && echo  "Seconds_Behind_Master: ${Seconds} " &&  echo -e "Last_Errno：\033[32m ${Errno}\033[0m"
fi
done

3.6.3. 测试复制

1.Master端插入一个测试数据
- **master节点*

mysql root@localhost:ttt> insert into t1 values(null);
Query OK, 1 row affected
Time: 0.007s
mysql root@localhost:ttt>

2.查看Slave2上的数据是否同步
- **slave节点*

mysql root@localhost:(none)> select * from ttt.t1;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |   -- 立即同步过来的数据
+---+
5 rows in set
Time: 0.011s
mysql root@localhost:(none)>

至此，GTID复制出错的处理就完成了。

注意：这里仅仅是 跳过错误 ，和原来的 sql_slave_skip_counter （该功能在GTID下失效）功能类似，无法保证主从数据是一致的（需要人工介入进行确认，比如仅仅主键一样，其他列不一样）

如果出现了很多的GTID的错误，可能是从机上有大量的操作，建议 重新搭建主从复制
但还是要在源头上避免此类情况的发生，确保在从机上开启 read_only=1 ，并且避免人工的误操作。最后强调： GTID 是 基于事务 的复制，一致性要求很高，强烈建议 在 Slave 上开启 read_only=1

3.7. 其他注意事项

reset slave all
使用该命令时 不会清空数据 ，仅仅是清空 show slave status\G里面的信息，所以在使用该命令之前，请 先记录show slave status\G的信息。
GTID
在开启GTID后，不能在一个事物中使用创建临时表的语句，需要使得autocommit=1; 才可以。
在开启GTID后，不能使用 create table select ...的语法来创建表了，因为这其实是多个事物了，GTID没法对应

四. 级联复制

4.1. 级联复制的介绍

slave B 从 master A 上复制， slave C 从 salve B 上复制，此时 slave B 上就要开启 log_slave_updates 产生binlog ，没有 binlog 则无法传递给
salve c 。

master 配置log_bin参数是当变更时产生binlog，复制关系中slave获取master的binlog写入relay-log中，slave自身不产生binlog

4.2. 级联复制的场景

4.2.1. 跨机房的复制

在跨机房搭建复制时，如果 master 挂了，深圳slave 提升为 New Master ，此时 北京slave 是不需要去做 CHANGE MASTER 操作的。

缺点是复制的延迟会更大（跨机房的延迟本来就很难避免）。

4.2.2. 库的拆分

当 生产库 压力很大时或者需要做统计分析时，需要把 逻辑库拆分出去，就可以使用 级联复制 ，让 逻辑库 形成单独的库。

4.3. 级联复制测试

配置信息	主库（master）	从库(slave A)	从库(slave B)	从库(slave C)
主机	node1.gczheng.com	node2.gczheng.com	node3.gczheng.com	proxy.gczheng.com
IP	192.168.88.88	192.168.88.99	192.168.88.100	192.168.88.77
Port	3306	3306	3306	3306
MySQL版本	MySQL5.7.18	MySQL5.7.18	MySQL5.7.18	MySQL5.7.18
Server_ID	8888	8899	88100	8877

Master --> Slave A --> Slave C 是级联复制，Master到Slave A 和 Master到Slave B 之前已部署完，直接做 Slave A到Slave C 的级联复制

1、在Master端新建一个DB名为 mts库 ，用于在 Slave3 上测试参数 replicate_do_db。
- master节点

(root@localhost) 14:54:28 [tablespace]> create database mts;
Query OK, 1 row affected (0.03 sec)

(root@localhost) 14:54:33 [tablespace]> use mts;
Database changed

(root@localhost) 14:55:34 [mts]> create table t(a int);
Query OK, 0 rows affected (0.07 sec)

(root@localhost) 14:55:38 [mts]> insert into t values(1),(2),(3);
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

(root@localhost) 14:56:10 [mts]> select * from mts.t;
+------+
| a    |
+------+
|    1 |
|    2 |
|    3 |
+------+
3 rows in set (0.00 sec)

(root@localhost) 14:56:22 [mts]>

2、在Slave C上增加如下参数

[mysqld]
report-host=192.168.88.77 
# 只复制mts的库 
replicate_do_db=mts

3、确认在slave A 已添加log_slave_updates，并建立通用账号'repl'@'192.168.88.%'

[mysqld]
log_slave_updates=1

4、change master
- Slave C 节点

(root@localhost) 16:19:58 [mts]> change master to master_host="192.168.88.99", master_port=3306, master_user='repl', master_password='123456', master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.02 sec)

(root@localhost) 16:20:31 [mts]> start slave;
Query OK, 0 rows affected (0.03 sec)

(root@localhost) 16:20:38 [mts]> show slave status \G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.88.99
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000035
          Read_Master_Log_Pos: 2707
               Relay_Log_File: proxy-relay-bin.000002
                Relay_Log_Pos: 405
        Relay_Master_Log_File: binlog.000035
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: mts
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 2707
              Relay_Log_Space: 612
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 8899
                  Master_UUID: 1951c7ee-c1fb-11e7-ac44-005056305232
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set: 1951c7ee-c1fb-11e7-ac44-005056305232:1-17,
9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-29:47-58
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)

ERROR:
No query specified

(root@localhost) 16:20:44 [mts]>

复制关系：Master-->Slave A-->Slave C ，且使用了基于GTID的复制；

五.多源复制

5.1. 多源复制的介绍

多源复制官方文档

多源复制 是指多主一从的复制，从 MySQL5.7.6 才有的功能，如下图所示：

在语法层面上，只是在原来的change master 的基础上，增加了 for channel 'channel_name'

5.2. 多源复制的演示

主机信息：

配置信息	主库（master A）	主库(master B)	从库(slave )
主机	node1.gczheng.com	node2.gczheng.com	node3.gczheng.com
IP	192.168.88.88	192.168.88.99	192.168.88.100
Port	3306	3306	3306
MySQL版本	MySQL5.7.18	MySQL5.7.18	MySQL5.7.18
Server_ID	8888	8899	88100

master A 和 master B 统一创建同步user `repl@'192.168.88.%'' password为123456

1. master准备数据
- master A 节点

mysql root@localhost:(none)> create database dbn1;
Query OK, 1 row affected
Time: 0.011s
】
mysql root@localhost:(none)> use dbn1
You are now connected to database "dbn1" as user "root"
Time: 0.007s

mysql root@localhost:dbn1> create table tn1(a int auto_increment primary key);
Query OK, 0 rows affected
Time: 0.062s

mysql root@localhost:dbn1> insert into dbn1.tn1 values(null);
Query OK, 1 row affected
Time: 0.068s

mysql root@localhost:dbn1> insert into dbn1.tn1 values(null);
Query OK, 1 row affected
Time: 0.014s

mysql root@localhost:dbn1> select * from dbn1.tn1;
+---+
| a |
+---+
| 1 |
| 2 |
+---+
2 rows in set
Time: 0.010s

mysql root@localhost:dbn1> show master status \G;
***************************[ 1. row ]***************************
File              | binlog.000014
Position          | 3953
Binlog_Do_DB      |
Binlog_Ignore_DB  |
Executed_Gtid_Set | 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-64 --执行到64
1 row in set
Time: 0.007s
mysql root@localhost:dbn1>

- **master B 节点**

mysql gcdb@localhost:(none)> create database dbn2;
Query OK, 1 row affected
Time: 0.010s

mysql gcdb@localhost:(none)> use dbn2;
You are now connected to database "dbn2" as user "gcdb"
Time: 0.015s

mysql gcdb@localhost:dbn2> create table tn2(a int auto_increment primary key);
Query OK, 0 rows affected
Time: 0.061s

mysql gcdb@localhost:dbn2> insert into dbn2.tn2 values(null);
Query OK, 1 row affected
Time: 0.015s

mysql gcdb@localhost:dbn2> insert into dbn2.tn2 values(null);
Query OK, 1 row affected
Time: 0.003s

mysql gcdb@localhost:dbn2> insert into dbn2.tn2 values(null);
Query OK, 1 row affected
Time: 0.003s

mysql gcdb@localhost:dbn2> insert into dbn2.tn2 values(null);
Query OK, 1 row affected
Time: 0.002s

mysql gcdb@localhost:dbn2> select * from  dbn2.tn2;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
| 4 |
+---+
4 rows in set
Time: 0.012s

mysql gcdb@localhost:dbn2> show master status \G;
***************************[ 1. row ]***************************
File              | binlog.000001
Position          | 1687
Binlog_Do_DB      |
Binlog_Ignore_DB  |
Executed_Gtid_Set | 1951c7ee-c1fb-11e7-ac44-005056305232:1-7    --执行到7
1 row in set
Time: 0.012s
mysql gcdb@localhost:dbn2>

2.mysqldump 备份导出导入
- Slave 节点

[root@node3 ~]# mysqldump -ugcdb -piforgot -h192.168.88.99 -P 3306 -B dbn2 >/tmp/dbn2.sql
[root@node3 ~]# mysqldump -ugcdb -piforgot -h192.168.88.88 -P 3306 -B dbn1 >/tmp/dbn1.sql

[root@node3 tmp]# less dbn2.sql

-----------省略其他输出------------

--在恢复备份的时候，会自动执行该语句，就不需要我们手工跳过了，如果是mydumper之类的，需要手工跳过

SET @@GLOBAL.GTID_PURGED='1951c7ee-c1fb-11e7-ac44-005056305232:1-7';

------------省略其他输出------------

[root@node3 tmp]# mysql -uroot -piforgot < /tmp/dbn2.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1840 (HY000) at line 24: @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.

--还是之前的问题，需要在Slave上，先操作一下 reset master ，以清空 @@GLOBAL.GTID_EXECUTED 

[root@node3 tmp]# mysql -uroot -piforgot < /tmp/dbn2.sql
mysql: [Warning] Using a password on the command line interface can be insecure.

[root@node3 tmp]# mysql -uroot -piforgot < /tmp/dbn1.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1840 (HY000) at line 24: @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.

[root@node3 tmp]# mysql -uroot -piforgot < /tmp/dbn1.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
[root@node3 tmp]#

3 change master

先在Slave上设置需要复制的库，如果不设置的话，默认会同步系统的库（mysql），这样可能会复制出错（因为master a和master b上的mysql库中可能有相同的记录）。

Slave 节点

[mysqld]

# 只复制dbn1 和 dbn2，这个步骤很重要，且有多个时，必须分多行写

replicate_do_db=dbn1 
replicate_do_db=dbn2

#mysql5.7多源复制必须添加的参数(不加报错)，5.7版本之前不用加

master_info_repository=TABLE
relay_log_info_repository=TABLE

(root@localhost) 13:57:46 [(none)]> change master to master_host='192.168.88.88', master_port=3306, master_user='repl', master_password='123456', master_auto_position=1 for channel 'ch1';
Query OK, 0 rows affected, 2 warnings (0.01 sec)

(root@localhost) 13:58:12 [(none)]> change master to master_host='192.168.88.99', master_port=3306, master_user='repl', master_password='123456', master_auto_position=1 for channel 'ch2';
Query OK, 0 rows affected, 2 warnings (0.01 sec)

(root@localhost) 13:59:19 [(none)]> start slave for channel 'ch1';
Query OK, 0 rows affected (0.00 sec)

(root@localhost) 13:59:59 [(none)]> start slave for channel 'ch2';
Query OK, 0 rows affected (0.00 sec)

(root@localhost) 14:39:13 [(none)]> show slave status for channel 'ch1' \G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.88.88
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000014
          Read_Master_Log_Pos: 4349
               Relay_Log_File: node3-relay-bin-ch1.000002
                Relay_Log_Pos: 405
        Relay_Master_Log_File: binlog.000014
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: dbn1,dbn2
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 4349
              Relay_Log_Space: 616
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 8888
                  Master_UUID: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-66
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: ch1
           Master_TLS_Version:
1 row in set (0.00 sec)

ERROR:
No query specified

(root@localhost) 14:41:49 [(none)]> show slave status for channel 'ch2' \G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.88.99
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000001
          Read_Master_Log_Pos: 2083
               Relay_Log_File: node3-relay-bin-ch2.000002
                Relay_Log_Pos: 801
        Relay_Master_Log_File: binlog.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: dbn1,dbn2
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 2083
              Relay_Log_Space: 1012
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 8899
                  Master_UUID: 1951c7ee-c1fb-11e7-ac44-005056305232
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 1951c7ee-c1fb-11e7-ac44-005056305232:8-9
            Executed_Gtid_Set: 9dc847d8-bf72-11e7-9ec4-000c2998e4f1:1-66
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: ch2
           Master_TLS_Version:
1 row in set (0.00 sec)

ERROR:
No query specified

(root@localhost) 14:41:54 [(none)]>

验证
- master A 节点

mysql root@localhost:(none)> insert into dbn1.tn1 values(null);
Query OK, 1 row affected
Time: 0.013s
mysql root@localhost:(none)> insert into dbn1.tn1 values(null);
Query OK, 1 row affected
Time: 0.012s
mysql root@localhost:(none)>

master B 节点

mysql root@localhost:(none)> insert into dbn2.tn2 values(null);
Query OK, 1 row affected
Time: 0.003s
mysql root@localhost:(none)> insert into dbn2.tn2 values(null);
Query OK, 1 row affected
Time: 0.003s
mysql root@localhost:(none)>

slave A 节点

(root@localhost) 14:41:54 [(none)]> select * from dbn1.tn1;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
| 4 |
+---+
4 rows in set (0.00 sec)

(root@localhost) 14:45:14 [(none)]> select * from dbn2.tn2;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
+---+
6 rows in set (0.00 sec)

(root@localhost) 14:45:17 [(none)]>

至此，多源复制的主从搭建就完成了。

5.3.使用场景

如果 Master B上也有一个 dbn1 的库，会有问题么？

如果不做额外的配置，是 会有错误 的；
如果配置了 slave_skip_errors = ddl_exist_errors ，且 没有重复数据 话，复制关系还是正常的。

这种操作可以起到 数据聚合的效果。将分库分表后的数据聚合在一起，以供其他应用进行分析（前提是数据不能有重复）。
但是最合适的场景还是将不同的库进行复制。
中间件的unique key

中间件可以保证分区键是唯一的（比如order_id），但是对于其他唯一索引来说，需要业务层去保证。

转载于:https://www.cnblogs.com/gczheng/p/8480210.html