About the author: Lao Su, a DBA with 10+ years of operations experience, specializing in Oracle, MySQL, PostgreSQL, and MongoDB administration (installation and migration, performance tuning, emergency troubleshooting, and more).
WeChat public account: 老苏畅谈运维
Follow the account for more articles like this one.
Background
In many cases, replication breaks because the data in one table or a few tables is inconsistent between master and slave, or you discover after building a slave that certain databases or tables were missed. If the data volume is large, rebuilding the slave from scratch takes a long time. Is there a way to re-initialize only the problematic tables or the missing databases and then resume replication?
There is indeed. For table-level recovery, the following approaches are all feasible:
- Lock the table on the master and move it to the slave via transportable tablespaces
- Lock the table on the master, export its data, and import it into the slave
- Use a backup and recovery tool to rebuild the affected tables on the slave and resume replication
The first two methods require the table to stay locked while its data is exported. To keep the impact on production as small as possible, this article uses the third method. (For comparison, a sketch of the dump-based second method follows below.)
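For reference only, here is a minimal sketch of the second approach, assuming the master/slave addresses and credentials from the environment section below; the dump file path is illustrative and this is not the method used in this article:

```bash
# Sketch of method 2: dump the affected tables on the master and
# reload them on the slave. --single-transaction takes a consistent
# InnoDB snapshot instead of holding a long table lock;
# --master-data=2 writes the master's binlog position into the dump
# as a comment, so replication can be repositioned afterwards.
mysqldump -h 192.168.6.13 -uroot -p'root@MySQL888' \
  --single-transaction --master-data=2 \
  sbtest sbtest1 sbtest2 > /tmp/sbtest_tables.sql

# Reload on the slave while replication is stopped.
mysql -h 192.168.6.14 -uroot -p'root@MySQL888' sbtest < /tmp/sbtest_tables.sql
```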
Environment
| IP | Role | Tables to re-initialize |
|---|---|---|
| 192.168.6.13 | Master | sbtest1 and sbtest2 in the sbtest database |
| 192.168.6.14 | Slave | sbtest1 and sbtest2 in the sbtest database |
Steps to re-initialize and recover the tables
Simulate a table inconsistency on the slave
Generate write load against the master with sysbench:
```bash
sysbench /usr/share/sysbench/oltp_read_write.lua --mysql-host=192.168.6.13 --mysql-port=3306 \
  --mysql-user=root --mysql-password='root@MySQL888' --mysql-db=sbtest --db-driver=mysql \
  --tables=10 --table-size=1000 --report-interval=10 --threads=4 --time=600 \
  --db-ps-mode=disable --max-requests=0 --percentile=95 run
```
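A note for anyone reproducing this: the run above assumes the ten sbtest tables already exist on the master. If starting from scratch, sysbench's prepare step creates and populates them; a minimal sketch reusing the same connection parameters:

```bash
# Assumed one-time setup (not shown in the original walkthrough):
# create and load the 10 sbtest tables the read/write run exercises.
sysbench /usr/share/sysbench/oltp_read_write.lua --mysql-host=192.168.6.13 --mysql-port=3306 \
  --mysql-user=root --mysql-password='root@MySQL888' --mysql-db=sbtest --db-driver=mysql \
  --tables=10 --table-size=1000 prepare
```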
While the load is running, drop tables sbtest1 and sbtest2 on the slave:
```sql
(root@localhost)[sbtest]20:00>show tables;
+------------------+
| Tables_in_sbtest |
+------------------+
| sbtest1          |
| sbtest10         |
| sbtest2          |
| sbtest3          |
| sbtest4          |
| sbtest5          |
| sbtest6          |
| sbtest7          |
| sbtest8          |
| sbtest9          |
+------------------+
10 rows in set (0.00 sec)

(root@localhost)[sbtest]20:01>drop table sbtest1;
Query OK, 0 rows affected (0.02 sec)

(root@localhost)[sbtest]20:01>drop table sbtest2;
Query OK, 0 rows affected (0.00 sec)
```
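As an aside, when the divergence is less obvious than a dropped table, comparing CHECKSUM TABLE output on both hosts is a quick first check. An illustrative sketch (not part of the original procedure), using the hosts and credentials above:

```bash
# Compare per-table checksums between master and slave. CHECKSUM TABLE
# is only meaningful once writes are quiesced or the slave has caught
# up; a nonexistent table is reported with a NULL checksum.
for host in 192.168.6.13 192.168.6.14; do
  echo "== $host =="
  mysql -h "$host" -uroot -p'root@MySQL888' \
    -e "CHECKSUM TABLE sbtest.sbtest1, sbtest.sbtest2;"
done
```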
Replication breaks on the slave
```sql
-- Replication status
(root@localhost)[sbtest]20:01>show slave status \G
*************************** 1. row ***************************
             Slave_IO_State: Waiting for source to send event
                Master_Host: 192.168.6.13
                Master_User: svr_slave
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: binlog.000038
        Read_Master_Log_Pos: 103341043
             Relay_Log_File: relay.000087
              Relay_Log_Pos: 8172423
      Relay_Master_Log_File: binlog.000038
           Slave_IO_Running: Yes
          Slave_SQL_Running: No
            Replicate_Do_DB:
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 1146
                 Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 3 failed executing transaction '86f14188-54fe-11ed-9f7a-0242c0a8060d:936591' at master log binlog.000038, end_log_pos 88970909. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
```
The slave's error log confirms that table sbtest.sbtest1 no longer exists:
```
2024-08-23T20:01:31.543732+08:00 114 [ERROR] [MY-010584] [Repl] Slave SQL for channel '': Worker 2 failed executing transaction '86f14188-54fe-11ed-9f7a-0242c0a8060d:936590' at master log binlog.000038, end_log_pos 88968604; Error executing row event: 'Table 'sbtest.sbtest1' doesn't exist', Error_code: MY-001146
2024-08-23T20:01:31.544719+08:00 115 [ERROR] [MY-010584] [Repl] Slave SQL for channel '': Worker 3 failed executing transaction '86f14188-54fe-11ed-9f7a-0242c0a8060d:936591' at master log binlog.000038, end_log_pos 88970909; Error executing row event: 'Table 'sbtest.sbtest1' doesn't exist', Error_code: MY-001146
2024-08-23T20:01:31.545923+08:00 112 [Warning] [MY-010584] [Repl] Slave SQL for channel '': ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: MY-001756
```
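The Last_Error text above points to performance_schema.replication_applier_status_by_worker for per-worker detail. A quick way to pull only the failed workers, run on the slave (columns as named in MySQL 8.0):

```bash
# List applier workers that have recorded an error; these columns
# exist in MySQL 8.0's performance_schema.
mysql -uroot -p -e "
  SELECT WORKER_ID, LAST_ERROR_NUMBER, LAST_ERROR_MESSAGE, LAST_ERROR_TIMESTAMP
    FROM performance_schema.replication_applier_status_by_worker
   WHERE LAST_ERROR_NUMBER <> 0\G"
```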
Stop replication on the slave
```sql
(root@localhost)[sbtest]20:03>stop slave;
Query OK, 0 rows affected, 1 warning (0.00 sec)

(root@localhost)[sbtest]20:03>show slave status \G
*************************** 1. row ***************************
             Slave_IO_State:
                Master_Host: 192.168.6.13
                Master_User: svr_slave
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: binlog.000038
        Read_Master_Log_Pos: 198568806
             Relay_Log_File: relay.000087
              Relay_Log_Pos: 8172423
      Relay_Master_Log_File: binlog.000038
           Slave_IO_Running: No
```