MySQL 主从复制的一个BUG

本文揭示了一个MySQL主从复制中的潜在问题,即从库在列不完全匹配的情况下,会忽略主库中不存在的列,导致数据不一致。在主库进行插入、更新或删除操作时,从库并未严格按照列名对应,而是根据列的顺序复制数据,当从库缺少列时,相应数据会被忽略,且不报错。这一现象可能导致难以察觉的数据错误,影响数据库的一致性和可靠性。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

发现一个主从复制的BUG,从库会自动跳过对主库来说不存在的列,看例子。
主库有一个sam表,其中id是自增主键:

mysql> desc sam;
+-------+---------+------+-----+---------+----------------+
| Field | Type    | Null | Key | Default | Extra          |
+-------+---------+------+-----+---------+----------------+
| id    | int(11) | NO   | PRI | NULL    | auto_increment |
| col1  | int(11) | YES  | MUL | NULL    |                |
| col2  | int(11) | YES  |     | NULL    |                |
+-------+---------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

从库也有一个sam表,但是我通过set sql_log_bin=0将主键列删除了,其余两列和主库保持一致,这种情况在生产环境也可能存在的,并不是纯粹的实验:

mysql> desc sam;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| col1  | int(11) | YES  |     | NULL    |       |
| col2  | int(11) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)

此时主从两个库都没有数据,那么我在主库用以下方式插入两条数据:

mysql> insert into sam(col1,col2) values (2,3),(3,4);
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from sam;
+----+------+------+
| id | col1 | col2 |
+----+------+------+
|  1 |    2 |    3 |
|  2 |    3 |    4 |
+----+------+------+
2 rows in set (0.00 sec)

这时看从库,竟然复制正常,而且sam表的数据看起来非常诡异:

*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 2706
               Relay_Log_File: relay-bin.000002
                Relay_Log_Pos: 2919
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 2706
              Relay_Log_Space: 3120
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
                  Master_UUID: 0bf4e28a-1d6f-11ea-9a02-080027e07ac6
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 0bf4e28a-1d6f-11ea-9a02-080027e07ac6:1-8
            Executed_Gtid_Set: 0bf4e28a-1d6f-11ea-9a02-080027e07ac6:1-8
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

mysql> select * from sam;
+------+------+
| col1 | col2 |
+------+------+
|    1 |    2 |
|    2 |    3 |
+------+------+
2 rows in set (0.00 sec)

发现什么规律?从库col1和col2的数据对应的是主库id和col1的数据,也就是从库将主库col2的数据主动忽略掉了。
这无疑是造成数据不一致的风险点,而且从库复制也没有报错,会让人无法察觉。

为什么会这样呢?在relay log可以看到主库传过来的binlog是下面这样的:

# at 2575
#200103 16:09:30 server id 1  end_log_pos 2427 CRC32 0x6d4d2e2e 	GTID	last_committed=7	sequence_number=8	rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= '0bf4e28a-1d6f-11ea-9a02-080027e07ac6:8'/*!*/;
# at 2640
#200103 16:09:30 server id 1  end_log_pos 2498 CRC32 0x1d72f9b2 	Query	thread_id=15	exec_time=0	error_code=0
SET TIMESTAMP=1578038970/*!*/;
BEGIN
/*!*/;
# at 2711
#200103 16:09:30 server id 1  end_log_pos 2567 CRC32 0x7bbaa2e7 	Rows_query
# insert into sam(col1,col2) values (2,3),(3,4)
# at 2780
#200103 16:09:30 server id 1  end_log_pos 2614 CRC32 0x10cd76fc 	Table_map: `sam`.`sam` mapped to number 111
# at 2827
#200103 16:09:30 server id 1  end_log_pos 2675 CRC32 0x3a06bc1f 	Write_rows: table id 111 flags: STMT_END_F

BINLOG '
uvYOXh0BAAAARQAAAAcKAACAAC1pbnNlcnQgaW50byBzYW0oY29sMSxjb2wyKSB2YWx1ZXMgKDIs
MyksKDMsNCnnorp7
uvYOXhMBAAAALwAAADYKAAAAAG8AAAAAAAEAA3NhbQADc2FtAAMDAwMABvx2zRA=
uvYOXh4BAAAAPQAAAHMKAAAAAG8AAAAAAAEAAgAD//gBAAAAAgAAAAMAAAD4AgAAAAMAAAAEAAAA
H7wGOg==
'/*!*/;
### INSERT INTO `sam`.`sam`
### SET
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2=2 /* INT meta=0 nullable=1 is_null=0 */
###   @3=3 /* INT meta=0 nullable=1 is_null=0 */
### INSERT INTO `sam`.`sam`
### SET
###   @1=2 /* INT meta=0 nullable=0 is_null=0 */
###   @2=3 /* INT meta=0 nullable=1 is_null=0 */
###   @3=4 /* INT meta=0 nullable=1 is_null=0 */
# at 2888
#200103 16:09:30 server id 1  end_log_pos 2706 CRC32 0x8816743b 	Xid = 123
COMMIT/*!*/;

也就是说,binlog里是用@1代表第一列,@2代表第二列,@3代表第三列,而不管列名是什么。
那么再看从库执行了relay log之后生成的binlog:

# at 2260
#200103 16:09:30 server id 1  end_log_pos 2325 CRC32 0x616ca58a 	GTID	last_committed=7	sequence_number=8	rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= '0bf4e28a-1d6f-11ea-9a02-080027e07ac6:8'/*!*/;
# at 2325
#200103 16:09:30 server id 1  end_log_pos 2388 CRC32 0xf0d1f61d 	Query	thread_id=15	exec_time=0	error_code=0
SET TIMESTAMP=1578038970/*!*/;
SET @@session.sql_mode=524288/*!*/;
BEGIN
/*!*/;
# at 2388
#200103 16:09:30 server id 1  end_log_pos 2457 CRC32 0x0b0b77ea 	Rows_query
# insert into sam(col1,col2) values (2,3),(3,4)
# at 2457
#200103 16:09:30 server id 1  end_log_pos 2503 CRC32 0xe7af7737 	Table_map: `sam`.`sam` mapped to number 117
# at 2503
#200103 16:09:30 server id 1  end_log_pos 2556 CRC32 0xbb65d2dd 	Write_rows: table id 117 flags: STMT_END_F

BINLOG '
uvYOXh0BAAAARQAAAJkJAACAAC1pbnNlcnQgaW50byBzYW0oY29sMSxjb2wyKSB2YWx1ZXMgKDIs
MyksKDMsNCnqdwsL
uvYOXhMBAAAALgAAAMcJAAAAAHUAAAAAAAEAA3NhbQADc2FtAAIDAwADN3ev5w==
uvYOXh4BAAAANQAAAPwJAAAAAHUAAAAAAAEAAgAC//wBAAAAAgAAAPwCAAAAAwAAAN3SZbs=
'/*!*/;
### INSERT INTO `sam`.`sam`
### SET
###   @1=1 /* INT meta=0 nullable=1 is_null=0 */
###   @2=2 /* INT meta=0 nullable=1 is_null=0 */
### INSERT INTO `sam`.`sam`
### SET
###   @1=2 /* INT meta=0 nullable=1 is_null=0 */
###   @2=3 /* INT meta=0 nullable=1 is_null=0 */
# at 2556
#200103 16:09:30 server id 1  end_log_pos 2587 CRC32 0x65411b02 	Xid = 173
COMMIT/*!*/;

可以看到从库的binlog里并没有@3,只是机械地将主库binlog的@1和@2往自己的第一列和第二列里插入,而不管列名是否对应。

那么这是否跟自增键有关系呢?如果id不是自增键会是怎样?我们也捏造一个测试实验。
主库一个sam2表,并没有显示的主键列和自增键:

mysql> desc sam2;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id    | int(11) | YES  |     | NULL    |       |
| col1  | int(11) | YES  |     | NULL    |       |
| col2  | int(11) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+
3 rows in set (0.00 sec)

从库也是一个sam2表,但少了id列:

mysql> desc sam2;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| col1  | int(11) | YES  |     | NULL    |       |
| col2  | int(11) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)

我们用以下方式在主库插入sam2的一条数据:

mysql> insert into sam2 values(1,2,3);
Query OK, 1 row affected (0.01 sec)

mysql> select * from sam2;
+------+------+------+
| id   | col1 | col2 |
+------+------+------+
|    1 |    2 |    3 |
+------+------+------+
1 row in set (0.00 sec)

看一下从库,依然没有报错,并且是主库sam2表前两列的数据:

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 3203
               Relay_Log_File: relay-bin.000002
                Relay_Log_Pos: 3416
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 3203
              Relay_Log_Space: 3617
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
                  Master_UUID: 0bf4e28a-1d6f-11ea-9a02-080027e07ac6
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 0bf4e28a-1d6f-11ea-9a02-080027e07ac6:1-10
            Executed_Gtid_Set: 0bf4e28a-1d6f-11ea-9a02-080027e07ac6:1-10
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

mysql> select * from sam2;
+------+------+
| col1 | col2 |
+------+------+
|    1 |    2 |
+------+------+
1 row in set (0.00 sec)

所以得知,从库复制insert操作,只是按照第一列第二列的顺序将主库的数据插入到从库的表,并不会依照列名一一对应,而且从库缺少了列,会直接忽略相关数据,而不报错。
我们也将从库的列改一下名字,看一下是否列名是否会影响复制:

mysql> alter table sam2 change col1 column1 int;
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> alter table sam2 change col2 column3 int;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> desc sam2;
+---------+---------+------+-----+---------+-------+
| Field   | Type    | Null | Key | Default | Extra |
+---------+---------+------+-----+---------+-------+
| column1 | int(11) | YES  |     | NULL    |       |
| column3 | int(11) | YES  |     | NULL    |       |
+---------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)

主库插入数据:

mysql> insert into sam2 values(2,3,4);
Query OK, 1 row affected (0.01 sec)

mysql> select * from sam2;
+------+------+------+
| id   | col1 | col2 |
+------+------+------+
|    1 |    2 |    3 |
|    2 |    3 |    4 |
+------+------+------+
2 rows in set (0.00 sec)

从库复制同样没有报错:

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 3520
               Relay_Log_File: relay-bin.000002
                Relay_Log_Pos: 3733
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 3520
              Relay_Log_Space: 3934
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
                  Master_UUID: 0bf4e28a-1d6f-11ea-9a02-080027e07ac6
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 0bf4e28a-1d6f-11ea-9a02-080027e07ac6:1-11
            Executed_Gtid_Set: 0bf4e28a-1d6f-11ea-9a02-080027e07ac6:1-11
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

mysql> select * from sam2;
+---------+---------+
| column1 | column3 |
+---------+---------+
|       1 |       2 |
|       2 |       3 |
+---------+---------+
2 rows in set (0.00 sec)

以上是insert的测试,那么对于update呢?
主库执行update:

mysql> update sam2 set id=3 where id=1 and col1=2 and col2=3;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select * from sam2;
+------+------+------+
| id   | col1 | col2 |
+------+------+------+
|    3 |    2 |    3 |
|    2 |    3 |    4 |
+------+------+------+
2 rows in set (0.00 sec)

从库依然没有报错:

mysql> select * from sam2;
+---------+---------+
| column1 | column3 |
+---------+---------+
|       3 |       2 |
|       2 |       3 |
+---------+---------+
2 rows in set (0.00 sec)

那么对于update,我们就可以拓展一下了,如果主库的id和col1是相同的数据,而col2是不同的数据,也就是整行数据是不一样,那么对于update在从库会是怎么样呢?例如
主库第二行和第三行的前两列是一样的:

mysql> select * from sam2;
+------+------+------+
| id   | col1 | col2 |
+------+------+------+
|    3 |    2 |    3 |
|    2 |    3 |    4 |
|    2 |    3 |    5 |
+------+------+------+
3 rows in set (0.00 sec)

自然从库的前两行也会是一样的:

+---------+---------+
| column1 | column3 |
+---------+---------+
|       3 |       2 |
|       2 |       3 |
|       2 |       3 |
+---------+---------+
3 rows in set (0.01 sec)

那么对于在主库执行update sam2 set col1=10 where id=2 and col1=3 and col2=4这样的语句,主库是只更新一行,但是从库会不会第二和第三行都更新呢?
答案是不会的,主库执行以后数据是:

mysql> select * from sam2;
+------+------+------+
| id   | col1 | col2 |
+------+------+------+
|    3 |    2 |    3 |
|    2 |   10 |    4 |
|    2 |    3 |    5 |
+------+------+------+
3 rows in set (0.00 sec)

从库复制后数据是:

mysql> select * from sam2;
+---------+---------+
| column1 | column3 |
+---------+---------+
|       3 |       2 |
|       2 |      10 |
|       2 |       3 |
+---------+---------+
3 rows in set (0.00 sec)

所以这时候从库复制还是遵从binlog的行格式,只更新一行数据。

经过测试,delete也不会报错。

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值