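Today the MySQL master database on one of our game servers went down. The system switched to read-only mode, booted into safe mode after a restart, and returned to normal once fsck had been run. After the server came back up, MySQL on the master started normally, but one of the slaves kept reporting a replication error.
After logging in to the slave, I found the following error:
mysql> show slave status\G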
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 10.90.13.238
Master_User: slave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000949
Read_Master_Log_Pos: 277562491
Relay_Log_File: mysql-relay-bin.001616
Relay_Log_Pos: 277562637
Relay_Master_Log_File: mysql-bin.000949
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 1
Exec_Master_Log_Pos: 277562491
Relay_Log_Space: 277562836
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'mysql-bin.000949' at 277562491, the last event read from './mysql-bin.000949' at 4, the last byte read from './mysql-bin.000949' at 4.'
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 4
1 row in set (0.00 sec)
The error is:
Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'mysql-bin.000949' at 277562491, the last event read from './mysql-bin.000949' at 4, the last byte read from './mysql-bin.000949' at 4.'
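The status check above can also be scripted. The following is a minimal sketch, not part of the original troubleshooting session: it parses the vertical output of `show slave status\G` into a dictionary and reports whether the I/O thread has stopped with error 1236. The function names `parse_status` and `io_thread_failed_1236` are made up for illustration, and the sample text is a subset of the output shown above.

```python
# Subset of the vertical SHOW SLAVE STATUS output from this post.
SAMPLE = """
*************************** 1. row ***************************
Slave_IO_State:
Master_Log_File: mysql-bin.000949
Read_Master_Log_Pos: 277562491
Slave_IO_Running: No
Slave_SQL_Running: Yes
Seconds_Behind_Master: NULL
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'mysql-bin.000949' at 277562491, the last event read from './mysql-bin.000949' at 4, the last byte read from './mysql-bin.000949' at 4.'
1 row in set (0.00 sec)
"""


def parse_status(text):
    """Turn 'Key: value' lines into a dict; lines without a colon are skipped."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the row separator and the row-count footer.
        if not line or line.startswith("*") or ":" not in line:
            continue
        # Split on the first colon only, because error messages contain colons.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def io_thread_failed_1236(fields):
    """True when the I/O thread is stopped and the last I/O error is 1236."""
    return (fields.get("Slave_IO_Running") != "Yes"
            and fields.get("Last_IO_Errno") == "1236")


status = parse_status(SAMPLE)
print(status["Slave_IO_Running"], status["Last_IO_Errno"])
print(io_thread_failed_1236(status))
```

Note that in this state the SQL thread still reports `Yes`, so a monitor that only looks at `Slave_SQL_Running` would miss the failure.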
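I had run into this error before but never wrote down the details, so I searched online for references.
The cause is simple. Before the master went down, the slave had been replicating continuously. When MySQL on the master restarts after the crash, it opens a new binlog and continues writing there, but the slave is unaware of this and keeps requesting the binlog file and position it had last read.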
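To make the explanation concrete, here is a small sketch that pulls the requested binlog file and position out of the error text and derives the name of the next binlog in sequence. It assumes the message wording shown in this post (other MySQL versions may phrase it differently), and the helper names `requested_position` and `next_binlog` are hypothetical. The derived file name is only a guess that the master rotated exactly once; it should always be confirmed with `show master status` on the master.

```python
import re

# Error text exactly as reported by the slave in this post.
ERROR = (
    "Got fatal error 1236 from master when reading data from binary log: "
    "'Client requested master to start replication from impossible position; "
    "the first event 'mysql-bin.000949' at 277562491, "
    "the last event read from './mysql-bin.000949' at 4, "
    "the last byte read from './mysql-bin.000949' at 4.'"
)


def requested_position(message):
    """Return (binlog file, position) the slave asked for, or None if not found."""
    match = re.search(r"the first event '([^']+)' at (\d+)", message)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def next_binlog(name):
    """Increment the numeric suffix, keeping its zero padding."""
    base, _, seq = name.rpartition(".")
    return f"{base}.{int(seq) + 1:0{len(seq)}d}"


log_file, log_pos = requested_position(ERROR)
print(log_file, log_pos)
print(next_binlog(log_file))
```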
To confirm this, I did the following:
1. Check the master's position:
mysql> show master status\G
*************************** 1. row ***************************
File: mysql-bin.000950
Position: 336492640
Binlog_Do_DB:
Binlog_Ignore_DB:
1 row in set (0.00 sec)
2. Check the size and last-modified time of the binlogs on the master:
[root@d1 ~]# ll /data/mysql/mysql-bin.*
-rw-rw---- 1 mysql mysql 1073742473 Nov 17 10:38 /data/mysql/mysql-bin.000944
-rw-rw---- 1 mysql mysql 1073742022 Nov 18 12:44 /data/mysql/mysql-bin.000945
-rw-rw---- 1 mysql mysql 1073745576 Nov 19 15:31 /data/mysql/mysql-bin.000946
-rw-rw---- 1 mysql mysql 1073745324 Nov 21 05:03 /data/mysql/mysql-bin.000947
-rw-rw---- 1 mysql mysql 1073742027 Nov 22 16:09 /data/mysql/mysql-bin.000948
-rw-rw---- 1 mysql mysql 277553623 Nov 23 05:07 /data/mysql/mysql-bin.000949
-rw-rw---- 1 mysql mysql 337157571 Nov 23 18:04 /data/mysql/mysql-bin.000950
-rw-rw---- 1 mysql mysql 133 Nov 23 08:06 /data/mysql/mysql-bin.index
[root@d1 ~]# du /data/mysql/mysql-bin.* -sh
1.1G /data/mysql/mysql-bin.000944
1.1G /data/mysql/mysql-bin.000945
1.1G /data/mysql/mysql-bin.000946
1.1G /data/mysql/mysql-bin.000947
1.1G /data/mysql/mysql-bin.000948
265M /data/mysql/mysql-bin.000949
323M /data/mysql/mysql-bin.000950
4.0K /data/mysql/mysql-bin.index
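This shows that 000949 is the last file MySQL was writing when the system crashed, and that a new file, 000950, was created after recovery. Judging by the timestamps and sizes, mysql-bin.000949 would normally have grown to about 1.1G before MySQL rotated to a new file. Here the server crash cut the binlog short, so MySQL created a new binlog file when it started up again. Note also that mysql-bin.000949 on the master is only 277553623 bytes, which is smaller than the position 277562491 the slave is asking for. That is why the master calls it an "impossible position".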
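The size comparison can be written down as a short check. This is an illustrative sketch using the numbers from the listings above: the position the slave requested in mysql-bin.000949 against the size of that file on the master. The function name `position_is_impossible` is hypothetical.

```python
def position_is_impossible(requested_pos, binlog_size):
    """A position past the end of the file cannot be served by the master.

    A position equal to the file size is still valid: it means the slave
    has read everything and is waiting at the end of the file.
    """
    return requested_pos > binlog_size


REQUESTED_POS = 277562491   # Read_Master_Log_Pos reported by the slave
SIZE_ON_MASTER = 277553623  # size of mysql-bin.000949 from the ll listing

print(position_is_impossible(REQUESTED_POS, SIZE_ON_MASTER))
# Number of bytes the slave is "ahead" of the file that survived the crash.
print(REQUESTED_POS - SIZE_ON_MASTER)
```

On a live master the size could be read with `os.path.getsize` on the binlog path instead of being typed in by hand.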
3. Run the following commands on the slave:
mysql> stop slave
-> ;
Query OK, 0 rows affected (0.00 sec)
mysql> change master to master_host='10.90.13.238', master_user='slave' ,MASTER_PASSWORD='',MASTER_LOG_FILE='mysql-bin.000950',MASTER_LOG_POS=4;
Query OK, 0 rows affected (0.09 sec)
This points the slave at the new binlog and its initial position. Then start the slave:
mysql> start slave;
The slave now starts normally:
mysql> show slave status\G
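For repeated use, the statement can be generated rather than typed. The sketch below is an assumption-laden helper, not the author's tooling: `change_master_sql` is a made-up name, it only sets the log coordinates (options left out of `CHANGE MASTER TO` keep their current values), and it defaults to position 4, the offset of the first event after the 4-byte header at the start of every binlog file.

```python
import re


def change_master_sql(log_file, log_pos=4):
    """Build a CHANGE MASTER TO statement for the given binlog coordinates."""
    # Accept only plain file names, so nothing unexpected ends up in the SQL.
    if not re.fullmatch(r"[\w.-]+", log_file):
        raise ValueError(f"unexpected binlog name: {log_file!r}")
    return (
        f"CHANGE MASTER TO MASTER_LOG_FILE='{log_file}', "
        f"MASTER_LOG_POS={int(log_pos)};"
    )


print("STOP SLAVE;")
print(change_master_sql("mysql-bin.000950"))
print("START SLAVE;")
```

The generated statements still need to be run on the slave, with the replication threads stopped first.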
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.90.13.238
Master_User: slave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000950
Read_Master_Log_Pos: 336968550
Relay_Log_File: mysql-relay-bin.000002
Relay_Log_Pos: 52752780
Relay_Master_Log_File: mysql-bin.000950
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 52752634
Relay_Log_Space: 336968852
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 31164
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 4
1 row in set (0.00 sec)
mysql>
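As a final check, the recovered state can be summarised from a few fields of the output above. This sketch takes the fields as a plain dictionary (values copied from the second `show slave status\G` in this post) and uses the hypothetical names `replication_healthy` and `lag_hours`. It only reflects what the status reports; it does not verify that the data on master and slave match.

```python
# Fields copied from the status output after the fix.
AFTER_FIX = {
    "Slave_IO_Running": "Yes",
    "Slave_SQL_Running": "Yes",
    "Last_IO_Errno": "0",
    "Last_SQL_Errno": "0",
    "Seconds_Behind_Master": "31164",
}


def replication_healthy(status):
    """Both threads running and no I/O or SQL error recorded."""
    return (status.get("Slave_IO_Running") == "Yes"
            and status.get("Slave_SQL_Running") == "Yes"
            and status.get("Last_IO_Errno") == "0"
            and status.get("Last_SQL_Errno") == "0")


def lag_hours(status):
    """Seconds_Behind_Master in hours, or None when the value is NULL/missing."""
    lag = status.get("Seconds_Behind_Master")
    if lag is None or lag == "NULL":
        return None
    return int(lag) / 3600


print(replication_healthy(AFTER_FIX))
print(round(lag_hours(AFTER_FIX), 1))
```

The lag works out to roughly 8.7 hours, so the slave is running again but still has a backlog to replay before it is caught up.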
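This post described a common MySQL replication error, 1236, and how to resolve it. Adjusting the slave's replication starting point fixed the synchronization failure caused by the master's unexpected restart.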