Semisynch: Last_IO_Error: Fatal error: Failed to run 'after_queue_event' hook

解决了使用半同步复制时,从库向主库发送响应过程中出现错误导致从库I/O线程停止的问题。通过忽略发送响应返回值,确保即使发生错误也能避免I/O线程停止。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

Description:
When executing a replication workload with semisynch replication turned off and periodic
STOP SLAVE IO_THREAD thread, START SLAVE IO_THREAD (in order to simulate network
failures), the slave stopped with the following information in SHOW SLAVE STATUS:

mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 127.0.0.1
Master_User: root
Master_Port: 19306
Connect_Retry: 1
Master_Log_File: master-bin.000001
Read_Master_Log_Pos: 1900343
Relay_Log_File: slave-relay-bin.000003
Relay_Log_Pos: 1283333
Relay_Master_Log_File: master-bin.000001
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1900343
Relay_Log_Space: 1900793
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1593
Last_IO_Error: Fatal error: Failed to run 'after_queue_event' hook
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
1 row in set (0.00 sec)

The slave error log contains:

090630 13:58:12 [ERROR] Set 'rpl_semi_sync_master_reply_log_file_pos' on master failed
090630 13:58:12 [ERROR] Slave I/O: Fatal error: Failed to run 'after_queue_event' hook,
Error_code: 1593
090630 13:58:12 [Note] Slave I/O thread exiting, read up to log 'master-bin.000001',
position 1900343

which implies that the master is at fault, however the master error log does not contain
any information about what happened.

Also, the error is reported as "Fatal", but START SLAVE IO_THREAD is able to reestablish
replication and Rpl_semi_sync_master_status converts back to ON, so how "Fatal" the error
was remains an open question.

How to repeat:
With the Random Query Generator from the mysql-test-extra-6.0 tree, mysql-test/gentest,
run:

$ perl runall.pl \
--basedir=/build/bzr/azalea-bugfixing \
--grammar=conf/replication_simple.yy \
--rpl_mode=statement \
--mem \
--mysqld=--plugin-dir=/build/bzr/azalea-bugfixing/plugin/semisync/.libs \

--mysqld=--plugin-load=rpl_semi_sync_master=libsemisync_master.so:rpl_semi_sync_slave=libsemisync_slave.so
\
--mysqld=--rpl_semi_sync_master_enabled=1 \
--mysqld=--rpl_semi_sync_slave_enabled=1 \
--duration=1800 \
--queries=100000000 \
--threads=10 \
--validator=ReplicationSlaveStatus \
--reporter=ReplicationSemiSync

The ReplicationSemiSync reporter causes the slave to stop for random periods of time and
then attempt to reestablish semisynchronous replication. The ReplicationSlaveStatus
monitors SHOW SLAVE STATUS and will terminate the test as soon as the error occurs,
usually within a minute (a few START|STOP SLAVE IO THREAD cycles).

Suggested fix:
Semisynchronous replication should survive IO thread issues without throwing error
messages.

The actual cause of each failure must be present if not in SHOW SLAVE STATUS, then at
least in the error log. If the slave error log talks about a failure on the master, the
error log on the master needs to contain an entry and debug information for that failure.

[30 Jun 2009 13:30] Zhenxing He

The error is on slave, not on master, maybe the error message 'Set
'rpl_semi_sync_master_reply_log_file_pos' on master failed' is misleading, better say
'Send semi-sync reply to master failed'. This error is mostly caused by connection
problem.

Fatal here means that the slave IO thread has to stop because of the error, but it's
possible to restart slave IO thread.

[30 Jun 2009 13:51] Philip Stoev

The test uses only STOP SLAVE IO_THREAD and it does not attempt to disrupt the connection
between the master and the slave in any other way. Can we have extra debugging
information in case this error happens so that this can be debugged further?

[5 Sep 2009 8:23] Bugs System

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

http://lists.mysql.com/commits/82521

2802 He Zhenxing 2009-09-05
Bug#45852 Semisynch: Last_IO_Error: Fatal error: Failed to run 'after_queue_event'
hook

Errors when send reply to master should never cause the IO thread
to stop, because master can fall back to async replication if it
does not get reply from slave.

The problem is fixed by deliberately ignore the return value of
slaveReply.
@ plugin/semisync/semisync_slave_plugin.cc
Deliberately ignore the return value of slaveReply so that errors
while sending slave reply will not cause the IO thread to stop.

[8 Sep 2009 4:41] Bugs System

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

http://lists.mysql.com/commits/82622

2802 He Zhenxing 2009-09-08
Bug#45852 Semisynch: Last_IO_Error: Fatal error: Failed to run 'after_queue_event'
hook

Errors when send reply to master should never cause the IO thread
to stop, because master can fall back to async replication if it
does not get reply from slave.

The problem is fixed by deliberately ignore the return value of
slaveReply.
@ plugin/semisync/semisync_slave.cc
Change the error message when send reply failed
@ plugin/semisync/semisync_slave_plugin.cc
Deliberately ignore the return value of slaveReply so that errors
while sending slave reply will not cause the IO thread to stop.

[8 Sep 2009 6:15] Bugs System

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

http://lists.mysql.com/commits/82625

2802 He Zhenxing 2009-09-08
Bug#45852 Semisynch: Last_IO_Error: Fatal error: Failed to run 'after_queue_event'
hook

Errors when send reply to master should never cause the IO thread
to stop, because master can fall back to async replication if it
does not get reply from slave.

The problem is fixed by deliberately ignoring the return value of
slaveReply.
@ plugin/semisync/semisync_slave.cc
Change the error message when send reply failed
@ plugin/semisync/semisync_slave_plugin.cc
Deliberately ignore the return value of slaveReply so that errors
while sending slave reply will not cause the IO thread to stop.

[26 Sep 2009 10:32] Bugs System

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

http://lists.mysql.com/commits/84715

2837 He Zhenxing 2009-09-26 [merge]
Merge BUG#45852 to 6.0-codebase-bugfixing

[30 Sep 2009 3:58] Bugs System

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

http://lists.mysql.com/commits/85120

3627 He Zhenxing 2009-09-30
Bug#45852 Semisynch: Last_IO_Error: Fatal error: Failed to run 'after_queue_event'
hook

Errors when send reply to master should never cause the IO thread
to stop, because master can fall back to async replication if it
does not get reply from slave.

The problem is fixed by deliberately ignoring the return value of
slaveReply.
@ plugin/semisync/semisync_slave_plugin.cc
Deliberately ignore the return value of slaveReply so that errors
while sending slave reply will not cause the IO thread to stop.

[9 Oct 2009 8:47] Bugs System

Pushed into 6.0.14-alpha (revid:alik@ibmvm-20091009083208-0o0f0i9w1sq3c1kn) (version
source revid:zhenxing.he@sun.com-20090930035829-z5k0sbc0s6pjx30e) (merge vers:
6.0.14-alpha) (pib:12)

[9 Oct 2009 10:43] Jon Stephens

Documented bugfix in the 6.0.14 changelog as follows:

When using semi-synchronous replication, an error raised on the
slave while sending a response to the master caused the slave IO
thread to stop.

Set status to NDI, waiting for push to 5.4.

[12 Oct 2009 12:57] Bugs System

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

http://lists.mysql.com/commits/86574

3115 He Zhenxing 2009-10-12
Backport Bug#45852 Semisynch: Last_IO_Error: Fatal error: Failed to run
'after_queue_event' hook

Errors when send reply to master should never cause the IO thread
to stop, because master can fall back to async replication if it
does not get reply from slave.

The problem is fixed by deliberately ignoring the return value of
slaveReply.

[27 Oct 2009 9:49] Bugs System

Pushed into 6.0.14-alpha (revid:alik@sun.com-20091027094604-9p7kplu1vd2cvcju) (version
source revid:zhenxing.he@sun.com-20091026140226-uhnqejkyqx1aeilc) (merge vers:
6.0.14-alpha) (pib:13)

[28 Oct 2009 6:37] Jon Stephens

Already documented for 6.0.14; re-closing.

[12 Nov 2009 8:18] Bugs System

Pushed into 5.5.0-beta (revid:alik@sun.com-20091110093229-0bh5hix780cyeicl) (version
source revid:alik@sun.com-20091027095744-rf45u3x3q5d1f5y0) (merge vers: 5.5.0-beta)
(pib:13)

[12 Nov 2009 14:27] Jon Stephens

Also documented in the 5.5.0 changelog; closed.

[18 Dec 2009 15:41] Paul DuBois

Removed 5.5.0 changelog entry. In 5.5, semisync replication first appears in 5.5.0, so
this bug affects no 5.5.x releases.
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值