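What is the Retry queue
When a message fails to be consumed, the consumer sends it back to the Retry queue, where it waits to be retried.
The Retry queue is named %RETRY%GROUP_NAME, so every consumer group has its own independent Retry queue.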
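The naming rule can be illustrated with a few lines of Java. This is a minimal sketch: the class and method names (RetryTopicNames, retryTopicFor, isRetryTopic) are hypothetical helpers written for this article, not RocketMQ classes. In the client itself the name is produced by MixAll.getRetryTopic, which appears in the subscription code later on.

```java
// Hypothetical helper that mirrors the %RETRY%GROUP_NAME naming rule.
final class RetryTopicNames {
    // Prefix taken from the naming rule described in the text.
    static final String RETRY_PREFIX = "%RETRY%";

    private RetryTopicNames() {
    }

    // Builds the retry topic name for a consumer group.
    static String retryTopicFor(String consumerGroup) {
        return RETRY_PREFIX + consumerGroup;
    }

    // Tells whether a topic name follows the retry naming rule.
    static boolean isRetryTopic(String topic) {
        return topic != null && topic.startsWith(RETRY_PREFIX);
    }
}
```

Because the group name is part of the topic name, two consumer groups that fail on the same message never share a retry backlog.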
Producing logic
RequestCode: CONSUMER_SEND_MSG_BACK
[Figure: overall flow diagram]

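Client side
Concurrently mode
The message is sent to the Retry queue in the following cases:
- Consumption fails and the listener returns RECONSUME_LATER, and the consumer is in Clustering mode
- Locally timed-out messages are cleaned up (messages already pulled and cached in memory)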
public void processConsumeResult(
    final ConsumeConcurrentlyStatus status,
    final ConsumeConcurrentlyContext context,
    final ConsumeRequest consumeRequest
) {
    // Excerpt: the code that derives ackIndex from context.getAckIndex() and status is omitted here.
    // Messages at indexes greater than ackIndex are treated as failed.
    switch (this.defaultMQPushConsumer.getMessageModel()) {
        case BROADCASTING:
            // Broadcasting: failed messages are only logged and then dropped
            for (int i = ackIndex + 1; i < consumeRequest.getMsgs().size(); i++) {
                MessageExt msg = consumeRequest.getMsgs().get(i);
                log.warn("BROADCASTING, the message consume failed, drop it, {}", msg.toString());
            }
            break;
        case CLUSTERING:
            List<MessageExt> msgBackFailed = new ArrayList<MessageExt>(consumeRequest.getMsgs().size());
            for (int i = ackIndex + 1; i < consumeRequest.getMsgs().size(); i++) {
                MessageExt msg = consumeRequest.getMsgs().get(i);
                // Send the failed message to the retry topic
                boolean result = this.sendMessageBack(msg, context);
                if (!result) {
                    msg.setReconsumeTimes(msg.getReconsumeTimes() + 1);
                    msgBackFailed.add(msg);
                }
            }
            // If sending back failed, the messages are consumed again locally later
            if (!msgBackFailed.isEmpty()) {
                consumeRequest.getMsgs().removeAll(msgBackFailed);
                this.submitConsumeRequestLater(msgBackFailed, consumeRequest.getProcessQueue(), consumeRequest.getMessageQueue());
            }
            break;
        default:
            break;
    }
}
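The loops above start at ackIndex + 1, so which messages get sent back depends entirely on how ackIndex is derived. The sketch below models that step in isolation. It assumes, based on my reading of the RocketMQ client source rather than on the excerpt shown here, that RECONSUME_LATER forces ackIndex to -1 and that on CONSUME_SUCCESS an oversized ackIndex is clamped to the last index of the batch. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of how ackIndex selects the messages to send back.
final class AckIndexSketch {
    enum Status { CONSUME_SUCCESS, RECONSUME_LATER }

    private AckIndexSketch() {
    }

    // Assumption: RECONSUME_LATER acknowledges nothing (-1), while
    // CONSUME_SUCCESS clamps the requested index to the end of the batch.
    static int effectiveAckIndex(Status status, int requestedAckIndex, int batchSize) {
        if (status == Status.RECONSUME_LATER) {
            return -1;
        }
        return Math.min(requestedAckIndex, batchSize - 1);
    }

    // Same loop bounds as the excerpt: everything after ackIndex is a failure.
    static List<Integer> indexesToSendBack(Status status, int requestedAckIndex, int batchSize) {
        int ackIndex = effectiveAckIndex(status, requestedAckIndex, batchSize);
        List<Integer> failed = new ArrayList<>();
        for (int i = ackIndex + 1; i < batchSize; i++) {
            failed.add(i);
        }
        return failed;
    }
}
```

Under these assumptions, returning RECONSUME_LATER for a batch of three sends all three messages back, while returning CONSUME_SUCCESS with an ackIndex of 0 sends back only the second and third.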
Question
Why does Broadcasting mode simply drop the message without even one retry? That seems rather heavy-handed.
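Orderly mode
When the listener returns SUSPEND_CURRENT_QUEUE_A_MOMENT, the client first retries locally. Only after the maximum number of consumption attempts is exceeded does it send the message to the Retry queue.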
processConsumeResult
case SUSPEND_CURRENT_QUEUE_A_MOMENT:
    this.getConsumerStatsManager().incConsumeFailedTPS(consumerGroup, consumeRequest.getMessageQueue().getTopic(), msgs.size());
    if (checkReconsumeTimes(msgs)) {
        // Still needs a local retry: put the messages back and resubmit after a pause
        consumeRequest.getProcessQueue().makeMessageToCosumeAgain(msgs);
        this.submitConsumeRequestLater(
            consumeRequest.getProcessQueue(),
            consumeRequest.getMessageQueue(),
            context.getSuspendCurrentQueueTimeMillis());
        continueConsume = false;
    } else {
        // Every message was handed to the broker, so the offset can move forward
        commitOffset = consumeRequest.getProcessQueue().commit();
    }
    break;
private boolean checkReconsumeTimes(List<MessageExt> msgs) {
    boolean suspend = false;
    if (msgs != null && !msgs.isEmpty()) {
        for (MessageExt msg : msgs) {
            if (msg.getReconsumeTimes() >= getMaxReconsumeTimes()) {
                MessageAccessor.setReconsumeTime(msg, String.valueOf(msg.getReconsumeTimes()));
                // Maximum retry count reached: try to send the message to the RetryTopic
                if (!sendMessageBack(msg)) {
                    // Sending back failed, so keep retrying locally
                    suspend = true;
                    msg.setReconsumeTimes(msg.getReconsumeTimes() + 1);
                }
            } else {
                // Below the maximum: retry locally
                suspend = true;
                msg.setReconsumeTimes(msg.getReconsumeTimes() + 1);
            }
        }
    }
    return suspend;
}
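The decision made by checkReconsumeTimes can be reduced to a small model that runs without a broker. This is a sketch only: OrderlyRetrySketch and shouldSuspend are hypothetical names, retry counts stand in for messages, and the sendBack predicate stands in for sendMessageBack (it receives the message's index and reports whether the send succeeded).

```java
import java.util.function.IntPredicate;

// Hypothetical model of the suspend decision in Orderly mode.
final class OrderlyRetrySketch {
    private OrderlyRetrySketch() {
    }

    // Returns true when the queue must stay suspended for a local retry.
    // reconsumeTimes[i] is the retry count of message i.
    static boolean shouldSuspend(int[] reconsumeTimes, int maxReconsumeTimes, IntPredicate sendBack) {
        boolean suspend = false;
        for (int i = 0; i < reconsumeTimes.length; i++) {
            if (reconsumeTimes[i] >= maxReconsumeTimes) {
                // Over the limit: hand the message to the broker, and
                // fall back to a local retry only if that send fails.
                if (!sendBack.test(i)) {
                    suspend = true;
                }
            } else {
                // Under the limit: always retry locally.
                suspend = true;
            }
        }
        return suspend;
    }
}
```

The model makes one property easy to see: a single message that is still under the limit keeps the whole batch in local retry, which preserves ordering at the cost of blocking the queue.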
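Direct send
When the client's call to MQClientAPIImpl#consumerSendMessageBack (RequestCode=CONSUMER_SEND_MSG_BACK) throws an exception, the client builds a new message itself and sends it straight to the RetryTopic (in this case RequestCode=SEND_MESSAGE).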
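The fallback is a plain try-then-fall-back pattern, sketched below without any RocketMQ dependency. Everything here is hypothetical scaffolding: the two functional interfaces stand in for the send-back RPC and for an ordinary producer send, and the returned string only labels which path was taken.

```java
// Hypothetical model of the send-back call with its direct-send fallback.
final class SendBackFallbackSketch {
    // Stand-in for the CONSUMER_SEND_MSG_BACK request.
    interface SendBackRpc {
        void call(String body);
    }

    // Stand-in for an ordinary SEND_MESSAGE to a named topic.
    interface TopicSender {
        void send(String topic, String body);
    }

    private SendBackFallbackSketch() {
    }

    // Tries the send-back RPC first; on any runtime failure, sends the
    // message directly to the group's retry topic instead.
    static String sendBack(String consumerGroup, String body, SendBackRpc rpc, TopicSender sender) {
        try {
            rpc.call(body);
            return "CONSUMER_SEND_MSG_BACK";
        } catch (RuntimeException e) {
            sender.send("%RETRY%" + consumerGroup, body);
            return "SEND_MESSAGE";
        }
    }
}
```

If the fallback send also throws, the exception propagates to the caller, which in the Concurrently excerpt corresponds to sendMessageBack returning false and the message being retried locally.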
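Store logic
On receiving a CONSUMER_SEND_MSG_BACK request, the Broker creates the RetryTopic if it does not exist yet (1 queue by default) and checks whether the message has exceeded the maximum number of retries. If it has, the message is put into the DLQ. If not, the message is written to SCHEDULE_XXXX (the delay topic, named SCHEDULE_TOPIC_XXXX in the RocketMQ source) with a default delayLevel of ReConsumeTimes+3, so the minimum level is 3, which is 10 seconds with the default delay levels.
In other words, the message is not sent to the Retry queue directly but is routed through SCHEDULE_XXXX first. The reasoning is that the failure may have been caused by limited capacity on the consumer side, so delaying the retry away from the peak raises the chance that it succeeds.
See SendMessageProcessor#consumerSendMsgBack for details.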
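The routing rule and the delay arithmetic can be worked through in a short sketch. Several things here are assumptions not stated in the text above: the delay table is the broker's default messageDelayLevel setting as I know it, the DLQ topic is assumed to be named with a %DLQ% prefix plus the group name, and the maximum of 16 retries used in the usage note is an assumed default. The class itself is hypothetical.

```java
// Hypothetical model of the broker's DLQ-or-delay decision.
final class RetryRoutingSketch {
    // Assumption: the broker's default delay table, level 1 first.
    static final String[] DEFAULT_DELAYS =
        "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h".split(" ");

    private RetryRoutingSketch() {
    }

    // delayLevel = ReConsumeTimes + 3, as described in the text.
    static int delayLevel(int reconsumeTimes) {
        return 3 + reconsumeTimes;
    }

    // Looks up the delay for a 1-based level in the assumed default table.
    static String delayFor(int level) {
        if (level < 1 || level > DEFAULT_DELAYS.length) {
            throw new IllegalArgumentException("no such delay level: " + level);
        }
        return DEFAULT_DELAYS[level - 1];
    }

    // Chooses the topic the broker writes to. "%DLQ%" is an assumed prefix.
    static String targetTopic(String consumerGroup, int reconsumeTimes, int maxReconsumeTimes) {
        if (reconsumeTimes >= maxReconsumeTimes) {
            return "%DLQ%" + consumerGroup;
        }
        return "SCHEDULE_XXXX";
    }
}
```

With a maximum of 16 retries, the first failure (ReConsumeTimes=0) maps to level 3 and waits 10s, and the last permitted retry (ReConsumeTimes=15) maps to level 18, the end of the assumed table. The next failure goes to the DLQ.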
Consuming logic
When a consumer starts in Clustering mode, it also subscribes to the %RETRY%GROUP_NAME topic.
private void copySubscription() throws MQClientException {
    try {
        Map<String, String> sub = this.defaultMQPushConsumer.getSubscription();
        switch (this.defaultMQPushConsumer.getMessageModel()) {
            case BROADCASTING:
                // Broadcasting mode does not subscribe to a retry topic
                break;
            case CLUSTERING:
                // Subscribe to the group's Retry queue
                final String retryTopic = MixAll.getRetryTopic(this.defaultMQPushConsumer.getConsumerGroup());
                SubscriptionData subscriptionData = FilterAPI.buildSubscriptionData(this.defaultMQPushConsumer.getConsumerGroup(),
                    retryTopic, SubscriptionData.SUB_ALL);
                this.rebalanceImpl.getSubscriptionInner().put(retryTopic, subscriptionData);
                break;
            default:
                break;
        }
    } catch (Exception e) {
        throw new MQClientException("subscription exception", e);
    }
}
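Once the delay that corresponds to the delayLevel has elapsed, the broker delivers the message to the Retry queue, and the consumer then consumes it again like any other message.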
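Optimizations
- In Broadcasting mode the message should not be dropped outright. It should at least be retried locally a few times.
- sendMsgBack resends the whole message to the broker. If message bodies are large and there are many such retries, this raises bandwidth cost. It could instead send only key fields such as the msgId and let the broker look the message up locally and forward it to the Retry queue. Note that this concern applies mainly to the direct-send fallback, where the full message is rebuilt and sent. The CONSUMER_SEND_MSG_BACK request header appears to carry the message's commit log offset rather than its body, so on that path the broker can already look the message up locally.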
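The first suggestion, retrying locally before giving up, could look roughly like the following. This is a sketch of the idea, not something RocketMQ provides: LocalRetrySketch and consumeWithLocalRetry are hypothetical, and the handler is modeled as a predicate that returns true on success.

```java
import java.util.function.Predicate;

// Hypothetical local retry wrapper for a consumer with no Retry queue.
final class LocalRetrySketch {
    private LocalRetrySketch() {
    }

    // Calls the handler up to maxAttempts times and reports whether any
    // attempt succeeded. A thrown runtime exception counts as a failure.
    static <T> boolean consumeWithLocalRetry(T message, Predicate<T> handler, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (handler.test(message)) {
                    return true;
                }
            } catch (RuntimeException e) {
                // Treated the same as a failed attempt; try again.
            }
        }
        return false;
    }
}
```

A Broadcasting listener could wrap its business logic in such a helper and log or persist the message only when it returns false. A real version would likely add a pause between attempts, which is left out here to keep the sketch short.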
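RocketMQ's Retry queue handles messages that failed to be consumed, and every consumer group has its own. On the producing side, Concurrently mode sends a failed message to the Retry queue, while Orderly mode retries locally first and sends the message back only after the maximum number of attempts. On the broker, the message is first written to the SCHEDULE_XXXX queue and delivered to the Retry queue after a delay, where the consumer picks it up again. Broadcasting mode currently drops failed messages outright, which leaves room for improvement.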