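Original article: http://hscarb.github.io/rocketmq/20250131-rocketmq-transactional-message.html
RocketMQ Transactional Messages: Principles Explained & Source Code Analysis
1. Background
In today's internet era, microservice architectures have become widespread and business volumes have reached an astonishing scale. Message queues, a key component for decoupling, peak shaving, and asynchronous processing, have become an indispensable part of microservice architectures.
A transaction is a series of operations that either all succeed or all fail, and transactions are widely used in business systems. When a system module involves transactions, ordinary messages cannot guarantee that "the local operation and the message send" either both succeed or both fail, which is what motivates transactional messages.
RocketMQ has supported distributed transactional messages since version 4.3.0. On top of ordinary messages, transactional messages add a two-phase commit capability. Binding the two-phase commit to the local transaction keeps the global commit result consistent.
This article analyzes how RocketMQ transactional messages are implemented, based on the RocketMQ 5.3.x source code.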
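2. Usage Example
This example uses the RocketMQ 4.x Java client.
2.1 Create a transactional Topic
Create a Topic in advance to receive transactional messages. Starting with RocketMQ 5.x, the message type must be specified when a Topic is created; here we create a Topic of type TRANSACTION.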
./bin/mqadmin updatetopic -n localhost:9876 -t TopicTest1234 -c DefaultCluster -a +message.type=TRANSACTION
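2.2 Implement the local transaction execution and check-back logic
Next, implement the TransactionListener interface, which has two methods:
- executeLocalTransaction: executes the local transaction. Put the local transaction logic in this method; it returns a LocalTransactionState enum value indicating the state of the local transaction.
- checkLocalTransaction: checks the local transaction; it also returns a LocalTransactionState enum value indicating the state of the local transaction.
If executeLocalTransaction directly returns LocalTransactionState.COMMIT_MESSAGE or LocalTransactionState.ROLLBACK_MESSAGE, checkLocalTransaction is normally not called. If it returns LocalTransactionState.UNKNOW, meaning the local transaction has not finished yet and its result is unknown, checkLocalTransaction is called later to check the state of the local transaction.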
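import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.rocketmq.client.producer.LocalTransactionState;
import org.apache.rocketmq.client.producer.TransactionListener;
import org.apache.rocketmq.common.message.Message;
import org.apache.rocketmq.common.message.MessageExt;

public class TransactionListenerImpl implements TransactionListener {
    // Counts sent messages so that the simulated outcomes rotate
    private AtomicInteger transactionIndex = new AtomicInteger(0);
    // Transaction ID -> simulated local transaction outcome (0, 1, or 2)
    private ConcurrentHashMap<String, Integer> localTrans = new ConcurrentHashMap<>();

    @Override
    public LocalTransactionState executeLocalTransaction(Message msg, Object arg) {
        // For demonstration only: simulate 3 kinds of local transaction outcomes
        int value = transactionIndex.getAndIncrement();
        int status = value % 3;
        localTrans.put(msg.getTransactionId(), status);
        // Deliberately return UNKNOW to simulate an unfinished local transaction that needs a state check
        return LocalTransactionState.UNKNOW;
    }

    @Override
    public LocalTransactionState checkLocalTransaction(MessageExt msg) {
        // Return the local transaction state that matches the simulated outcome
        Integer status = localTrans.get(msg.getTransactionId());
        if (null != status) {
            switch (status) {
                case 0:
                    return LocalTransactionState.UNKNOW;
                case 1:
                    return LocalTransactionState.COMMIT_MESSAGE;
                case 2:
                    return LocalTransactionState.ROLLBACK_MESSAGE;
                default:
                    break;
            }
        }
        return LocalTransactionState.COMMIT_MESSAGE;
    }
}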
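The listener above keeps its state in memory purely for demonstration. As a rough sketch of how a check-back could instead be answered from a recorded outcome, the self-contained example below maps a stored result to a state. Everything in it is an assumption made for illustration: TxState is a stand-in for LocalTransactionState so the code compiles without the RocketMQ client, and LocalTxStore, record, and check are hypothetical names that are not part of RocketMQ. The map stands in for wherever the business code persists its transaction outcomes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for RocketMQ's LocalTransactionState so the sketch compiles on its own.
enum TxState { COMMIT_MESSAGE, ROLLBACK_MESSAGE, UNKNOW }

// Hypothetical store of local transaction outcomes, keyed by transaction ID.
class LocalTxStore {
    // true = local transaction committed, false = rolled back.
    private final Map<String, Boolean> outcomes = new ConcurrentHashMap<>();

    // Called by business code once the local transaction has finished.
    void record(String transactionId, boolean committed) {
        outcomes.put(transactionId, committed);
    }

    // Mirrors what a checkLocalTransaction implementation could do.
    TxState check(String transactionId) {
        Boolean committed = outcomes.get(transactionId);
        if (committed == null) {
            // No outcome recorded yet: the local transaction may still be running.
            return TxState.UNKNOW;
        }
        return committed ? TxState.COMMIT_MESSAGE : TxState.ROLLBACK_MESSAGE;
    }
}
```

Note that this sketch answers UNKNOW when no outcome is recorded, whereas the demo listener falls back to COMMIT_MESSAGE; which default is appropriate depends on the business logic.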
2.3 Transactional message producer
Next, create a TransactionMQProducer instance and set its TransactionListener.
public static void main(String[] args) throws MQClientException, InterruptedException {
    TransactionListener transactionListener = new TransactionListenerImpl();
    TransactionMQProducer producer = new TransactionMQProducer(PRODUCER_GROUP, Arrays.asList(TOPIC));
    producer.setNamesrvAddr(DEFAULT_NAMESRVADDR);
    // Thread pool used to run local transaction state check-backs
    ExecutorService executorService = new ThreadPoolExecutor(2, 5, 100, TimeUnit.SECONDS, new ArrayBlockingQueue<>(2000), r -> {
        Thread thread = new Thread(r);
        thread.setName("client-transaction-msg-check-thread");
        return thread;
    });
    producer.setExecutorService(executorService);
    // Register the listener defined earlier, which executes and checks the local transaction
    producer.setTransactionListener(transactionListener);
    producer.start();
    String[] tags = new String[] {"TagA", "TagB", "TagC", "TagD", "TagE"};
    for (int i = 0; i < MESSAGE_COUNT; i++) {
        try {
            Message msg =
                new Message(TOPIC, tags[i % tags.length], "KEY" + i,
                    ("Hello RocketMQ " + i).getBytes(RemotingHelper.DEFAULT_CHARSET));
            // The second argument is passed to executeLocalTransaction as arg
            SendResult sendResult = producer.sendMessageInTransaction(msg, null);
            System.out.printf("%s%n", sendResult);
            Thread.sleep(10);
        } catch (MQClientException | UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }
    // ...
}
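The thread factory in the producer example gives every check-back thread the same name, which can make thread dumps harder to read. A minimal sketch of one alternative is shown below; NumberedThreadFactory is a hypothetical helper written for this article, not a RocketMQ class, and it uses only the JDK.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: gives each thread a distinct name of the form "<prefix>-<n>".
class NumberedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(0);

    NumberedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread thread = new Thread(r);
        // Numbering starts at 1 and increases with every thread created.
        thread.setName(prefix + "-" + counter.incrementAndGet());
        return thread;
    }
}
```

It could be passed as the last argument of the ThreadPoolExecutor constructor in place of the lambda, for example new NumberedThreadFactory("client-transaction-msg-check-thread").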
2.4 Consumer
Finally, start a consumer to consume the transactional messages.
public static void main(String[] args) throws InterruptedException, MQClientException {
    DefaultMQPushConsumer consumer = new DefaultMQPushConsumer(CONSUMER_GROUP);
    // Set the NameServer address; change it to match your environment
    consumer.setNamesrvAddr(NAMESRV_ADDR);
    consumer.subscribe(TOPIC, "*");
    consumer.setConsumeFromWhere(ConsumeFromWhere.CONSUME_FROM_FIRST_OFFSET);
    consumer.registerMessageListener(new MessageListenerConcurrently() {
        @Override
        public ConsumeConcurrentlyStatus consumeMessage(List<MessageExt> msgs, ConsumeConcurrentlyContext context) {
            // Print the received messages and acknowledge them as consumed
            System.out.printf("%s Receive New Messages: %s %n", Thread.currentThread().getName(), msgs);
            return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
        }
    });
    consumer.start();
    System.out.printf("Consumer Started.%n");
}
2.5 Results
The producer's log output is as follows:
SendResult [sendStatus=SEND_OK, msgId=C0A80109803418B4AAC25D18010D0000, offsetMsgId=null, messageQueue=MessageQueue [topic=TopicTest1234, brokerName=broker-a, queueId=0], queueOffset=0, recallHandle=null]
SendResult [sendStatus=SEND_OK, msgId=C0A80109803418B4AAC25D1801BE0001, offsetMsgId=null, messageQueue=MessageQueue [topic=TopicTest1234, brokerName=broker-a, queueId=1], queueOffset=1, recallHandle=null]
SendResult [sendStatus=SEND_OK, msgId=C0A80109803418B4AAC25D1801CD0002, offsetMsgId=null, messageQueue=MessageQueue [topic=TopicTest1234, brokerName=broker-a, queueId=2], queueOffset=2, recallHandle=null]
SendResult [sendStatus=SEND_OK, msgId=C0A80109803418B4AAC25D1801DA0003, offsetMsgId=null, messageQueue=MessageQueue [topic=TopicTest1234, brokerName=broker-a, queueId=3], queueOffset=3, recallHandle=null]
SendResult [sendStatus=SEND_OK, msgId=C0A80109803418B4AAC25D1801E70004, offsetMsgId=null, messageQueue=MessageQueue [topic=TopicTest1234, brokerName=broker-a, queueId=0], queueOffset=4, recallHandle=null]
SendResult [sendStatus=SEND_OK, msgId=C0A80109803418B4AAC25D1801F50005, offsetMsgId=null, messageQueue=MessageQueue [topic=TopicTest1234, brokerName=broker-a, queueId=1], queueOffset=5, recallHandle=null]
SendResult [sendStatus=SEND_OK, msgId=C0A80109803418B4AAC25D1802020006, offsetMsgId=null, messageQueue=MessageQueue [topic=TopicTest1234, brokerName=broker-a, queueId=2], queueOffset=6, recallHandle=null]
SendResult [sendStatus=SEND_OK, msgId=C0A80109803418B4AAC25D18020F0007, offsetMsgId=null, messageQueue=MessageQueue [topic=TopicTest1234, brokerName=broker-a, queueId=3], queueOffset=7, recallHandle=null]
SendResult [sendStatus=SEND_OK, msgId=C0A80109803418B4AAC25D18021D0008, offsetMsgId=null, messageQueue=MessageQueue [topic=TopicTest1234, brokerName=broker-a, queueId=0], queueOffset=8, recallHandle=null]
SendResult [sendStatus=SEND_OK, msgId=C0A80109803418B4AAC25D18022B0009, offsetMsgId=null, messageQueue=MessageQueue [topic=TopicTest1234, brokerName=broker-a, queueId=1], queueOffset=9, recallHandle=null]
The consumer's log output is as follows:
Consumer Started.
ConsumeMessageThread_CID_JODIE_1_1 Receive New Messages: [MessageExt [brokerName=broker-a, queueId=0, storeSize=420, queueOffset=0, sysFlag=8, bornTimestamp=1737222654439, bornHost=/127.0.0.1:5290, storeTimestamp=1737222675592, storeHost=/127.0.0.1:10911, msgId=7F00000100002A9F000000104A169B49, commitLogOffset=69962472265, bodyCRC=601994070, reconsumeTimes=0, preparedTransactionOffset=69962469259, toString()=Message{topic='TopicTest1234', flag=0, properties={CONSUME_START_TIME=1737222710178, MSG_REGION=DefaultRegion, UNIQ_KEY=C0A80109803418B4AAC25D1801E70004, CLUSTER=DefaultCluster, PGROUP=please_rename_unique_group_name, MIN_OFFSET=0, __transactionId__=C0A80109803418B4AAC25D1801E70004, TAGS=TagE, TRAN_MSG=true, KEYS=KEY4, WAIT=true, TRACE_ON=true, TRANSACTION_CHECK_TIMES=1, REAL_TOPIC=TopicTest1234, MAX_OFFSET=1, REAL_QID=0}, body=[72, 101, 108, 108, 111, 32, 82, 111, 99, 107, 101, 116, 77, 81, 32, 52], transactionId='C0A80109803418B4AAC25D1801E70004'}]]
ConsumeMessageThread_CID_JODIE_1_2 Receive New Messages: [MessageExt [brokerName=broker-a, queueId=1, storeSize=420, queueOffset=0, sysFlag=8, bornTimestamp=1737222654398, bornHost=/127.0.0.1:5290, storeTimestamp=1737222675591, storeHost=/127.0.0.1:10911, msgId=7F00000100002A9F000000104A1699A5, commitLogOffset=69962471845, bodyCRC=1401636825, reconsumeTimes=0, preparedTransactionOffset=69962467966, toString()=Message{topic='TopicTest1234', flag=0, properties={CONSUME_START_TIME=1737222710178, MSG_REGION=DefaultRegion, UNIQ_KEY=C0A80109803418B4AAC25D1801BE0001, CLUSTER=DefaultCluster, PGROUP=please_rename_unique_group_name, MIN_OFFSET=0, __transactionId__=C0A80109803418B4AAC25D1801BE0001, TAGS=TagB, TRAN_MSG=true, KEYS=KEY1, WAIT=true, TRACE_ON=true, TRANSACTION_CHECK_TIMES=1, REAL_TOPIC=TopicTest1234, MAX_OFFSET=1, REAL_QID=1}, body=[72, 101, 108, 108, 111, 32, 82, 111, 99, 107, 101, 116, 77, 81, 32, 49], transactionId='C0A80109803418B4AAC25D1801BE0001'}]]
ConsumeMessageThread_CID_JODIE_1_3 Receive New Messages: [MessageExt [brokerName=broker-a, queueId=3, storeSize=420, queueOffset=0, sysFlag=8, bornTimestamp=1737222654479, bornHost=/127.0.0.1:5290, storeTimestamp=1737222675592, storeHost=/127.0.0.1:10911, msgId=7F00000100002A9F000000104A169CED, commitLogOffset=69962472685, bodyCRC=988340972, reconsumeTimes=0, preparedTransactionOffset=69962470552, toString()=Message{topic='TopicTest1234', flag=0, properties={CONSUME_START_TIME=1737222710178, MSG_REGION=DefaultRegion, UNIQ_KEY=C0A80109803418B4AAC25D18020F0007, CLUSTER=DefaultCluster, PGROUP=please_rename_unique_group_name, MIN_OFFSET=0, __transactionId__=C0A80109803418B4AAC25D18020F0007, TAGS=TagC, TRAN_MSG=true, KEYS=KEY7, WAIT=true, TRACE_ON=true, TRANSACTION_CHECK_TIMES=1, REAL_TOPIC=TopicTest1234, MAX_OFFSET=1, REAL_QID=3}, body=[72, 101, 108, 108, 111, 32, 82, 111, 99, 107, 101, 116, 77, 81, 32, 55], transactionId='C0A80109803418B4AAC25D18020F0007'}]]
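Only 3 messages were received. The listener simulates three kinds of local transaction outcomes, so of the 10 messages sent only 3 end up in the COMMIT_MESSAGE state, while 4 are in the UNKNOW state and 3 are in the ROLLBACK_MESSAGE state. Messages in the UNKNOW state continue to be checked back, and messages in the ROLLBACK_MESSAGE state are discarded.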
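The 4 / 3 / 3 split follows directly from the value % 3 mapping in the demo listener. The short sketch below reproduces the arithmetic, assuming 10 messages (the number of SendResult lines in the producer log) and assuming the counter increments in send order; OutcomeTally and its methods are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class OutcomeTally {
    // Same mapping as the demo listener: 0 -> UNKNOW, 1 -> COMMIT_MESSAGE, 2 -> ROLLBACK_MESSAGE.
    static int statusOf(int sendIndex) {
        return sendIndex % 3;
    }

    // Counts how many of the first messageCount sends fall into each status.
    static int[] tally(int messageCount) {
        int[] counts = new int[3];
        for (int i = 0; i < messageCount; i++) {
            counts[statusOf(i)]++;
        }
        return counts;
    }

    // Keys ("KEY" + i) of the messages whose check-back returns COMMIT_MESSAGE.
    static List<String> committedKeys(int messageCount) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < messageCount; i++) {
            if (statusOf(i) == 1) {
                keys.add("KEY" + i);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        int[] c = tally(10);
        // prints: UNKNOW=4 COMMIT=3 ROLLBACK=3 committed=[KEY1, KEY4, KEY7]
        System.out.printf("UNKNOW=%d COMMIT=%d ROLLBACK=%d committed=%s%n",
            c[0], c[1], c[2], committedKeys(10));
    }
}
```

The committed keys KEY1, KEY4, and KEY7 match the KEYS properties of the three messages in the consumer log.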
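3. Design Overview
RocketMQ implements transactions with a two-phase commit:
- First, the original message is sent to the server as a transactional half message, which is invisible to consumers.
- The producer then executes the local transaction, and the half message is restored or discarded according to the result.
- If the local transaction result is unknown, the server periodically checks back with the producer for the local transaction state.
- Based on the checked-back result, the server restores or discards the half message.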
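The four steps above can be sketched with a toy in-memory model. This is only an illustration of the idea under simplifying assumptions, not how the RocketMQ broker is implemented: ToyBroker, its methods, and the Result enum are hypothetical names, half messages live in a plain map, and the check-back is a function supplied by the caller.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Toy model of the two-phase flow; hypothetical, not RocketMQ's actual implementation.
class ToyBroker {
    enum Result { COMMIT, ROLLBACK, UNKNOWN }

    // Half messages awaiting a decision, keyed by transaction ID; invisible to consumers.
    private final Map<String, String> halfMessages = new LinkedHashMap<>();
    // Messages that were restored and are therefore visible to consumers.
    private final List<String> visible = new ArrayList<>();

    // Phase 1: store the message as a half message.
    void sendHalf(String txId, String body) {
        halfMessages.put(txId, body);
    }

    // Phase 2: apply the result reported by the producer or obtained from a check-back.
    void endTransaction(String txId, Result result) {
        if (result == Result.UNKNOWN || !halfMessages.containsKey(txId)) {
            return; // still pending; a later check-back will decide
        }
        String body = halfMessages.remove(txId);
        if (result == Result.COMMIT) {
            visible.add(body); // restore the message so consumers can see it
        }
        // ROLLBACK: the half message is simply dropped
    }

    // Check-back round: ask the producer-side checker about every pending transaction.
    void checkPending(Function<String, Result> checker) {
        for (String txId : new ArrayList<>(halfMessages.keySet())) {
            endTransaction(txId, checker.apply(txId));
        }
    }

    List<String> visibleMessages() {
        return visible;
    }

    int pendingCount() {
        return halfMessages.size();
    }
}
```

In this model a message becomes visible only after a COMMIT, a ROLLBACK removes it without ever exposing it, and an UNKNOWN result leaves it pending until a later check-back round resolves it.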
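The whole flow is shown in the figure below:
- The producer (Sender) sends a message to the RocketMQ server (Server). In addition to the original content, a transactional-message flag is added to the message properties, marking it as a transactional half message (Half message).
- We do not want consumers to consume the half message, so it is stored on the server in a special Topic dedicated to half messages. Once the message has been persisted, the server returns an Ack to the producer to confirm that the message was sent successfully.
- The producer starts executing the local transaction logic.
- The producer reports the local transaction result to the server. If the local transaction cannot finish right away, it should report Unknown; on success it reports Commit, and on failure it reports Rollback. After receiving the result, the server handles it as follows: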
- Unknown: no action is taken for now; the server