Scenario: Flink consumes data from RabbitMQ and writes it to StarRocks.
Implementation: the official RabbitMQ connector example.
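- Problem encountered: no matter how the parallelism was set in the web UI, the parallelism of the MQ source stayed at 1, so a large backlog of messages built up in RabbitMQ.
Optimized approach: use RichParallelSourceFunction to increase the number of consumers.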
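Why more consumers help can be illustrated without Flink or RabbitMQ: when several workers pull from one shared queue, each message goes to exactly one of them, so the backlog is split across the workers. The sketch below is only a stand-in under that assumption. The in-memory `LinkedBlockingQueue` plays the role of the RabbitMQ queue, each thread plays the role of one source subtask, and the names `CompetingConsumersSketch` and `drain` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

class CompetingConsumersSketch {

    /**
     * Drains the shared queue with the given number of workers and
     * returns how many messages each worker consumed.
     */
    static int[] drain(BlockingQueue<String> queue, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                results.add(pool.submit(() -> {
                    int consumed = 0;
                    // poll() hands each message to exactly one worker.
                    while (queue.poll() != null) {
                        consumed++;
                    }
                    return consumed;
                }));
            }
            int[] counts = new int[workers];
            for (int i = 0; i < workers; i++) {
                counts[i] = results.get(i).get();
            }
            return counts;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 10_000; i++) {
            queue.add("msg-" + i);
        }
        int total = 0;
        for (int c : drain(queue, 4)) {
            total += c;
        }
        System.out.println("total consumed = " + total);
    }
}
```

How the messages are distributed among the workers is not deterministic, but no message is consumed twice and none is left behind.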
- Core code: integrate rabbitmq-client into the SourceFunction
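import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.GetResponse;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;

import java.nio.charset.StandardCharsets;

public class MQParalleSource extends RichParallelSourceFunction<String> {
    private final String userName;
    private final String password;
    private final String virtualHost;
    private final String hostName;
    private final int portNumber;
    private final String queueName;
    private final String exchangeName;
    private final int qos;
    // Not serializable, so it is created in open() on the task manager.
    private transient Connection conn;
    // Cleared by cancel() so that the loop in run() can exit.
    private volatile boolean running = true;

    public MQParalleSource(String userName, String password, String virtualHost, String hostName, int portNumber, String queueName, String exchangeName, int qos) {
        this.userName = userName;
        this.password = password;
        this.virtualHost = virtualHost;
        this.hostName = hostName;
        this.portNumber = portNumber;
        this.queueName = queueName;
        this.exchangeName = exchangeName;
        this.qos = qos;
    }

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        // Channel numbers start at 1, so use the 0-based subtask index + 1.
        int subtaskId = getRuntimeContext().getIndexOfThisSubtask() + 1;
        Channel channel = conn.createChannel(subtaskId);
        channel.queueBind(queueName, exchangeName, "#");
        channel.basicQos(qos);
        while (running) {
            // Pull a single message; autoAck = true acknowledges it on delivery.
            GetResponse getResponse = channel.basicGet(queueName, true);
            if (getResponse != null) {
                String message = new String(getResponse.getBody(), StandardCharsets.UTF_8);
                ctx.collect(message);
            } else {
                // Queue is empty: emit a placeholder record.
                ctx.collect("{\n" +
                        "  \"col1\": \"000000000\",\n" +
                        "  \"col2\": \"1\"\n" +
                        "}");
            }
        }
    }

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        // Each parallel subtask opens its own connection to the broker.
        ConnectionFactory factory = new ConnectionFactory();
        factory.setUsername(userName);
        factory.setPassword(password);
        factory.setVirtualHost(virtualHost);
        factory.setHost(hostName);
        factory.setPort(portNumber);
        conn = factory.newConnection();
    }

    @Override
    public void cancel() {
        running = false;
    }

    @Override
    public void close() throws Exception {
        // Closing the connection also closes the channel created from it.
        if (conn != null && conn.isOpen()) {
            conn.close();
        }
        super.close();
    }
}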
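The loop in run() polls the queue without pausing and emits a placeholder record whenever the queue is empty. One possible alternative, sketched below under assumptions, is to sleep briefly after an empty poll and to stop once a flag is cleared. `PollingLoopSketch`, `fetch` and `emit` are hypothetical stand-ins: `fetch` plays the role of `channel.basicGet(...)` returning nothing on an empty queue, `emit` plays the role of `ctx.collect(...)`, and the sleep interval is arbitrary. The sketch drops the placeholder record, which may not be acceptable if something downstream relies on it.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

class PollingLoopSketch {
    // Cleared by cancel(); volatile so the polling thread sees the change.
    private volatile boolean running = true;
    private final Supplier<String> fetch;   // stand-in for basicGet, null when empty
    private final Consumer<String> emit;    // stand-in for ctx.collect
    private final long idleSleepMillis;

    PollingLoopSketch(Supplier<String> fetch, Consumer<String> emit, long idleSleepMillis) {
        this.fetch = fetch;
        this.emit = emit;
        this.idleSleepMillis = idleSleepMillis;
    }

    /** Polls until cancel() is called; sleeps instead of spinning when nothing is available. */
    void run() {
        while (running) {
            String message = fetch.get();
            if (message != null) {
                emit.accept(message);
            } else {
                try {
                    Thread.sleep(idleSleepMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop polling.
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    void cancel() {
        running = false;
    }
}
```

In the source above, the same flag would be checked in `while (...)` and cleared in `cancel()`.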
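- Difficulty: when messages are consumed by listening with the channel's basicConsume method from rabbitmq-client, calling ctx.collect(message); to send them to the downstream operator throws an exception:
flink the source context has been closed already
Because the author's knowledge of the Flink source code is limited, the channel's basicGet method was used instead. Verification in production showed that consumption throughput (QOS) rises as the parallelism is increased, and messages no longer pile up for long periods.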
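One plausible explanation for the exception, not confirmed here, is that basicConsume only registers a callback and returns immediately, so run() ends and Flink closes the source context while the RabbitMQ client thread is still delivering messages. A common way around this kind of problem is to let the callback put messages into a bounded buffer and to emit only from the thread that executes run(). The sketch below shows that handoff in plain Java. `HandoffSketch`, `onDelivery` and the `emit` consumer are hypothetical stand-ins for the basicConsume delivery callback and `ctx.collect`, and no RabbitMQ or Flink API is used.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class HandoffSketch {
    // Bounded buffer between the delivery thread and the source thread.
    private final BlockingQueue<String> buffer;
    private volatile boolean running = true;

    HandoffSketch(int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
    }

    /** Called on the delivery thread; blocks while the buffer is full. */
    void onDelivery(String message) {
        try {
            buffer.put(message);
        } catch (InterruptedException e) {
            // The message is dropped in this sketch; restore the interrupt flag.
            Thread.currentThread().interrupt();
        }
    }

    /** Runs on the source thread; this is the only place that calls emit. */
    void run(Consumer<String> emit) {
        while (running) {
            try {
                // Wake up periodically so that cancel() is noticed.
                String message = buffer.poll(100, TimeUnit.MILLISECONDS);
                if (message != null) {
                    emit.accept(message);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    void cancel() {
        running = false;
    }
}
```

Because run() keeps looping until cancel() is called, the source stays alive while messages arrive, and the bounded buffer slows the delivery thread down when the downstream operators are slow.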