Flink SQL: Writing Top-N Aggregation Results to Kafka (Flink 1.11.0)

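This post describes a problem I ran into when using SQL aggregations in Flink 1.11.0 and writing the results to Kafka, and how I worked around it. Flink's AppendStreamTableSink requires that the table contain only insert changes, but aggregations produce retract messages. By customizing the KafkaTableSinkBase and KafkaTableSourceSinkFactoryBase classes and modifying the JsonFormatFactory.java source, I got aggregated data written to Kafka. The test run shows how the data in Kafka is updated and where duplicate records can appear; deduplicating in downstream processing is recommended.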


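First, a quick plug for Ji Ge (鸡哥). His blog: https://me.youkuaiyun.com/weixin_47482194

His posts are well written and have been featured in the site's official recommendations several times.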

On to the topic. Anyone who works with Flink SQL will have to deal with Kafka sooner or later. Have you ever hit the following problem when writing to Kafka?

 Exception in thread "main" org.apache.flink.table.api.TableException: AppendStreamTableSink requires that Table has only insert changes.

Seriously? Even the most basic statement, select count(*) from table, is not supported?

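The official explanation is that this comes from Flink's internal retract mechanism: until changelog support has been worked out end to end, retract/upsert support cannot be added for an append-only message queue such as Kafka.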
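To see why even count(*) trips this check, it helps to look at the changes a running aggregate emits. The following is a plain-Java sketch that does not use Flink at all; the class RetractCountSketch and its Change type are hypothetical names, and the (true, n) / (false, n) pairs only mimic the add/retract flag that a Flink retract stream carries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RetractCountSketch {

    /** One changelog entry: add == true publishes a result, add == false withdraws an earlier one. */
    public static final class Change {
        public final boolean add;
        public final long count;

        public Change(boolean add, long count) {
            this.add = add;
            this.count = count;
        }

        @Override
        public String toString() {
            return "(" + add + "," + count + ")";
        }
    }

    /** Returns the changes a running COUNT(*) would emit, one input row at a time. */
    public static List<Change> countChangelog(List<String> rows) {
        List<Change> out = new ArrayList<>();
        long count = 0;
        for (String ignored : rows) {
            if (count > 0) {
                // The previously published count is now wrong, so withdraw it first.
                out.add(new Change(false, count));
            }
            count++;
            // Then publish the updated count.
            out.add(new Change(true, count));
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [(true,1), (false,1), (true,2), (false,2), (true,3)]
        System.out.println(countChangelog(Arrays.asList("a", "b", "c")));
    }
}
```

Only the first entry is a pure insert. Every later row produces a withdrawal as well, and that withdrawal is exactly what an append-only sink has no way to express.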

Fortunately, a Table can be converted to a DataStream. The code follows (in my case it is a per-group Top-N):

If setting up a Kafka connection is too much hassle, you can have a source generate the data directly instead of reading it from Kafka.
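For that route, the records themselves are easy to fabricate. Here is a minimal plain-Java sketch, independent of Flink, that builds JSON lines with the five STRING columns declared for kafka_table (category_id, user_id, item_id, behavior, ts). The class name SampleRecordSketch, the id ranges and the behavior values are assumptions for illustration only; the original data set is not described in this post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class SampleRecordSketch {

    // Hypothetical behavior values; the real ones are not listed in the post.
    private static final String[] BEHAVIORS = {"pv", "buy", "cart", "fav"};

    /** Builds one JSON line; ts is written as a string because the table declares it as STRING. */
    public static String toJson(String categoryId, String userId, String itemId,
                                String behavior, long epochSeconds) {
        return String.format(
                "{\"category_id\":\"%s\",\"user_id\":\"%s\",\"item_id\":\"%s\","
                        + "\"behavior\":\"%s\",\"ts\":\"%d\"}",
                categoryId, userId, itemId, behavior, epochSeconds);
    }

    /** Generates n pseudo-random records; a fixed seed keeps runs reproducible. */
    public static List<String> generate(int n, long seed, long startEpochSeconds) {
        Random random = new Random(seed);
        List<String> records = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            records.add(toJson(
                    "c" + random.nextInt(3),
                    "u" + random.nextInt(5),
                    "i" + random.nextInt(100),
                    BEHAVIORS[random.nextInt(BEHAVIORS.length)],
                    startEpochSeconds + i)); // one event per second, in timestamp order
        }
        return records;
    }

    public static void main(String[] args) {
        generate(5, 42L, 1600000000L).forEach(System.out::println);
    }
}
```

The generated lines can be fed to whatever source replaces the Kafka topic. Keeping ts in epoch seconds matches the cast(ts AS BIGINT) used for row_ts in the table definition.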

public class FlinkTopN2Doris {


    private static final String KAFKA_SQL = "CREATE TABLE kafka_table (" +
            " category_id STRING," +
            " user_id STRING ," +
            " item_id STRING ," +
            " behavior STRING ," +
            " ts STRING ," +
//            " proctime as PROCTIME() ," +
            " row_ts AS TO_TIMESTAMP(FROM_UNIXTIME(cast(ts AS BIGINT), 'yyyy-MM-dd HH:mm:ss'))," +
            " WATERMARK FOR row_ts AS row_ts - INTERVAL '5' SECOND " +
            ") WITH (" +
            " 'connector' = 'kafka'," +
            " 'topic' = 'flink_test'," +
            " 'properties.bootstrap.servers' = '192.168.12.188:9092'," +
            " 'properties.group.id' = 'test1'," +
            " 'format' = 'json'," +
            " 'scan.startup.mode' = 'earliest-offset'" +
            ")";

    private static final String SINK_KAFKA_SQL = "CREATE TABLE kafka_table2 (" +
            " ts STRING," +
            " user_id STRING ," +
            " behavior STRING ," +
            "row_num BIGINT " +
            ") WITH (" +
            " 'connector' = 'kafka'," +
            " 'topic' = 'flink_test2'," +
            " 'properties.bootstrap.servers' = '192.168.12.188:9092'," +
            " 'properties.group.id' = 'test1'," +
            " 'format' = 'json'," +
            " 'scan.startup.mode' = 'earliest-offset'" +
            ")";

    private static final String PRINT_SQL = "create table sink_print (" +
            "  p_count BIGINT ," +
            "  b STRING " +
            ") with ('connector' = 'print' )";


    private static final String PRINT_SQL2 = "create table sink_print2 (" +
            "  a STRING," +
            "  b STRING," +
            "  c STRING," +
            "  d BIGINT " +
            ") with ('connector' = 'print' )";

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment bsEnv = StreamExecutionEnvironment.getExec