Flink study notes 3: Flink windows

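Overview of Flink windows

A window splits an unbounded data stream into a sequence of bounded streams, and computations run over each bounded stream.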
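To make the idea of cutting an unbounded stream into bounded pieces concrete, here is a minimal plain-Java sketch. It does not use Flink; the class name WindowSplitSketch and its methods are made up for illustration, and it assumes non-negative millisecond timestamps and fixed-size windows aligned to zero.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class WindowSplitSketch {
    /** Start of the fixed-size window containing the timestamp (both in ms, timestamp >= 0). */
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    /** Groups timestamps by window start; each group is one bounded slice of the stream. */
    static Map<Long, List<Long>> split(long[] timestampsMs, long sizeMs) {
        Map<Long, List<Long>> windows = new TreeMap<>();
        for (long ts : timestampsMs) {
            windows.computeIfAbsent(windowStart(ts, sizeMs), k -> new ArrayList<>()).add(ts);
        }
        return windows;
    }

    public static void main(String[] args) {
        long[] stream = {1_000L, 4_000L, 12_000L, 19_999L, 20_000L};
        // Prints {0=[1000, 4000], 10000=[12000, 19999], 20000=[20000]}
        System.out.println(split(stream, 10_000L));
    }
}
```

Each map entry stands for one window; a window function would then run once per entry.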

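Time notions in Flink

  • Processing time: the system clock of the machine at the moment the current operator receives a record from the stream.
  • Event time: the time carried by the original data itself. For example, when reading from a database table that has an updatetime column, that column can serve as the event time.
  • Ingestion time: the timestamp at which Flink reads the record from the source.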
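The choice of time notion decides which window a record lands in. The sketch below is plain Java, not Flink; the Record type and its field names are hypothetical, and windows are assumed to be 10 seconds long and aligned to zero.

```java
final class TimeNotionSketch {
    /** Hypothetical record carrying all three timestamps, in milliseconds. */
    static final class Record {
        final long eventTime;
        final long ingestionTime;
        final long processingTime;

        Record(long eventTime, long ingestionTime, long processingTime) {
            this.eventTime = eventTime;
            this.ingestionTime = ingestionTime;
            this.processingTime = processingTime;
        }
    }

    /** Start of the fixed-size window that contains the given non-negative timestamp. */
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    public static void main(String[] args) {
        // Created at 9.5 s, read by the source at 10.1 s, seen by the operator at 10.2 s
        Record r = new Record(9_500L, 10_100L, 10_200L);
        long size = 10_000L;
        System.out.println("event-time window start:      " + windowStart(r.eventTime, size));
        System.out.println("ingestion-time window start:  " + windowStart(r.ingestionTime, size));
        System.out.println("processing-time window start: " + windowStart(r.processingTime, size));
    }
}
```

The same record belongs to the window starting at 0 under event time, but to the window starting at 10000 under ingestion or processing time.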

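Setting the time characteristic in Flink (global setting)

The default time characteristic is processing time. (From Flink 1.12 onward the default is event time and setStreamTimeCharacteristic is deprecated.)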

  • env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime);
  • env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

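Window types

  1. countWindow: a count window; the window computation fires once the number of processed records reaches the configured value.
  2. timeWindow: a time window; the window computation fires once the configured time span has elapsed.

Each type comes in two variants: keyed windows and non-keyed windows.

Time windows are further divided by the time notion they use: processing time, event time, or ingestion time.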

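Count windows

  1. Keyed count window

     public static void countWindow() throws Exception {

         // Placeholder: in a real job this comes from stream.keyBy(...)
         KeyedStream<Tuple2<String, Integer>, Tuple> keyBy = null;
         // The window fires for a key once that key has received 10 records
         keyBy.countWindow(10);
     }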
    
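  2. Non-keyed count window

     public static void countWindowAll() throws Exception {
         // Placeholder: in a real job this comes from env.socketTextStream(...) or another source
         DataStreamSource<String> source = null;
         // The window fires as soon as 10 records have arrived, regardless of key
         source.countWindowAll(10);
     }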
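The difference between the two count windows shows up when the same input is counted per key versus globally. This is a plain-Java simulation under my own assumptions (class and method names are hypothetical, and the counter is simply reset after a window fires); it is not how Flink implements count windows internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CountWindowSketch {
    /** Keyed case: returns the keys whose own counter reached size, in firing order. */
    static List<String> keyedFirings(String[] keys, int size) {
        Map<String, Integer> counts = new HashMap<>();
        List<String> fired = new ArrayList<>();
        for (String key : keys) {
            int n = counts.merge(key, 1, Integer::sum);
            if (n == size) {
                fired.add(key);     // this key's window fires
                counts.remove(key); // and its counter starts over
            }
        }
        return fired;
    }

    /** Non-keyed case: one shared counter, so a window fires every size records. */
    static int globalFirings(String[] keys, int size) {
        return keys.length / size;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "a", "a", "b", "a"};
        System.out.println(keyedFirings(keys, 3));  // prints [a]
        System.out.println(globalFirings(keys, 3)); // prints 2
    }
}
```

With six records and a size of 3, the non-keyed window fires twice, while only key a ever collects three records of its own.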
    

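Time window subtypes

  1. Tumbling windows: tumbling processing-time windows and tumbling event-time windows (each either keyed or non-keyed)
  2. Sliding windows: sliding processing-time windows and sliding event-time windows (each either keyed or non-keyed)
  3. Session windows: session processing-time windows and session event-time windows (each either keyed or non-keyed)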
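A tumbling window puts each record into exactly one window, while a sliding window puts it into size / slide overlapping windows. The following plain-Java sketch computes the window starts for one timestamp. It follows the arithmetic as I understand it, assumes non-negative timestamps and no offset, and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SlidingWindowSketch {
    /**
     * Returns the start of every window [start, start + size) that contains ts.
     * With slide == size this degenerates to a tumbling window (one start).
     */
    static List<Long> windowStarts(long ts, long size, long slide) {
        List<Long> starts = new ArrayList<>();
        long lastStart = ts - (ts % slide); // latest window that begins at or before ts
        for (long start = lastStart; start > ts - size; start -= slide) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // Sliding: size 10 s, slide 5 s, so the record is in two windows
        System.out.println(windowStarts(12_000L, 10_000L, 5_000L));  // prints [10000, 5000]
        // Tumbling: size == slide, so the record is in one window
        System.out.println(windowStarts(12_000L, 10_000L, 10_000L)); // prints [10000]
    }
}
```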

1. Non-keyed time windows

The code is as follows:

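import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.*;
import org.apache.flink.streaming.api.windowing.time.Time;

public class Test {
    public static void timeWindowAll() throws Exception {
        // Create the Flink streaming execution environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Add an input stream that listens on local port 9999;
        // install nc locally and run `nc -lk 9999` in a console to feed it
        DataStreamSource<String> source = env.socketTextStream("localhost", 9999);

        // NOTE: each call below only declares a window; a real job chains
        // a window function and a sink onto it before calling execute()

        // Shorthand: uses the configured time characteristic (processing time if none was set)
        source.timeWindowAll(Time.seconds(10));

        // Tumbling window on processing time
        source.windowAll(TumblingProcessingTimeWindows.of(Time.seconds(10)));
        // Tumbling window on event time
        source.windowAll(TumblingEventTimeWindows.of(Time.seconds(10)));

        // Sliding window on event time
        source.windowAll(SlidingEventTimeWindows.of(Time.seconds(10), Time.seconds(5)));
        // Sliding window on processing time
        source.windowAll(SlidingProcessingTimeWindows.of(Time.seconds(10), Time.seconds(5)));

        // Session window with a fixed gap, on event time
        source.windowAll(EventTimeSessionWindows.withGap(Time.minutes(10)));
        // Session window with a dynamic gap, on event time
        source.windowAll(EventTimeSessionWindows.withDynamicGap(new SessionWindowTimeGapExtractor<String>() {
            @Override
            public long extract(String element) {
                // Gap in milliseconds, chosen per element; must be greater than 0
                return 10_000L;
            }
        }));
        // Session window with a fixed gap, on processing time
        source.windowAll(ProcessingTimeSessionWindows.withGap(Time.minutes(10)));
        // Session window with a dynamic gap, on processing time
        source.windowAll(ProcessingTimeSessionWindows.withDynamicGap(new SessionWindowTimeGapExtractor<String>() {
            @Override
            public long extract(String element) {
                // Gap in milliseconds, chosen per element; must be greater than 0
                return 10_000L;
            }
        }));

        env.execute();
    }
}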
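Session windows have no fixed size: a session stays open as long as records keep arriving within the gap. The plain-Java sketch below groups timestamps into sessions with a fixed gap. It is only an approximation of the idea; how records exactly one gap apart are treated is an assumption of this sketch, and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SessionWindowSketch {
    /** Returns sessions as {firstTimestamp, lastTimestamp} pairs, in time order. */
    static List<long[]> sessions(long[] timestamps, long gap) {
        long[] sorted = timestamps.clone();
        Arrays.sort(sorted);
        List<long[]> result = new ArrayList<>();
        for (long ts : sorted) {
            if (!result.isEmpty() && ts - result.get(result.size() - 1)[1] <= gap) {
                result.get(result.size() - 1)[1] = ts; // still within the gap: extend the session
            } else {
                result.add(new long[] {ts, ts});       // gap exceeded: start a new session
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (long[] s : sessions(new long[] {1_000L, 3_000L, 20_000L, 21_000L}, 5_000L)) {
            System.out.println(s[0] + " .. " + s[1]);
        }
    }
}
```

With a 5-second gap the four records form two sessions, one covering 1000 to 3000 and one covering 20000 to 21000.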

2. Keyed time windows

The code is as follows:

public class Test{    
    public static void timeWindow() throws Exception {
        // Create the Flink streaming execution environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Add an input stream that listens on local port 9999;
        // install nc locally and run `nc -lk 9999` in a console to feed it
        DataStreamSource<String> source = env.socketTextStream("localhost", 9999);

        // Each input line is expected to look like "key,number"
        KeyedStream<Tuple2<String, Integer>, Tuple> keyBy = source.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public void flatMap(String value, Collector<Tuple2<String, Integer>> out) throws Exception {
                out.collect(Tuple2.of(value.split(",")[0], Integer.valueOf(value.split(",")[1])));
            }
        }).keyBy(0);

        // Shorthand: uses the configured time characteristic (processing time if none was set)
        keyBy.timeWindow(Time.seconds(10));

        // Tumbling window on processing time
        keyBy.window(TumblingProcessingTimeWindows.of(Time.seconds(10)));
        // Tumbling window on event time
        keyBy.window(TumblingEventTimeWindows.of(Time.seconds(10)));

        // Sliding window on event time
        keyBy.window(SlidingEventTimeWindows.of(Time.seconds(10), Time.seconds(5)));
        // Sliding window on processing time
        keyBy.window(SlidingProcessingTimeWindows.of(Time.seconds(10), Time.seconds(5)));

        // Session window with a fixed gap, on event time
        keyBy.window(EventTimeSessionWindows.withGap(Time.minutes(10)));
        // Session window with a dynamic gap, on event time
        keyBy.window(EventTimeSessionWindows.withDynamicGap(new SessionWindowTimeGapExtractor<Tuple2<String, Integer>>() {
            @Override
            public long extract(Tuple2<String, Integer> element) {
                // Gap in milliseconds, chosen per element; must be greater than 0
                return 10_000L;
            }
        }));
        // Session window with a fixed gap, on processing time
        keyBy.window(ProcessingTimeSessionWindows.withGap(Time.minutes(10)));
        // Session window with a dynamic gap, on processing time
        keyBy.window(ProcessingTimeSessionWindows.withDynamicGap(new SessionWindowTimeGapExtractor<Tuple2<String, Integer>>() {
            @Override
            public long extract(Tuple2<String, Integer> element) {
                // Gap in milliseconds, chosen per element; must be greater than 0
                return 10_000L;
            }
        }));
        env.execute();
    }
}

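Window functions

The function that computes over the elements of a window once the window has fired.

Window function categories

  1. Incremental aggregation: the window does not keep the raw records, only an intermediate result; each new record is folded into that intermediate result.
  2. Full aggregation: the window keeps all raw records and aggregates over all of them when the window fires.

A full aggregation function forces the window to keep every raw record, which costs more memory and performance. A full aggregation function can therefore be combined with an incremental one: the incremental function maintains the intermediate result, and the full function only receives the data already processed by the incremental function.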
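The memory difference between the two categories can be seen in a small plain-Java sketch. Both classes below are hypothetical stand-ins for window state, not Flink classes: one keeps a single running value, the other buffers every element until the window fires.

```java
import java.util.ArrayList;
import java.util.List;

final class AggregationStateSketch {
    /** Incremental: the state is one running sum, updated as each element arrives. */
    static final class IncrementalSum {
        private long sum;

        void add(int value) { sum += value; }
        long result() { return sum; }
        int storedValues() { return 1; }
    }

    /** Full: the state is every element; the sum is computed only when the window fires. */
    static final class FullSum {
        private final List<Integer> buffer = new ArrayList<>();

        void add(int value) { buffer.add(value); }
        long result() {
            long sum = 0;
            for (int v : buffer) {
                sum += v;
            }
            return sum;
        }
        int storedValues() { return buffer.size(); }
    }

    public static void main(String[] args) {
        IncrementalSum inc = new IncrementalSum();
        FullSum full = new FullSum();
        for (int v = 1; v <= 1000; v++) {
            inc.add(v);
            full.add(v);
        }
        System.out.println(inc.result() + " held in " + inc.storedValues() + " value");
        System.out.println(full.result() + " held in " + full.storedValues() + " values");
    }
}
```

Both produce the same sum, but the full variant holds 1000 values where the incremental one holds a single value.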

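Incremental aggregation window functions
  1. ReduceFunction: WindowedStream → DataStream; aggregates a keyed windowed stream, and the input and output types are the same
  2. AggregateFunction: WindowedStream → DataStream; more general than ReduceFunction, since the input, accumulator, and output types may all differ
  3. Built-in aggregations that Flink provides:
     • windowedStream.sum("key");
     • windowedStream.min("key");
     • windowedStream.max("key");
     • windowedStream.minBy("key");
     • windowedStream.maxBy("key");
The ReduceFunction code is as follows: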
public class Test{
    public static void ReduceFunction() throws Exception {
            // Create the Flink streaming execution environment
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    
            // Add an input stream that listens on local port 9999; install nc locally and run `nc -lk 9999` in a console to feed it
            DataStreamSource<String> source = env.socketTextStream("localhost", 9999);
            SingleOutputStreamOperator<Tuple2<String, Integer>> map = source.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public void flatMap(String value, Collector<Tuple2<String, Integer>> out) throws Exception {
                    out.collect(Tuple2.of(value.split(",")[0], Integer.valueOf(value.split(",")[1])));
                }
            });
    
    
            // Keyed window: sum the values that share the same key
            KeyedStream<Tuple2<String, Integer>, Tuple> keyBy = map.keyBy(0);
            keyBy.window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                    .reduce(new ReduceFunction<Tuple2<String, Integer>>() {
                        // After keyBy, every element passed to this method within a window has the same key; think of it as one reduce instance per key
                        public Tuple2<String, Integer> reduce(Tuple2<String, Integer> v1, Tuple2<String, Integer> v2) throws Exception {
                            return Tuple2.of(v1.f0, v1.f1 + v2.f1);
                        }
                    }).print();
            
            // Non-keyed window: sum the values of all keys in the whole window
            map.windowAll(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                    .reduce(new ReduceFunction<Tuple2<String, Integer>>() {
                        // Non-keyed, so elements of every key arrive here; f0 of the result is just the key of the first element
                        public Tuple2<String, Integer> reduce(Tuple2<String, Integer> v1, Tuple2<String, Integer> v2) throws Exception {
                            return Tuple2.of(v1.f0, v1.f1 + v2.f1);
                        }
                    }).print();
            
    
            env.execute();
        }
}
The AggregateFunction code is as follows:
public class Test{
    public static void AggregateFunction() throws Exception {
            // Create the Flink streaming execution environment
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    
            // Add an input stream that listens on local port 9999; install nc locally and run `nc -lk 9999` in a console to feed it
            DataStreamSource<String> source = env.socketTextStream("localhost", 9999);
            SingleOutputStreamOperator<Tuple2<String, Integer>> map = source.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public void flatMap(String value, Collector<Tuple2<String, Integer>> out) throws Exception {
                    out.collect(Tuple2.of(value.split(",")[0], Integer.valueOf(value.split(",")[1])));
                }
            });
    
    
            // Keyed window: average of the values that share the same key
            KeyedStream<Tuple2<String, Integer>, Tuple> keyBy = map.keyBy(0);
            keyBy.window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                   .aggregate(new AggregateFunction<Tuple2<String,Integer>, Tuple2<Integer,Integer>, Double>() {
                       @Override
                       public Tuple2<Integer, Integer> createAccumulator() {
                           return Tuple2.of(0, 0);
                       }
    
                       @Override
                       public Tuple2<Integer, Integer> add(Tuple2<String, Integer> value, Tuple2<Integer, Integer> accumulator) {
                           // f0 counts elements, f1 sums values (f0++ would pass the old count and leave it at 0)
                           return Tuple2.of(accumulator.f0 + 1, accumulator.f1 + value.f1);
                       }
    
                       @Override
                       public Double getResult(Tuple2<Integer, Integer> accumulator) {
                           return ((double) accumulator.f1)/accumulator.f0;
                       }
    
                       @Override
                       public Tuple2<Integer, Integer> merge(Tuple2<Integer, Integer> a, Tuple2<Integer, Integer> b) {
                           return new Tuple2<>(a.f0 + b.f0, a.f1 + b.f1);
                       }
                   }).print();
    
            // Non-keyed window: average of the values across all keys
            map.windowAll(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                    .aggregate(new AggregateFunction<Tuple2<String,Integer>, Tuple2<Integer,Integer>, Double>(){
                        @Override
                        public Tuple2<Integer, Integer> createAccumulator() {
                            return Tuple2.of(0, 0);
                        }
    
                        @Override
                        public Tuple2<Integer, Integer> add(Tuple2<String, Integer> value, Tuple2<Integer, Integer> accumulator) {
                            // f0 counts elements, f1 sums values (f0++ would pass the old count and leave it at 0)
                            return Tuple2.of(accumulator.f0 + 1, accumulator.f1 + value.f1);
                        }
    
                        @Override
                        public Double getResult(Tuple2<Integer, Integer> accumulator) {
                            return ((double) accumulator.f1)/accumulator.f0;
                        }
    
                        @Override
                        public Tuple2<Integer, Integer> merge(Tuple2<Integer, Integer> a, Tuple2<Integer, Integer> b) {
                            return new Tuple2<>(a.f0 + b.f0, a.f1 + b.f1);
                        }
                    }).print();
    
    
            env.execute();
        }
}
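The four methods of an AggregateFunction form a small lifecycle: create an empty accumulator, fold each element in, optionally merge two accumulators, and turn the final accumulator into a result. The sketch below replays that lifecycle in plain Java for the average. The method names mirror the interface, but this is a stand-alone illustration with a long[] {count, sum} accumulator of my own choosing, not the Flink interface itself.

```java
final class AverageAccumulatorSketch {
    /** Empty accumulator: {count, sum}. */
    static long[] createAccumulator() {
        return new long[] {0L, 0L};
    }

    /** Folds one value in and returns the updated accumulator. */
    static long[] add(int value, long[] acc) {
        return new long[] {acc[0] + 1, acc[1] + value};
    }

    /** Combines two partial accumulators, for example when two windows merge. */
    static long[] merge(long[] a, long[] b) {
        return new long[] {a[0] + b[0], a[1] + b[1]};
    }

    /** Final result; assumes at least one value was added. */
    static double getResult(long[] acc) {
        return (double) acc[1] / acc[0];
    }

    static double average(int... values) {
        long[] acc = createAccumulator();
        for (int v : values) {
            acc = add(v, acc);
        }
        return getResult(acc);
    }

    public static void main(String[] args) {
        System.out.println(average(1, 2, 3)); // prints 2.0
    }
}
```

Only the two numbers in the accumulator are kept per window, no matter how many elements arrive.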
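Full aggregation window functions
  1. WindowFunction (passed to apply): WindowedStream → DataStream; computes over all elements of the window
  2. ProcessWindowFunction (passed to process): WindowedStream → DataStream; works like apply, and additionally receives a Context that gives access to window metadata

The code is as follows: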

public class Test{
    public static void ApplyFunctionAndProcessWindowFunction() throws Exception {
        
        // Create the Flink streaming execution environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        
        // Add an input stream that listens on local port 9999; install nc locally and run `nc -lk 9999` in a console to feed it
        DataStreamSource<String> source = env.socketTextStream("localhost", 9999);
        SingleOutputStreamOperator<Tuple2<String, Integer>> map = source.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public void flatMap(String value, Collector<Tuple2<String, Integer>> out) throws Exception {
                out.collect(Tuple2.of(value.split(",")[0], Integer.valueOf(value.split(",")[1])));
            }
        });
        
        
        // Keyed window: sum the values that share the same key
        KeyedStream<Tuple2<String, Integer>, Tuple> keyBy = map.keyBy(0);
        keyBy.window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                .apply(new WindowFunction<Tuple2<String,Integer>, Integer, Tuple, TimeWindow>() {
            @Override
            public void apply(Tuple tuple, TimeWindow window, Iterable<Tuple2<String, Integer>> values, Collector<Integer> out) throws Exception {
                int sum = 0;
                for (Tuple2<String, Integer> t: values) {
                    sum += t.f1;
                }
                out.collect(sum);
            }
        }).print();
        
        keyBy.window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                .process(new ProcessWindowFunction<Tuple2<String,Integer>, Integer, Tuple, TimeWindow>() {
                    @Override
                    public void process(Tuple tuple, Context context, Iterable<Tuple2<String, Integer>> values, Collector<Integer> out) throws Exception {
                        int sum = 0;
                        for (Tuple2<String, Integer> t: values) {
                            sum += t.f1;
                        }
                        out.collect(sum);
                    }
                }).print();
        
        // Non-keyed window: sum the values of all keys in the whole window
        map.windowAll(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                .apply(new AllWindowFunction<Tuple2<String,Integer>, Integer, TimeWindow>() {
                    @Override
                    public void apply(TimeWindow window, Iterable<Tuple2<String, Integer>> values, Collector<Integer> out) throws Exception {
                        int sum = 0;
                        for (Tuple2<String, Integer> t: values) {
                            sum += t.f1;
                        }
                        out.collect(sum);
                    }
                }).print();
        map.windowAll(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                .process(new ProcessAllWindowFunction<Tuple2<String,Integer>, Integer, TimeWindow>() {
                    @Override
                    public void process(Context context, Iterable<Tuple2<String, Integer>> values, Collector<Integer> out) throws Exception {
                        int sum = 0;
                        for (Tuple2<String, Integer> t: values) {
                            sum += t.f1;
                        }
                        out.collect(sum);
                    }
                }).print();
        
        env.execute(); 
    }
}
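The sums above could also be computed incrementally. A case that really needs every element of the window is the median, because it cannot be updated from a single running value. The plain-Java sketch below shows the computation a full-window function would run over its Iterable; the class name is hypothetical and the window is assumed to be non-empty.

```java
import java.util.Arrays;

final class MedianSketch {
    /** Median of all elements in a (non-empty) window. */
    static double median(int[] windowElements) {
        int[] sorted = windowElements.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        if (n % 2 == 1) {
            return sorted[n / 2];
        }
        return (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(median(new int[] {5, 1, 3}));    // prints 3.0
        System.out.println(median(new int[] {4, 1, 3, 2})); // prints 2.5
    }
}
```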
Combining incremental and full aggregation functions
ReduceFunction with ProcessWindowFunction
public class Test{
    public static void ReduceFunctionAndProcessWindowFunction() throws Exception {
            // Create the Flink streaming execution environment
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    
            // Add an input stream that listens on local port 9999; install nc locally and run `nc -lk 9999` in a console to feed it
            DataStreamSource<String> source = env.socketTextStream("localhost", 9999);
            SingleOutputStreamOperator<Tuple2<String, Integer>> map = source.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public void flatMap(String value, Collector<Tuple2<String, Integer>> out) throws Exception {
                    out.collect(Tuple2.of(value.split(",")[0], Integer.valueOf(value.split(",")[1])));
                }
            });
    
    
            // Keyed window: sum the values that share the same key
            KeyedStream<Tuple2<String, Integer>, Tuple> keyBy = map.keyBy(0);
            keyBy.window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                    .reduce(new ReduceFunction<Tuple2<String, Integer>>() {
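                        // Accumulates the grouped data. Suppose the stream holds [1, 2, 3, 4, 5]:
                        // on the first call value1 is the first element and value2 is the second;
                        // on the N-th call value1 is the previous return value and value2 is the next element, the (N+1)-th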
                        @Override
                        public Tuple2<String, Integer> reduce(Tuple2<String, Integer> value1, Tuple2<String, Integer> value2) throws Exception {
                            return Tuple2.of(value1.f0, value1.f1+value2.f1);
                        }
                    }, new ProcessWindowFunction<Tuple2<String,Integer>, Integer, Tuple, TimeWindow>() {
                        // reduce ran first, so process receives exactly one element: elements holds only the reduced result
                        // Without reduce, elements would hold every record of this window
                        @Override
                        public void process(Tuple tuple, Context context, Iterable<Tuple2<String, Integer>> elements, Collector<Integer> out) throws Exception {
                            Tuple2<String, Integer> next = elements.iterator().next();
                            out.collect(next.f1);
                        }
                    }).print();
    
            // Non-keyed window: sum the values of all keys in the whole window
            map.windowAll(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                    .reduce(new ReduceFunction<Tuple2<String, Integer>>() {
                        @Override
                        public Tuple2<String, Integer> reduce(Tuple2<String, Integer> value1, Tuple2<String, Integer> value2) throws Exception {
                            return Tuple2.of("", value1.f1+value2.f1);
                        }
                    }, new ProcessAllWindowFunction<Tuple2<String,Integer>, Integer, TimeWindow>() {
                        @Override
                        public void process(Context context, Iterable<Tuple2<String, Integer>> elements, Collector<Integer> out) throws Exception {
                            Tuple2<String, Integer> next = elements.iterator().next();
                            out.collect(next.f1);
                        }
                    }).print();
    
    
            env.execute();
        }
}
AggregateFunction with ProcessWindowFunction
public class Test{
    public static void AggregateFunctionAndProcessWindowFunction() throws Exception {
            // Create the Flink streaming execution environment
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    
            // Add an input stream that listens on local port 9999; install nc locally and run `nc -lk 9999` in a console to feed it
            DataStreamSource<String> source = env.socketTextStream("localhost", 9999);
            SingleOutputStreamOperator<Tuple2<String, Integer>> map = source.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public void flatMap(String value, Collector<Tuple2<String, Integer>> out) throws Exception {
                    out.collect(Tuple2.of(value.split(",")[0], Integer.valueOf(value.split(",")[1])));
                }
            });
    
    
            // Keyed window: average of the values that share the same key
            KeyedStream<Tuple2<String, Integer>, Tuple> keyBy = map.keyBy(0);
            keyBy.window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                    .aggregate(new AggregateFunction<Tuple2<String, Integer>, Tuple3<String, Integer, Integer>, Tuple2<String, Integer>>() {
                        @Override
                        public Tuple3<String, Integer, Integer> createAccumulator() {
                            return Tuple3.of("", 0, 0);
                        }
    
                        @Override
                        public Tuple3<String, Integer, Integer> add(Tuple2<String, Integer> value, Tuple3<String, Integer, Integer> accumulator) {
                            // f1 counts elements, f2 sums values (f1++ would pass the old count and leave it at 0)
                            return Tuple3.of(value.f0, accumulator.f1 + 1, accumulator.f2 + value.f1);
                        }
    
                        @Override
                        public Tuple2<String, Integer> getResult(Tuple3<String, Integer, Integer> accumulator) {
                            return Tuple2.of(accumulator.f0, accumulator.f2 / accumulator.f1);
                        }
    
                        @Override
                        public Tuple3<String, Integer, Integer> merge(Tuple3<String, Integer, Integer> a, Tuple3<String, Integer, Integer> b) {
                            return Tuple3.of(a.f0, a.f1 + b.f1, a.f2 + b.f2);
                        }
                    }, new ProcessWindowFunction<Tuple2<String,Integer>, Tuple2<String,Integer>, Tuple, TimeWindow>() {
                        // The AggregateFunction alone already does the job; the ProcessWindowFunction is here only to practice pairing the two
                        @Override
                        public void process(Tuple tuple, Context context, Iterable<Tuple2<String, Integer>> elements, Collector<Tuple2<String, Integer>> out) throws Exception {
                            Tuple2<String, Integer> next = elements.iterator().next();
                            out.collect(next);
                        }
                    }).print();
    
    
    
            env.execute();
        }
}
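A practical reason to pair the two kinds of function is that the final step can attach window metadata to a result that was computed incrementally. The plain-Java sketch below imitates that division of labour. The names are hypothetical, the window bounds are passed in by hand, and the value array is assumed to be non-empty.

```java
final class ReduceThenProcessSketch {
    /** Incremental step: fold one more value into the running sum. */
    static int reduce(int runningSum, int value) {
        return runningSum + value;
    }

    /** Final step: sees only the pre-aggregated value, plus the key and window bounds. */
    static String process(String key, long windowStart, long windowEnd, int reduced) {
        return key + " [" + windowStart + ", " + windowEnd + ") sum=" + reduced;
    }

    /** Runs both steps for one key and one window. */
    static String run(String key, long windowStart, long windowEnd, int[] values) {
        int acc = values[0];
        for (int i = 1; i < values.length; i++) {
            acc = reduce(acc, values[i]);
        }
        return process(key, windowStart, windowEnd, acc);
    }

    public static void main(String[] args) {
        // Prints a [0, 10000) sum=6
        System.out.println(run("a", 0L, 10_000L, new int[] {1, 2, 3}));
    }
}
```

The window state stays as small as with a plain reduce, yet the output still carries the window it belongs to.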

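Event time and watermarks

When a window uses event time (stream.window(XXXXEventTimeWindows.of())), timestamps and watermarks must be assigned to the stream that feeds the window. With processing time, assigning watermarks is optional; in practice watermarks are only set for event time.

Why does event time need watermarks?

  • Flink has to be told which field of the record is the event-time field.
  • They delay the firing of a window. If a record's event time belongs to a previous window that has already fired, the record is late, so how should it be handled? Delaying the trigger by a configured amount lets late records still be computed in their window.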
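The delay described in the second point can be sketched in plain Java. The class below is a hypothetical model, not Flink code: it assumes the watermark is the largest event time seen so far minus a fixed bound, and that a window [start, end) may fire once the watermark reaches end - 1. The exact off-by-one details differ between Flink versions, so treat the boundary as an assumption.

```java
final class WatermarkSketch {
    private final long maxOutOfOrdernessMs;
    private long maxTimestampSeen = Long.MIN_VALUE;

    WatermarkSketch(long maxOutOfOrdernessMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
    }

    /** Records an event; out-of-order events never move the maximum backwards. */
    void onEvent(long eventTimeMs) {
        maxTimestampSeen = Math.max(maxTimestampSeen, eventTimeMs);
    }

    /** Watermark = largest event time seen so far minus the allowed out-of-orderness. */
    long currentWatermark() {
        if (maxTimestampSeen == Long.MIN_VALUE) {
            return Long.MIN_VALUE; // nothing seen yet
        }
        return maxTimestampSeen - maxOutOfOrdernessMs;
    }

    /** A window [start, end) may fire once the watermark reaches its last timestamp. */
    boolean canFire(long windowEndMs) {
        return currentWatermark() >= windowEndMs - 1;
    }

    public static void main(String[] args) {
        WatermarkSketch w = new WatermarkSketch(5_000L);
        w.onEvent(12_000L);
        System.out.println(w.canFire(10_000L)); // false: watermark is 7000
        w.onEvent(9_000L);                      // late for wall-clock order, still lands in [0, 10000)
        w.onEvent(15_000L);
        System.out.println(w.canFire(10_000L)); // true: watermark is 10000
    }
}
```

The record with event time 9000 arrives after the one with 12000, yet the window [0, 10000) has not fired at that point, so it is still counted.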

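Watermark types

  1. Periodic watermarks (With Periodic Watermarks): watermarks are generated at regular intervals; BoundedOutOfOrdernessTimestampExtractor is the commonly used implementation.
  2. Punctuated watermarks (With Punctuated Watermarks): I have not figured these out yet.

Periodic watermarks

The code is as follows: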
public class Test{
    public static void assignTimestampsAndWatermarks() throws Exception {
            // Create the Flink streaming execution environment
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    
            // Add an input stream that listens on local port 9999; install nc locally and run `nc -lk 9999` in a console to feed it
            DataStreamSource<String> source = env.socketTextStream("localhost", 9999);
            SingleOutputStreamOperator<Tuple2<String, Integer>> map = source.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public void flatMap(String value, Collector<Tuple2<String, Integer>> out) throws Exception {
                    out.collect(Tuple2.of(value.split(",")[0], Integer.valueOf(value.split(",")[1])));
                }
            })
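            // Event-time windows need timestamps and watermarks on the stream.
            // BoundedOutOfOrdernessTimestampExtractor is the usual choice; its constructor argument is the maximum out-of-orderness,
            // here 5 seconds of allowed delay. With 10-second windows, a window fires once event time reaches about 15s, 25s, 35s instead of 10s, 20s, 30s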
            .assignTimestampsAndWatermarks(new BoundedOutOfOrdernessTimestampExtractor<Tuple2<String, Integer>>(Time.seconds(5)) {
                @Override
                // Tells Flink which field of the record carries the event time (in milliseconds)
                public long extractTimestamp(Tuple2<String, Integer> element) {
                    return element.f1;
                }
            });
    
            // Keyed window: average of the values that share the same key
            KeyedStream<Tuple2<String, Integer>, Tuple> keyBy = map.keyBy(0);
            keyBy.window(TumblingEventTimeWindows.of(Time.seconds(10)))
                    .aggregate(new AggregateFunction<Tuple2<String,Integer>, Tuple2<Integer,Integer>, Double>() {
                        @Override
                        public Tuple2<Integer, Integer> createAccumulator() {
                            return Tuple2.of(0, 0);
                        }
    
                        @Override
                        public Tuple2<Integer, Integer> add(Tuple2<String, Integer> value, Tuple2<Integer, Integer> accumulator) {
                            // f0 counts elements, f1 sums values (f0++ would pass the old count and leave it at 0)
                            return Tuple2.of(accumulator.f0 + 1, accumulator.f1 + value.f1);
                        }
    
                        @Override
                        public Double getResult(Tuple2<Integer, Integer> accumulator) {
                            return ((double) accumulator.f1)/accumulator.f0;
                        }
    
                        @Override
                        public Tuple2<Integer, Integer> merge(Tuple2<Integer, Integer> a, Tuple2<Integer, Integer> b) {
                            return new Tuple2<>(a.f0 + b.f0, a.f1 + b.f1);
                        }
                    }).print();
            env.execute();
        }
}
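For contrast with the periodic approach, here is how I currently picture a punctuated watermark: instead of emitting on a timer, the assigner inspects every element and emits a watermark only when the element itself signals one. The sketch is plain Java under that assumption. The Event type and its isMarker flag are hypothetical, and the method name merely echoes Flink's punctuated assigner without implementing it.

```java
final class PunctuatedWatermarkSketch {
    /** Hypothetical event: a timestamp plus a flag marking it as a watermark carrier. */
    static final class Event {
        final long timestampMs;
        final boolean isMarker;

        Event(long timestampMs, boolean isMarker) {
            this.timestampMs = timestampMs;
            this.isMarker = isMarker;
        }
    }

    /** Returns the new watermark if this event carries one, or null to emit nothing. */
    static Long checkAndGetNextWatermark(Event event) {
        return event.isMarker ? Long.valueOf(event.timestampMs) : null;
    }

    public static void main(String[] args) {
        System.out.println(checkAndGetNextWatermark(new Event(5_000L, false))); // prints null
        System.out.println(checkAndGetNextWatermark(new Event(7_000L, true)));  // prints 7000
    }
}
```

Whether a watermark advances therefore depends on the data, not on elapsed time.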

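Window triggers

A trigger decides when a window runs its window function. Every window assigner has a default trigger; for example, event-time window assigners default to EventTimeTrigger. Calling trigger() replaces that default, so a window has exactly one trigger at a time.
Custom trigger:

  • Extend the Trigger class and implement its abstract methods
  • Set the trigger on the window

Example code: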

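import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

public class Test {
    public static void trigger() throws Exception {
        // Create the Flink streaming execution environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Add an input stream that listens on local port 9999; install nc locally and run `nc -lk 9999` in a console to feed it
        DataStreamSource<String> source = env.socketTextStream("localhost", 9999);
        SingleOutputStreamOperator<Tuple2<String, Integer>> map = source.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public void flatMap(String value, Collector<Tuple2<String, Integer>> out) throws Exception {
                out.collect(Tuple2.of(value.split(",")[0], Integer.valueOf(value.split(",")[1])));
            }
        });

        // NOTE: event-time windows only fire if timestamps and watermarks
        // are assigned to the stream; that step is omitted here for brevity
        KeyedStream<Tuple2<String, Integer>, Tuple> keyBy = map.keyBy(0);
        keyBy.window(TumblingEventTimeWindows.of(Time.seconds(10)))
                // Set the trigger (replaces the assigner's default trigger)
                .trigger(new MyTrigger())
                .reduce(new ReduceFunction<Tuple2<String, Integer>>() {
                    @Override
                    public Tuple2<String, Integer> reduce(Tuple2<String, Integer> value1, Tuple2<String, Integer> value2) throws Exception {
                        return Tuple2.of(value1.f0, value1.f1 + value2.f1);
                    }
                })
                .print();

        env.execute();
    }

    /**
     * Custom trigger, a simplified version of what EventTimeTrigger does.
     * TriggerResult.CONTINUE: do nothing
     * TriggerResult.FIRE: trigger the computation
     * TriggerResult.PURGE: discard the elements in the window
     * TriggerResult.FIRE_AND_PURGE: trigger the computation, then discard the elements in the window
     */
    private static class MyTrigger extends Trigger<Object, TimeWindow> {

        // Called for every element added to the window
        @Override
        public TriggerResult onElement(Object element, long timestamp, TimeWindow window, TriggerContext ctx) throws Exception {
            // Ask for a callback once the watermark passes the end of the window
            ctx.registerEventTimeTimer(window.maxTimestamp());
            return TriggerResult.CONTINUE;
        }

        // Called when a registered processing-time timer fires (none is registered here)
        @Override
        public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) throws Exception {
            return TriggerResult.CONTINUE;
        }

        // Called when a registered event-time timer fires
        @Override
        public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) throws Exception {
            return time == window.maxTimestamp() ? TriggerResult.FIRE : TriggerResult.CONTINUE;
        }

        // Called when the window is removed; release what the trigger registered
        @Override
        public void clear(TimeWindow window, TriggerContext ctx) throws Exception {
            ctx.deleteEventTimeTimer(window.maxTimestamp());
        }

        // Called when windows are merged; the base implementation does not support merging, which is fine for tumbling windows
        @Override
        public void onMerge(TimeWindow window, OnMergeContext ctx) throws Exception {
            super.onMerge(window, ctx);
        }
    }
}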
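To see what the four TriggerResult values do to a window, here is a small plain-Java simulation. It defines its own Result enum and a hypothetical count-based rule (fire and purge once maxCount elements are buffered); it imitates the semantics listed above and is not the Flink Trigger API.

```java
import java.util.ArrayList;
import java.util.List;

final class TriggerSketch {
    enum Result { CONTINUE, FIRE, PURGE, FIRE_AND_PURGE }

    private final List<Integer> window = new ArrayList<>();
    private final List<Integer> emitted = new ArrayList<>();
    private final int maxCount;

    TriggerSketch(int maxCount) {
        this.maxCount = maxCount;
    }

    /** The trigger decision: fire and purge once maxCount elements are buffered. */
    Result onElement() {
        return window.size() >= maxCount ? Result.FIRE_AND_PURGE : Result.CONTINUE;
    }

    /** Adds an element to the window, then applies whatever the trigger decided. */
    void add(int value) {
        window.add(value);
        Result result = onElement();
        if (result == Result.FIRE || result == Result.FIRE_AND_PURGE) {
            int sum = 0;
            for (int v : window) {
                sum += v;
            }
            emitted.add(sum); // the window function runs (here: a sum)
        }
        if (result == Result.PURGE || result == Result.FIRE_AND_PURGE) {
            window.clear();   // the window content is discarded
        }
    }

    List<Integer> emitted() { return emitted; }
    int buffered() { return window.size(); }

    public static void main(String[] args) {
        TriggerSketch t = new TriggerSketch(2);
        t.add(1);
        t.add(2);
        t.add(3);
        System.out.println(t.emitted() + ", still buffered: " + t.buffered());
    }
}
```

After the elements 1, 2, 3 with maxCount 2, one result (3) has been emitted and one element is still waiting in the window.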