Learning Flink (Java), Part 6: reduce aggregation

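This post covers the reduce aggregation in Apache Flink and how it resembles the reduce step of the MapReduce framework used by Hive. The sample code reads text data as a stream, converts each line into a SensrReading object, groups the stream by ID, and uses reduce to track the maximum temperature of each sensor. Because the earlier data set was out of order, the data set is adjusted to give a more intuitive result. The main point is how keyBy + max differs from reduce in its ability to aggregate multiple fields.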

The reduce aggregation is similar to the reduce step in the MapReduce framework used by Hive.

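My own understanding: the difference between keyBy + max and the reduce operator is that reduce can aggregate several fields at once, whereas the former aggregates only a single field.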
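To make that difference concrete, here is a minimal plain-Java sketch that does not use Flink at all. The `Reading` class is a hypothetical stand-in for SensrReading, and `singleFieldMax` only imitates what I assume a single-field max does (update the temperature, leave the other fields of the stored record untouched); it is not Flink's actual implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

// Plain-Java comparison of two combine rules; this is not Flink code.
class FieldAggregationSketch {

    // Hypothetical stand-in for the article's SensrReading POJO.
    static final class Reading {
        final String id;
        final long timestamp;
        final double temperature;

        Reading(String id, long timestamp, double temperature) {
            this.id = id;
            this.timestamp = timestamp;
            this.temperature = temperature;
        }

        @Override
        public String toString() {
            return id + "," + timestamp + "," + temperature;
        }
    }

    // Assumed single-field behaviour: only the temperature is updated,
    // id and timestamp stay as they were in the stored record.
    static Reading singleFieldMax(Reading state, Reading next) {
        return new Reading(state.id, state.timestamp,
                Math.max(state.temperature, next.temperature));
    }

    // Multi-field rule from the article: max temperature plus newest timestamp.
    static Reading multiFieldReduce(Reading state, Reading next) {
        return new Reading(state.id, next.timestamp,
                Math.max(state.temperature, next.temperature));
    }

    // Folds all readings of one sensor into the final aggregated record.
    static Reading fold(List<Reading> readings, BinaryOperator<Reading> rule) {
        Reading state = readings.get(0);
        for (Reading next : readings.subList(1, readings.size())) {
            state = rule.apply(state, next);
        }
        return state;
    }

    public static void main(String[] args) {
        // The sensor_1 rows of the article's data set.
        List<Reading> sensor1 = Arrays.asList(
                new Reading("sensor_1", 1547718200L, 34.8),
                new Reading("sensor_1", 1547718288L, 34.8),
                new Reading("sensor_1", 1547719200L, 37.8),
                new Reading("sensor_1", 1547719280L, 39.8),
                new Reading("sensor_1", 1547719999L, 55.8));
        System.out.println(fold(sensor1, FieldAggregationSketch::singleFieldMax));
        System.out.println(fold(sensor1, FieldAggregationSketch::multiFieldReduce));
    }
}
```

Under these assumptions both rules end with the same temperature, but only the multi-field rule carries the newest timestamp along with it.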

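reduce obtains the current maximum through a rolling computation: each incoming record is combined with the result accumulated so far for its key.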
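The rolling behaviour can be imitated without Flink. The sketch below is only an illustration under my own assumptions: `Reading` is a hypothetical stand-in for SensrReading, the per-key state is a plain `HashMap`, and the second input row (temperature 30.1) is made up to show what happens when a lower reading arrives.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Plain-Java imitation of a keyed rolling reduce; this is not Flink code.
class RollingReduceSketch {

    // Hypothetical stand-in for the article's SensrReading POJO.
    static final class Reading {
        final String id;
        final long timestamp;
        final double temperature;

        Reading(String id, long timestamp, double temperature) {
            this.id = id;
            this.timestamp = timestamp;
            this.temperature = temperature;
        }

        @Override
        public String toString() {
            return id + "," + timestamp + "," + temperature;
        }
    }

    // For every input record, combine it with the stored result for its key
    // and emit the new result, so the output has one record per input record.
    static List<Reading> rollingReduce(List<Reading> input, BinaryOperator<Reading> reducer) {
        Map<String, Reading> statePerKey = new HashMap<>();
        List<Reading> emitted = new ArrayList<>();
        for (Reading next : input) {
            Reading current = statePerKey.get(next.id);
            // The first record of a key has nothing to combine with and is emitted unchanged.
            Reading result = (current == null) ? next : reducer.apply(current, next);
            statePerKey.put(next.id, result);
            emitted.add(result);
        }
        return emitted;
    }

    // Same rule as the article: keep the max temperature, take the newest timestamp.
    static Reading maxTempLatestTime(Reading state, Reading next) {
        return new Reading(state.id, next.timestamp,
                Math.max(state.temperature, next.temperature));
    }

    public static void main(String[] args) {
        List<Reading> input = new ArrayList<>();
        input.add(new Reading("sensor_1", 1547718200L, 34.8));
        input.add(new Reading("sensor_1", 1547718288L, 30.1)); // made-up lower reading
        input.add(new Reading("sensor_6", 1547718201L, 35.5));
        input.add(new Reading("sensor_1", 1547719200L, 37.8));
        for (Reading r : rollingReduce(input, RollingReduceSketch::maxTempLatestTime)) {
            System.out.println(r);
        }
    }
}
```

In this sketch the made-up 30.1 reading is emitted as `sensor_1,1547718288,34.8`: the temperature stays at the running maximum while the timestamp moves forward.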

Because the previous data set had out-of-order timestamps, the results were not very intuitive, so the data set is adjusted here:

sensor_1,1547718200,34.8
sensor_1,1547718288,34.8
sensor_1,1547719200,37.8
sensor_1,1547719280,39.8
sensor_6,1547718201,35.5
sensor_7,1547718214,35.3
sensor_10,1547718234,15.8
sensor_1,1547719999,55.8

Test code:

package com.shihuo.apitest_transform;

import com.shihuo.com.shihuo.apitest_beans.SensrReading;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TransformTest3_Reduce {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);

        // Read the data from a file
        String inputPath = "/Users/wangyuhang/Desktop/FlinkTutorial/src/main/resources/sensor.txt";
        DataStream<String> stringDataStream = env.readTextFile(inputPath);

        // Convert each line into a SensrReading
        DataStream<SensrReading> dataStream = stringDataStream.map(new MapFunction<String, SensrReading>() {
            @Override
            public SensrReading map(String value) throws Exception {
                String[] fields = value.split(",");
                // valueOf replaces the deprecated new Long(...) / new Double(...) constructors
                return new SensrReading(fields[0],Long.valueOf(fields[1]),Double.valueOf(fields[2]));
            }
        });


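        // Group by sensor id
        KeyedStream<SensrReading, Tuple> keyedStream = dataStream.keyBy("id");
        // reduce aggregation: keep the maximum temperature and the latest timestamp
        DataStream<SensrReading> resultStream = keyedStream.reduce(new ReduceFunction<SensrReading>() {
            @Override
            public SensrReading reduce(SensrReading value1, SensrReading value2) throws Exception {
                // value1 is the result aggregated so far, value2 is the newly arrived record
                return new SensrReading(value1.getId(),value2.getTimestamp(),Math.max(value1.getTemperature(),value2.getTemperature()));
            }
        });

        // Equivalent lambda form:
//        DataStream<SensrReading> resultStream = keyedStream.reduce((curState,newData) -> {
//            return new SensrReading(curState.getId(),newData.getTimestamp(),Math.max(curState.getTemperature(),newData.getTemperature()));
//        });

        // Print the reduce result; printing keyedStream would only echo the input records
        resultStream.print();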
        env.execute();
    }
}

The output is as follows:

SensrReading{id='sensor_1', timestamp=1547718200, temperature=34.8}
SensrReading{id='sensor_1', timestamp=1547718288, temperature=34.8}
SensrReading{id='sensor_1', timestamp=1547719200, temperature=37.8}
SensrReading{id='sensor_1', timestamp=1547719280, temperature=39.8}
SensrReading{id='sensor_6', timestamp=1547718201, temperature=35.5}
SensrReading{id='sensor_7', timestamp=1547718214, temperature=35.3}
SensrReading{id='sensor_10', timestamp=1547718234, temperature=15.8}
SensrReading{id='sensor_1', timestamp=1547719999, temperature=55.8}

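The effect shows mainly in the first two records: the temperature stays at 34.8 while the timestamp advances to the newer value, 1547718288.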
