
flink
Flink-related
zjc4j
Working stiff
About the OutputTag error in Flink
Error: Exception in thread "main" org.apache.flink.api.common.functions.InvalidTypesException: Could not determine TypeInformation for the OutputTag type. The most common reason is forgetting to make the OutputTag an anonymous inner class. It is also not possi…
Original · 2022-01-10 14:22:27 · 1633 views · 0 comments
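The error message above points at a missing anonymous inner class, that is, writing `new OutputTag<String>("late")` where `new OutputTag<String>("late") {}` was needed. The following is a minimal JDK-only sketch of why the trailing braces matter. `TypeTag` is a hypothetical stand-in, not Flink's `OutputTag`, and the tag id `"late"` is made up for illustration. It assumes the element type is recovered by reflecting on the generic superclass, which is only available when an anonymous subclass exists.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

final class TypeTagDemo {
    /** Hypothetical stand-in for a generic tag class; this is not Flink's OutputTag. */
    static class TypeTag<T> {
        final String id;

        TypeTag(String id) {
            this.id = id;
        }

        /** Returns the name of the captured type argument, or null if it was erased. */
        String capturedType() {
            Type superType = getClass().getGenericSuperclass();
            if (superType instanceof ParameterizedType) {
                // Only an (anonymous) subclass records TypeTag<String> as its superclass.
                Type arg = ((ParameterizedType) superType).getActualTypeArguments()[0];
                return arg.getTypeName();
            }
            // Plain instance: the superclass is Object, so the type argument is gone.
            return null;
        }
    }

    public static void main(String[] args) {
        TypeTag<String> plain = new TypeTag<String>("late");
        TypeTag<String> anonymous = new TypeTag<String>("late") {};
        System.out.println("plain:     " + plain.capturedType());
        System.out.println("anonymous: " + anonymous.capturedType());
    }
}
```

The plain instance has no way to report its element type at runtime, which is the kind of situation in which a type-extraction step would have to give up with an exception.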
Hello world with the Flink Table API and SQL
Code: import com.zjc.bean.SensorReading; import org.apache.flink.streaming.api.datastream.DataStream; import org.apache.flink.streaming.api.datastream.DataStreamSource; import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; import org.a…
Original · 2021-12-02 22:41:29 · 428 views · 0 comments
Flink: project
Purpose: mainly used to extract the values of specified fields from a tuple. Note that it works only on tuples, not on other data types. Code: public static void main(String[] args) throws Exception { StreamExecutionEnvironment executionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment(); executionEnvironment.setParall…
Original · 2021-11-27 20:42:01 · 1262 views · 0 comments
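To make the idea of projection concrete, here is a small JDK-only sketch. It models a tuple as an `Object[]` and uses a hypothetical `project` helper. The sample values are invented for illustration and this is not the article's code. In Flink itself the corresponding call is `project(int... fieldIndexes)` on a tuple-typed stream.

```java
import java.util.Arrays;

final class ProjectSketch {
    /** Hypothetical helper: keeps only the fields at the given indexes, in the given order. */
    static Object[] project(Object[] tuple, int... fieldIndexes) {
        Object[] out = new Object[fieldIndexes.length];
        for (int i = 0; i < fieldIndexes.length; i++) {
            out[i] = tuple[fieldIndexes[i]];
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up three-field tuple: (sensor id, timestamp, temperature).
        Object[] reading = {"sensor_1", 1547718199L, 35.8};
        // Keep field 0 and field 2, dropping the timestamp.
        System.out.println(Arrays.toString(project(reading, 0, 2)));
    }
}
```

Because the fields are addressed by position, the operation only makes sense on positional types such as tuples, which matches the restriction noted in the entry.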
max vs. maxBy: the difference between Flink's rolling aggregation operators
Code: public class Test2_RollingAggregation { public static void main(String[] args) throws Exception { StreamExecutionEnvironment executionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment(); executionEnvironment.setParalle…
Original · 2021-11-27 16:58:06 · 553 views · 0 comments
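The excerpt is cut off before it reaches the comparison, so the following is only a JDK-only simulation of the commonly described difference, for a single key. It assumes `max` updates just the aggregated field and keeps the remaining fields from the first record, while `maxBy` emits the whole record that holds the maximum. The `Reading` class and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MaxVsMaxBy {
    /** Hypothetical sensor record: id, timestamp, temperature. */
    static final class Reading {
        final String id;
        final long ts;
        final double temp;

        Reading(String id, long ts, double temp) {
            this.id = id;
            this.ts = ts;
            this.temp = temp;
        }

        @Override
        public String toString() {
            return id + "," + ts + "," + temp;
        }
    }

    /** max-style: only temp is updated; id and ts stay from the first record. */
    static List<Reading> rollingMax(List<Reading> in) {
        List<Reading> out = new ArrayList<>();
        Reading acc = null;
        for (Reading r : in) {
            acc = (acc == null) ? r : new Reading(acc.id, acc.ts, Math.max(acc.temp, r.temp));
            out.add(acc);
        }
        return out;
    }

    /** maxBy-style: the whole record with the larger temp is kept (first one wins on ties). */
    static List<Reading> rollingMaxBy(List<Reading> in) {
        List<Reading> out = new ArrayList<>();
        Reading acc = null;
        for (Reading r : in) {
            acc = (acc == null || r.temp > acc.temp) ? r : acc;
            out.add(acc);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Reading> in = Arrays.asList(
                new Reading("s1", 1L, 35.0),
                new Reading("s1", 2L, 37.0),
                new Reading("s1", 3L, 36.0));
        System.out.println("max:   " + rollingMax(in));
        System.out.println("maxBy: " + rollingMaxBy(in));
    }
}
```

In this simulation both variants report the same maximum temperature, but only the maxBy-style result carries the timestamp of the record that actually produced it.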
A small custom data source example in Flink (Java version)
Example: output the temperature readings of 10 sensors every second. Code: import com.zjc4j.bean.SensorReading; import org.apache.flink.streaming.api.datastream.DataStream; import org.apache.flink.streaming.api.datastream.DataStreamSource; import org.apache.flink.streaming.api.environment.StreamExecutionE…
Original · 2021-11-27 15:32:20 · 1484 views · 0 comments
Flink WordCount: batch version (Java)
Code: import org.apache.flink.api.common.functions.FlatMapFunction; import org.apache.flink.api.java.DataSet; import org.apache.flink.api.java.ExecutionEnvironment; import org.apache.flink.api.java.operators.AggregateOperator; import org.apache.flink.api.jav…
Original · 2021-11-24 19:38:54 · 1157 views · 0 comments
Real-time hourly page view (PV) statistics for a website
Code (unoptimized): package com.zjc.flow_analysis import org.apache.flink.api.common.functions.AggregateFunction import org.apache.flink.streaming.api.TimeCharacteristic import org.apache.flink.streaming.api.scala._ import org.apache.flink.streaming.api.scala.functio…
Original · 2021-09-16 21:06:29 · 253 views · 0 comments
Every 5 seconds, output the top N most-visited URLs of the last 10 minutes (handles late data with three layers of protection; the analysis is the key part)
Code: package com.zjc.flow_analysis import org.apache.flink.api.common.functions.AggregateFunction import org.apache.flink.api.common.state.{ListStateDescriptor, MapState, MapStateDescriptor} import org.apache.flink.api.java.tuple.Tuple import org.apache.fl…
Original · 2021-09-16 20:40:59 · 405 views · 0 comments
Every 5 minutes, output the top N most-clicked items of the last hour (SQL version)
Code: package com.zjc.flow_analysis.hotitems_analysis import org.apache.flink.api.common.serialization.SimpleStringSchema import org.apache.flink.streaming.api.TimeCharacteristic import org.apache.flink.streaming.api.scala._ import org.apache.flink.table.api…
Original · 2021-09-15 20:45:51 · 582 views · 0 comments
Every 5 minutes, output the top N most-clicked items of the last hour (Table API + SQL version)
Code: package com.zjc.flow_analysis.hotitems_analysis import org.apache.flink.api.common.serialization.SimpleStringSchema import org.apache.flink.streaming.api.TimeCharacteristic import org.apache.flink.streaming.api.scala._ import org.apache.flink.table.api…
Original · 2021-09-15 20:06:12 · 357 views · 0 comments
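Every 5 minutes, output the top N most-clicked items of the last hour (Flink + Kafka)
Requirement: every 5 minutes, output the top N most-clicked items of the last hour. Sample data, with fields (user ID, item ID, category ID, behavior, timestamp): 543462,1715,1464116,pv,1511658000 Implementation: the technologies used are Flink, Kafka, and ZooKeeper. HotItems.scala implements the business logic. package com.zjc.hotitems_analysis import org.apache.flink.api.common.functions.AggregateFunction impor…
Original · 2021-08-09 16:47:09 · 1562 views · 1 comment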
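The requirement above amounts to a sliding window of 1 hour that advances every 5 minutes, so each click falls into 60 / 5 = 12 overlapping windows. The JDK-only sketch below works through that arithmetic for the sample record. It is not the article's code. It assumes the timestamp is in seconds since the epoch, that it is non-negative, and that windows are aligned to the epoch with no offset. The helper name `windowStarts` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SlidingWindowSketch {
    /** Hypothetical helper: start times (ms) of every sliding window that contains the event. */
    static List<Long> windowStarts(long timestampMs, long sizeMs, long slideMs) {
        List<Long> starts = new ArrayList<>();
        // Latest window start at or before the event, aligned to the slide interval.
        long lastStart = timestampMs - (timestampMs % slideMs);
        // Walk backwards while the window [start, start + size) still covers the event.
        for (long start = lastStart; start > timestampMs - sizeMs; start -= slideMs) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // Sample record from the entry: user ID, item ID, category ID, behavior, timestamp.
        String line = "543462,1715,1464116,pv,1511658000";
        String[] fields = line.split(",");
        long timestampMs = Long.parseLong(fields[4]) * 1000L; // assumed to be in seconds

        long sizeMs = 60 * 60 * 1000L;  // 1 hour
        long slideMs = 5 * 60 * 1000L;  // 5 minutes
        List<Long> starts = windowStarts(timestampMs, sizeMs, slideMs);

        System.out.println("item " + fields[1] + " is counted in " + starts.size() + " windows");
        System.out.println("latest window starts at " + starts.get(0));
    }
}
```

Counting clicks per item within each of these windows, then sorting the counts when a window closes, is one common way to arrive at the top N list the requirement asks for.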