Flink Development Notes, Part 2
1. WindowFunction
2. Watermark
2.1. Watermarks in a Distributed Setting
2.2. Conversions Between DataStream and Other Streams
2.3. The Window Abstraction
2.4. Window Trigger
2.5. Window Evictor
2.6. Writing Watermarks in Version 1.11
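// Imports needed by this snippet (Flink 1.11+)
import java.time.Duration
import org.apache.flink.api.common.eventtime.{SerializableTimestampAssigner, WatermarkStrategy}
import org.apache.flink.streaming.api.scala._
val dataStream: DataStream[LoginEvent] = inputStream
  .map { data =>
    // Parse one comma-separated line into a LoginEvent
    val arr: Array[String] = data.split(",")
    LoginEvent(arr(0).toLong, arr(1), arr(2), arr(3).toLong)
  }
  .assignTimestampsAndWatermarks {
    WatermarkStrategy
      // Tolerate events that arrive up to 3 seconds out of order
      .forBoundedOutOfOrderness[LoginEvent](Duration.ofSeconds(3))
      .withTimestampAssigner(new SerializableTimestampAssigner[LoginEvent] {
        // The event timestamp is assumed to be in seconds; Flink expects milliseconds
        override def extractTimestamp(element: LoginEvent, recordTimestamp: Long): Long =
          element.timestamp * 1000L
      })
  }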
- Note: this style has some issues under Scala 2.11 and needs extra configuration
Reference link
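To make the effect of forBoundedOutOfOrderness more concrete, here is a minimal plain-Scala sketch of how such a generator can derive a watermark from the largest timestamp seen so far. It does not use the Flink API: the class BoundedOutOfOrdernessModel and its method names are hypothetical, and the formula (largest timestamp minus the allowed delay minus 1 ms) is stated as my understanding of Flink's built-in generator, not copied from these notes.

```scala
// Plain-Scala model of a bounded-out-of-orderness watermark generator.
// Illustration only: BoundedOutOfOrdernessModel is a hypothetical name, not a Flink class.
final class BoundedOutOfOrdernessModel(maxOutOfOrdernessMs: Long) {
  // Start high enough that the first watermark computation cannot underflow.
  private var maxTimestampMs: Long = Long.MinValue + maxOutOfOrdernessMs + 1

  // Called for every event: remember the largest timestamp seen so far.
  def onEvent(eventTimestampMs: Long): Unit =
    maxTimestampMs = math.max(maxTimestampMs, eventTimestampMs)

  // Called periodically: events at or before this time are treated as complete.
  def currentWatermark: Long = maxTimestampMs - maxOutOfOrdernessMs - 1
}

object WatermarkModelDemo {
  def main(args: Array[String]): Unit = {
    val generator = new BoundedOutOfOrdernessModel(maxOutOfOrdernessMs = 3000L)
    // Timestamps in seconds, converted to milliseconds as in extractTimestamp.
    val eventSeconds = List(10L, 12L, 11L, 15L)
    eventSeconds.foreach { s =>
      generator.onEvent(s * 1000L)
      println(s"event=${s}s watermark=${generator.currentWatermark}ms")
    }
  }
}
```

In this model the late event at 11 s does not move the watermark backwards, because only the maximum timestamp is tracked.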
3. State Management
3.1. Operator State
3.2. Keyed State
3.3. Programming with State
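The stateful flatMap used in this section takes a function of the shape (T, Option[S]) => (TraversableOnce[R], Option[S]): it receives the current element and the previous state for that key, and returns the outputs plus the next state. The sketch below models that contract in plain Scala for a single key, without the Flink API. The names TemperatureJumpModel, step and run are hypothetical, and the alert rule (absolute difference from the previous reading greater than the threshold) is an assumption about the requirement, not the author's confirmed implementation.

```scala
// Plain-Scala model of the flatMapWithState contract for one key.
// SensorReading mirrors the fields used in these notes; the alert rule is an assumption.
final case class SensorReading(id: String, timestamp: Long, temperature: Double)

object TemperatureJumpModel {
  // (sensor id, previous temperature, current temperature)
  type Alert = (String, Double, Double)

  // Same shape as the function passed to flatMapWithState:
  // (input, previous state) => (outputs, next state)
  def step(threshold: Double)(reading: SensorReading,
                              lastTemp: Option[Double]): (List[Alert], Option[Double]) =
    lastTemp match {
      // First reading for this key: nothing to compare with, only store the temperature.
      case None => (List.empty, Some(reading.temperature))
      // Later readings: alert on a large jump and always store the newest temperature.
      case Some(previous) =>
        val alerts: List[Alert] =
          if (math.abs(reading.temperature - previous) > threshold)
            List((reading.id, previous, reading.temperature))
          else List.empty
        (alerts, Some(reading.temperature))
    }

  // Apply step to the readings of one key in order, threading the state through.
  def run(threshold: Double, readings: List[SensorReading]): List[Alert] = {
    val (alerts, _) =
      readings.foldLeft((List.empty[Alert], Option.empty[Double])) {
        case ((acc, state), reading) =>
          val (out, next) = step(threshold)(reading, state)
          (acc ++ out, next)
      }
    alerts
  }

  def main(args: Array[String]): Unit = {
    val readings = List(
      SensorReading("sensor_1", 1L, 20.0),
      SensorReading("sensor_1", 2L, 25.0),
      SensorReading("sensor_1", 3L, 36.5)
    )
    run(10.0, readings).foreach(println)
  }
}
```

With a threshold of 10.0, only the jump from 25.0 to 36.5 yields an alert here. In Flink the state is kept per key by the runtime, which is why this style of function is only available after keyBy.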
val inputStream: DataStream[String] = env.socketTextStream("192.168.1.27", 9999)
val dataStream: DataStream[SensorReading] = inputStream.map(
  data => {
    val arr: Array[String] = data.split(",")
    SensorReading(arr(0), arr(1).toLong, arr(2).toDouble)
  }
)
// Requirement: raise an alert when a sensor's temperature jumps by more than 10 degrees
val warningStream: DataStream[(String, Double, Double)] = dataStream
  .keyBy(_.id)
  // Option 1: .flatMap(new TimeChangeAlert(10.0))
  // Option 2: a stateful function with output type R and state type S (only usable after keyBy); fun: (T, Option[S]) => (TraversableOnce[R], Option[S])
  .flatMapWithState[(String, Double, Double), Double] {
    case (data: SensorReading, None) => (List.empty, Some(data.temperature))
    case