Word count demo
Word count
Real-time counting: every 1 second, count how many times each word appeared in the last 2 seconds
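The "2-second window, evaluated every 1 second" behaviour can be sketched in plain Java without any Flink dependency. This is only an illustration of the sliding-window idea, not Flink's implementation. The class name `SlidingWindowSketch`, the explicit millisecond timestamps, and the sample lines are assumptions made for the example. The real job uses each record's arrival time and emits a window's result when that window closes, whereas this sketch computes all windows at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java sketch of sliding-window word counting (hypothetical helper, no Flink). */
final class SlidingWindowSketch {
    static final long SIZE_MS = 2000;   // window length: 2 seconds
    static final long SLIDE_MS = 1000;  // a new window starts every 1 second

    /** Start times of every window [start, start + SIZE_MS) that contains the timestamp. */
    static List<Long> windowStarts(long timestampMs) {
        List<Long> starts = new ArrayList<>();
        // Latest window start at or before the timestamp, aligned to the slide.
        long lastStart = timestampMs - Math.floorMod(timestampMs, SLIDE_MS);
        for (long start = lastStart; start > timestampMs - SIZE_MS; start -= SLIDE_MS) {
            starts.add(start);
        }
        return starts;
    }

    /** Counts words per window: window start -> (word -> count). */
    static Map<Long, Map<String, Integer>> count(long[] timestampsMs, String[] lines) {
        Map<Long, Map<String, Integer>> result = new TreeMap<>();
        for (int i = 0; i < lines.length; i++) {
            for (String word : lines[i].split(" ")) {
                // Each word is added to every window that overlaps its timestamp.
                for (long start : windowStarts(timestampsMs[i])) {
                    result.computeIfAbsent(start, k -> new TreeMap<>())
                          .merge(word, 1, Integer::sum);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        long[] times = {1500, 2200, 3700};                 // assumed arrival times in ms
        String[] lines = {"hello flink", "hello", "flink"}; // assumed input lines
        count(times, lines).forEach((start, counts) ->
            System.out.println("[" + start + ", " + (start + SIZE_MS) + ") " + counts));
    }
}
```

Because the window is twice as long as the slide, every record falls into exactly two windows. This is why a word typed once shows up in two consecutive results.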
Real-time counting - Scala
Code
// Import the Scala versions of the streaming API classes
import org.apache.flink.streaming.api.scala.{DataStream, StreamExecutionEnvironment}
import org.apache.flink.streaming.api.windowing.time.Time

object onLine {
  def main(args: Array[String]): Unit = {
    val environment: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
    // Read lines of text from the socket at node01:9999
    val socketDS: DataStream[String] = environment.socketTextStream("node01", 9999)
    // Required: brings the implicit conversions into scope
    import org.apache.flink.api.scala._
    val result: DataStream[(String, Int)] = socketDS
      .flatMap(x => x.split(" "))  // split each line into words
      .map(x => (x, 1))            // pair each word with a count of 1
      .keyBy(0)                    // group by tuple field 0 (the word)
      .timeWindow(Time.seconds(2), Time.seconds(1)) // 2s window, sliding every 1s
      .sum(1)                      // sum tuple field 1 (the counts)
    result.print()
    environment.execute("FlinkStream")
  }
}
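Input
Result
The number at the start of each output line is the number of the thread (parallel subtask) that printed it.
You can also package the job as a jar and run it on YARN: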
flink run -m yarn-cluster -yn 2 -yjm 1024 -ytm 1024 -c com.wordcount.onLine original-flink_study-1.0-SNAPSHOT.jar
Real-time counting - Java
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.streaming.api.datastream.*;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;