I. Required commands and code
Linux commands
[root@localhost ~]# nc -lk 7777
-bash: nc: command not found
[root@localhost ~]# yum install -y nc
[root@localhost ~]# nc -lk 7777
hello world nihao are you
Code
package com.robert.flink.wcTest;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FlinStreamSocketWordCountPrintTest {
    public static void main(String[] args) throws Exception {
        // Get the streaming runtime environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Read lines of text from the socket that nc is listening on
        DataStream<String> stringDataSet = env.socketTextStream("192.168.56.10", 7777);
        // Split each line into words, emit (word, 1), group by word, and keep a running sum
        DataStream<Tuple2<String, Integer>> sum = stringDataSet
                .flatMap((FlatMapFunction<String, Tuple2<String, Integer>>) (value, collector) -> {
                    String[] words = value.split(" ");
                    for (String word : words) {
                        collector.collect(new Tuple2<>(word, 1));
                    }
                })
                // The lambda erases generic types, so declare the output type explicitly
                .returns(Types.TUPLE(Types.STRING, Types.INT))
                // Key by the word (field f0); replaces the deprecated index-based keyBy(0)
                .keyBy(value -> value.f0)
                .sum(1);
        // Print the running counts to stdout
        sum.print();
        // Nothing runs until execute() is called
        env.execute();
    }
}
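One thing to watch in the job above: value.split(" ") returns an empty string for every extra space, so an input such as "hello  world" (two spaces) would also count an empty "word". The following is a minimal sketch of a more forgiving tokenizer that could be called inside the flatMap lambda. The class name TokenizerSketch and the method tokenize are hypothetical names used for illustration only and are not part of Flink or of the original code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the original job): splits a line on any
// run of whitespace and drops empty tokens.
final class TokenizerSketch {

    static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        // trim() removes leading/trailing whitespace; "\\s+" treats
        // consecutive spaces or tabs as a single separator.
        for (String word : line.trim().split("\\s+")) {
            // An empty or all-blank line still yields one empty token; skip it.
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("hello  world "));
    }
}
```

Inside the job, the loop over value.split(" ") could then iterate over TokenizerSketch.tokenize(value) instead, assuming the helper is on the job's classpath.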
II. Procedure and result
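Procedure:
1. First run nc -lk 7777 from the Linux commands above. If nc is not installed, install it with yum.
2. Then run the Java code. The Flink job connects to the socket that nc is listening on and reads it as a stream, which is why nc has to be started first.
3. Then type the input data in the nc terminal.
Result: enter words separated by spaces. Each time you press Enter, the job prints one record per word in that line, grouped by word and showing that word's running count, as follows.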
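As a rough illustration of the behavior described above, the running count can be simulated in plain Java without Flink or a socket. This is a sketch under stated assumptions, not Flink's implementation: the class RunningWordCountSketch and its process method are hypothetical names, a HashMap stands in for Flink's keyed state, and the second input line is made up for the example. The real job's print() output may also carry a subtask prefix such as "3> " when the parallelism is greater than 1, and records for different words may then appear in a different relative order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for flatMap -> keyBy(word) -> sum(1), for illustration only.
final class RunningWordCountSketch {

    // Emits one "(word,count)" record per input word, where count is the
    // number of times that word has been seen so far.
    static List<String> process(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>(); // stands in for keyed state
        List<String> emitted = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                // merge() adds 1 to the existing count, or starts at 1
                int updated = counts.merge(word, 1, Integer::sum);
                emitted.add("(" + word + "," + updated + ")");
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        // First line is the sample input from the nc session; second is invented.
        List<String> lines = Arrays.asList("hello world nihao are you", "hello flink hello");
        process(lines).forEach(System.out::println);
    }
}
```

For the two lines in main, the sketch emits (hello,1), (world,1), (nihao,1), (are,1), (you,1) for the first line and (hello,2), (flink,1), (hello,3) for the second: a record is produced for every incoming word, not one final total per word.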