1. Flink Installation and Deployment
Environment: CentOS 7 (firewall disabled)
jdk1.8.0_211
flink-1.9.0-bin-scala_2.12
nc tool (yum install nmap-ncat.x86_64)
1.1. Install the JDK
Download jdk-8u211-linux-x64.tar.gz
Extract: tar -zxvf jdk-8u211-linux-x64.tar.gz
Configure environment variables: vi /etc/profile
export JAVA_HOME=/home/jdk1.8.0_211
export CLASSPATH=.:${JAVA_HOME}/jre/lib/rt.jar:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar
export PATH=$PATH:${JAVA_HOME}/bin
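After editing /etc/profile, reload it with source /etc/profile and confirm the installation with java -version, which should report 1.8.0_211. As a quick programmatic cross-check, the running JDK can also report its own version; the class name below is just an illustration, not part of the project:

```java
public class JdkCheck {
    public static void main(String[] args) {
        // java.version is the version of the JDK actually running this code,
        // e.g. 1.8.0_211 if the profile settings above took effect
        System.out.println("java.version = " + System.getProperty("java.version"));
        // JAVA_HOME should match the export in /etc/profile
        System.out.println("JAVA_HOME    = " + System.getenv("JAVA_HOME"));
    }
}
```

Compile and run it with ${JAVA_HOME}/bin/javac JdkCheck.java followed by java JdkCheck.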
1.2. Install Flink
Download flink-1.9.0-bin-scala_2.12.tgz
Extract: tar -zxvf flink-1.9.0-bin-scala_2.12.tgz
Start Flink:
[root@flink bin]# ./start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host flink.
Starting taskexecutor daemon on host flink.
A log file named flink-root-standalonesession-0-<hostname>.log appears under /home/flink-1.9.0/log.
The web UI is now reachable at http://192.168.244.136:8081/#/overview
1.3. Basic Usage
First open a listener on port 9000; if nothing is listening when the job starts, its socket source fails to connect:
[root@flink flink-1.9.0]# nc -l 9000
Then, in a second terminal, submit the example job, which reads from port 9000:
[root@flink flink-1.9.0]# ./bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000
Back in the nc session, type some words:
hello world
hello flink
Refresh the web UI: a new job appears in the running jobs list. Since the example writes its results with print(), the counts show up in the TaskManager's .out file under the log directory.
2. Flink Development
2.1. Create a Maven project with packaging type jar, named SocketWordCount
2.2. Add the dependencies
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.geostar.geosmarter</groupId>
<artifactId>SocketWordCount</artifactId>
<version>1.0.0</version>
<packaging>jar</packaging>
<name>SocketWordCount</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<flink.version>1.9.0</flink.version>
<java.version>1.8</java.version>
<scala.binary.version>2.12</scala.binary.version>
<maven.compiler.source>${java.version}</maven.compiler.source>
<maven.compiler.target>${java.version}</maven.compiler.target>
</properties>
<dependencies>
<!-- Apache Flink dependencies -->
<!-- These dependencies are provided, because they should not be packaged into the JAR file. -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-java</artifactId>
<version>${flink.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-java_${scala.binary.version}</artifactId>
<version>${flink.version}</version>
<scope>provided</scope>
</dependency>
<!-- Add connector dependencies here. They must be in the default scope (compile). -->
<!-- <dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-kafka-0.10_${scala.binary.version}</artifactId>
<version>${flink.version}</version>
</dependency> -->
<!-- Add logging framework, to produce console output when running in the IDE. -->
<!-- These dependencies are excluded from the application JAR by default. -->
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>1.7.7</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.17</version>
<scope>runtime</scope>
</dependency>
</dependencies>
<build>
<plugins>
<!-- Java Compiler -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
</configuration>
</plugin>
<!-- We use the maven-shade plugin to create a fat jar that contains all necessary dependencies. -->
<!-- Change the value of <mainClass>...</mainClass> if your program entry point changes. -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.0.0</version>
<executions>
<!-- Run shade goal on package phase -->
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<artifactSet>
<excludes>
<exclude>org.apache.flink:force-shading</exclude>
<exclude>com.google.code.findbugs:jsr305</exclude>
<exclude>org.slf4j:*</exclude>
<exclude>log4j:*</exclude>
</excludes>
</artifactSet>
<filters>
<filter>
<!-- Do not copy the signatures in the META-INF folder.
Otherwise, this might cause SecurityExceptions when using the JAR. -->
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>com.geostar.geosmarter.socketwordcount.StreamingJob</mainClass>
</transformer>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
<pluginManagement>
<plugins>
<!-- This improves the out-of-the-box experience in Eclipse by resolving some warnings. -->
<plugin>
<groupId>org.eclipse.m2e</groupId>
<artifactId>lifecycle-mapping</artifactId>
<version>1.0.0</version>
<configuration>
<lifecycleMappingMetadata>
<pluginExecutions>
<pluginExecution>
<pluginExecutionFilter>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<versionRange>[3.0.0,)</versionRange>
<goals>
<goal>shade</goal>
</goals>
</pluginExecutionFilter>
<action>
<ignore/>
</action>
</pluginExecution>
<pluginExecution>
<pluginExecutionFilter>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<versionRange>[3.1,)</versionRange>
<goals>
<goal>testCompile</goal>
<goal>compile</goal>
</goals>
</pluginExecutionFilter>
<action>
<ignore/>
</action>
</pluginExecution>
</pluginExecutions>
</lifecycleMappingMetadata>
</configuration>
</plugin>
</plugins>
</pluginManagement>
</build>
</project>
2.3. Create StreamingJob.java
package com.geostar.geosmarter.socketwordcount;
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;
public class StreamingJob {

    public static void main(String[] args) throws Exception {
        // Set up the streaming execution environment
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // The source is port 9999 on the local machine, one record per line.
        // The hostname and port could also be passed in as main() arguments.
        DataStream<String> text = env.socketTextStream("localhost", 9999, "\n");

        // Transform the text stream: split each line into words and
        // emit a WordWithCount(word, 1) for every word found
        DataStream<WordWithCount> windowCounts = text.flatMap(new FlatMapFunction<String, WordWithCount>() {
            @Override
            public void flatMap(String s, Collector<WordWithCount> collector) throws Exception {
                for (String word : s.split("\\s")) {
                    collector.collect(new WordWithCount(word, 1L));
                }
            }
        })
        .keyBy("word") // key the stream by the "word" field
        .timeWindow(Time.seconds(5)) // tumbling time window of five seconds
        .reduce(new ReduceFunction<WordWithCount>() { // sum the counts per word
            @Override
            public WordWithCount reduce(WordWithCount a, WordWithCount b) throws Exception {
                return new WordWithCount(a.word, a.count + b.count);
            }
        });

        // Print the results with a single thread, rather than in parallel
        windowCounts.print().setParallelism(1);

        // Execute the program
        env.execute("Flink Streaming Java API Skeleton");
    }

    // POJO holding a word and its count. The public fields and the
    // no-argument constructor are required so that keyBy("word") can
    // address the field by name.
    public static class WordWithCount {

        public String word;
        public long count;

        public WordWithCount() {
        }

        public WordWithCount(String word, long count) {
            this.word = word;
            this.count = count;
        }

        @Override
        public String toString() {
            return word + " : " + count;
        }
    }
}
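The core transformation in StreamingJob — split each line on whitespace, then sum the counts per word — can be exercised in plain Java without a cluster. This is only a sketch of the flatMap/reduce logic for illustration; it does not reproduce Flink's five-second windowing:

```java
import java.util.HashMap;
import java.util.Map;

public class LocalWordCount {

    // Mirrors the flatMap + reduce logic of StreamingJob on a fixed batch of lines
    static Map<String, Long> count(String[] lines) {
        Map<String, Long> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.split("\\s")) { // same tokenizer as the job
                counts.merge(word, 1L, Long::sum);  // same merge as the reduce step
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = count(new String[] { "hello world", "hello flink" });
        System.out.println(counts);
    }
}
```

Feeding it the two lines used in section 1.3 yields a count of 2 for hello and 1 each for world and flink.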
2.4. Package the project into a jar (e.g. mvn clean package)
2.5. Back on the Flink host, start a listener first: nc -l 9999
2.6. Submit the jar through the web UI to create the job
2.7. Type some English words into the nc session, then finish the input.
2.8. Inspect the job. Because the results are written with print(), the counts appear in the TaskManager's .out file under the log directory.
Note: adapted from https://blog.youkuaiyun.com/boling_cavalry/article/details/85059168