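Flink provides two relational APIs that cover both stream and batch processing: the Table API and SQL. The Table API exposes relational operations such as select, filter, and join directly in code. Flink SQL implements standard SQL on top of Apache Calcite, so queries read like ordinary SQL, which makes it approachable for most developers.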
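To make these relational operations concrete, here is a minimal plain-Java sketch of what a filter followed by a select computes. It is deliberately not Flink code, so it runs without any dependency. The `Person` class, the sample rows, and the `adultNames` helper are hypothetical names chosen for illustration. In Flink the same logic would be written either as chained Table API calls or as an SQL statement along the lines of `SELECT name FROM people WHERE age > 18`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class RelationalSketch {
    // Hypothetical row type with two columns: name and age.
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Equivalent in spirit to: SELECT name FROM people WHERE age > 18
    static List<String> adultNames(List<Person> people) {
        return people.stream()
                .filter(p -> p.age > 18)   // filter: keep only matching rows
                .map(p -> p.name)          // select: project a single column
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("alice", 30),
                new Person("bob", 12),
                new Person("carol", 19));
        System.out.println(adultNames(people)); // prints [alice, carol]
    }
}
```

The point of the relational APIs is that this kind of logic is stated declaratively and Flink plans the execution, rather than the developer wiring the operators by hand.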
The Table API and SQL are bundled in the flink-table dependency. To use them, add the following dependencies.
Using Flink 1.7.2 as an example:
<!--java-->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table_2.11</artifactId>
<version>1.7.2</version>
</dependency>
<!--scala-->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-scala_2.11</artifactId>
<version>1.7.2</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-scala_2.11</artifactId>
<version>1.7.2</version>
</dependency>
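Basic usage of the Table API and SQL
1. First, create a TableEnvironment. A TableEnvironment can:
- Register tables in the internal catalog
- Register tables from an external catalog
- Execute SQL queries
- Register user-defined functions
- Convert a DataStream or DataSet into a Table
Streaming queries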
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tableEnv = TableEnvironment.getTableEnvironment(env);
Batch queries
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
BatchTableEnvironment tableEnv = TableEnvironment.getTableEnvironment(env);
2. Use the TableEnvironment to register tables. There are two kinds of table: input tables (TableSource) and output tables (TableSink).
TableSource
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tableEnv = TableEnvironment.getTableEnvironment(env);
// CsvTableSource: file path, field names, field types
TableSource csvSource = new CsvTableSource("path",
        new String[]{"name", "age"},
        new TypeInformation[]{Types.STRING, Types.INT});
// Register the TableSource under the name CsvTable
tableEnv.registerTableSource("CsvTable", csvSource);
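As a rough illustration of what the three constructor arguments describe, the plain-Java sketch below maps one comma-separated line onto the declared fields `name` (string) and `age` (int). It is not Flink code. `CsvSchemaSketch` and `parseLine` are hypothetical names, and the sketch assumes a comma delimiter with no quoting or escaping, which a real CSV reader would have to handle.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CsvSchemaSketch {
    // Field names and types mirroring the CsvTableSource declaration above.
    static final String[] FIELD_NAMES = {"name", "age"};
    static final Class<?>[] FIELD_TYPES = {String.class, Integer.class};

    // Converts one CSV line into a map from field name to typed value.
    static Map<String, Object> parseLine(String line) {
        String[] parts = line.split(",", -1);
        if (parts.length != FIELD_NAMES.length) {
            throw new IllegalArgumentException(
                    "Expected " + FIELD_NAMES.length + " fields: " + line);
        }
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < parts.length; i++) {
            String raw = parts[i].trim();
            // Only two types are supported in this sketch: Integer and String.
            Object value = FIELD_TYPES[i] == Integer.class
                    ? (Object) Integer.valueOf(raw)
                    : raw;
            row.put(FIELD_NAMES[i], value);
        }
        return row;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("alice,30")); // prints {name=alice, age=30}
    }
}
```

The field names and types must line up with the file's columns in order. A mismatch such as a non-numeric `age` fails at parse time here, just as a schema mismatch would surface when the source is read.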
TableSink
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tableEnv = TableEnvironment.getTableEnvironment(env);
// Write data to external storage through a TableSink
// Create a TableSink: output path and field delimiter (a comma here)
TableSink csvSink = new CsvTableSink("path", ",");
// Define the field names and types
String[] fieldNames = {"cid", "cname", "revsum"};
TypeInformation[] filedTypes = {Types.INT, Types.STRING, Types.INT};
// Register the TableSink
tableEnv.