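Requirements
A real-time stream needs to be joined with a dimension table to enrich each record with extra attributes.
Spark Streaming can join a stream with a Hive table.
I did not find an equivalent feature in Flink, so the dimension table is stored in ES instead.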
Maven dependencies
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<flink.version>1.6.2</flink.version>
<fastjson.version>1.2.47</fastjson.version>
<elasticsearch.version>6.3.0</elasticsearch.version>
<guava.version>25.1-jre</guava.version>
</properties>
...
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>${fastjson.version}</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>${elasticsearch.version}</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>${guava.version}</version>
</dependency>
Note: records are looked up in the ES table by key, and guava is used as a cache so that the same key is not fetched from ES over and over.
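The caching idea in the note above can be sketched roughly as follows. This is a minimal illustration, not the code used in this post: to stay dependency-free it swaps guava's cache for a plain `LinkedHashMap` with LRU eviction, and the class name `DimCacheSketch`, the `loader` function (standing in for the ES query) and the `loadCount` counter are all hypothetical names introduced only for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical look-aside cache placed in front of a slow dimension lookup.
// A JDK LinkedHashMap stands in for the guava cache used in the post.
class DimCacheSketch {
    private final Map<String, String> cache;
    // Assumption: this function represents the query against ES.
    private final Function<String, String> loader;
    // Counts how many times the backing store was actually queried.
    private int loads = 0;

    DimCacheSketch(final int maxSize, Function<String, String> loader) {
        this.loader = loader;
        // accessOrder=true keeps entries in least-recently-used order;
        // removeEldestEntry drops the oldest entry once maxSize is exceeded.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxSize;
            }
        };
    }

    // Return the cached value if present; otherwise load it once and cache it.
    synchronized String get(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }

    synchronized int loadCount() {
        return loads;
    }
}
```

For example, with `new DimCacheSketch(2, k -> "attrs-of-" + k)`, calling `get("u1")` twice queries the loader only once. A guava cache plays the same role and can additionally expire entries after a time limit, which matters when the dimension table changes.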
Create a new class AsyncEsDataRequest that extends RichAsyncFunction.
package com.tc.flink.demo.es;
import com.alibaba.fastjson.JSONObject;
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.RemovalListener;
import com.google.common.cache.RemovalNotification;
import com.tc.flink.util.CommonUtil;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client