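Asynchronous I/O is a Flink feature introduced to reduce the latency incurred when a job interacts frequently with external systems (a REST server, HBase, MySQL, and so on).
Official documentation: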
- https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/operators/asyncio.html
- https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65870673
- https://docs.google.com/document/d/1Lr9UYXEz6s6R_3PWg3bZQLF3upGaNEkc0rQCFSzaYDI/edit#
There are also many Chinese-language blog posts covering the topic:
- http://wuchong.me/blog/2017/05/17/flink-internals-async-io/
- https://blog.icocoro.me/2019/05/26/1905-apache-flinkv2-asyncio/
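This post only provides a code example.
First, a brief description of the business logic: there is a stream, scoreDataStream, whose elements are Score records. For each Score, the stu_id field is used to issue an HTTP request that fetches the corresponding Student record, and the two are then combined and emitted. The code uses httpasyncclient to implement the callback.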
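Before the Flink operator itself, the enrichment pattern described above can be sketched with nothing but the JDK. This is only an illustration under stated assumptions, not the code from this post: the fields of Score and Student other than stu_id are hypothetical, the Enriched holder stands in for Tuple2<Score, Student>, and fetchStudent is a made-up stand-in for the real HTTP call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class AsyncEnrichSketch {

    // Hypothetical record types; only stu_id comes from the description above.
    static final class Score {
        final int stuId;
        final double value;
        Score(int stuId, double value) { this.stuId = stuId; this.value = value; }
    }

    static final class Student {
        final int stuId;
        final String name;
        Student(int stuId, String name) { this.stuId = stuId; this.name = name; }
    }

    // Stand-in for Tuple2<Score, Student>.
    static final class Enriched {
        final Score score;
        final Student student;
        Enriched(Score score, Student student) { this.score = score; this.student = student; }
    }

    // Assumption: simulates the HTTP lookup by building a Student on another thread.
    static CompletableFuture<Student> fetchStudent(int stuId) {
        return CompletableFuture.supplyAsync(() -> new Student(stuId, "student-" + stuId));
    }

    // Combine each Score with its Student once the lookup completes.
    static CompletableFuture<Enriched> enrich(Score score) {
        return fetchStudent(score.stuId).thenApply(student -> new Enriched(score, student));
    }

    // Issue every lookup first, then collect results in input order.
    static List<Enriched> enrichAll(List<Score> scores) {
        List<CompletableFuture<Enriched>> pending = new ArrayList<>();
        for (Score s : scores) {
            pending.add(enrich(s)); // does not block; requests overlap
        }
        List<Enriched> out = new ArrayList<>();
        for (CompletableFuture<Enriched> f : pending) {
            out.add(f.join());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Score> scores = Arrays.asList(new Score(1, 90.5), new Score(2, 72.0));
        for (Enriched e : enrichAll(scores)) {
            System.out.println(e.score.stuId + " " + e.score.value + " " + e.student.name);
        }
    }
}
```

The point of the sketch is that all lookups are started before any result is awaited, so the waiting times overlap instead of adding up. The Flink operator applies the same idea, except that results are handed back through a callback rather than collected with join().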
- The AsyncHttpRequest operator, which extends RichAsyncFunction<IN, OUT>
import com.google.gson.Gson;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;
import org.apache.flink.util.Preconditions;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.concurrent.FutureCallback;
import org.apache.http.impl.nio.client.CloseableHttpAsyncClient;
import org.apache.http.impl.nio.client.HttpAsyncClients;
import org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager;
import org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor;
import org.apache.http.impl.nio.reactor.IOReactorConfig;
import org.apache.http.nio.reactor.ConnectingIOReactor;
import org.apache.http.util.EntityUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.io.IOException;
import java.util.Collections;
import java.util.concurrent.CancellationException;
/**
* An implementation of the 'AsyncFunction' that sends requests and sets the callback.
*/
class AsyncHttpRequest extends RichAsyncFunction<Score, Tuple2<Score,Student>> {
/** The asynchronous HTTP client that can issue concurrent requests with callbacks */
private transient CloseableHttpAsyncClient client;
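// Connection timeout in ms (time allowed for the TCP three-way handshake to establish the connection)
private int connectionTimeOut;
// Socket timeout in ms (time allowed for the HTTP request to return its result)
private int socketTimeOut;
// Timeout for obtaining a connection from the pool (unlimited by default; once the pool is exhausted, requests block here)
private int connectionRequestTimeOut = -1;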