Flink-Connector-Redis User Guide (CSDN Blog)

flink-connector-redis: an asynchronous connector based on Lettuce that supports SQL join and sink, query caching, and debugging. Project repository: https://gitcode.com/gh_mirrors/fl/flink-connector-redis

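Project overview

Flink-Connector-Redis is an asynchronous Redis connector derived from bahir-flink. It replaces Jedis with Lettuce and switches synchronous reads and writes to asynchronous ones, which significantly improves performance. The project supports the Table/SQL API, provides select and dimension-table (lookup) join queries, and adds a lookup-join cache, whole-row storage, and rate limiting.

Main features:

Replaces Jedis with Lettuce and turns synchronous reads and writes into asynchronous ones for a significant performance gain
Adds the Table/SQL API with support for select and dimension-table join queries
Adds a lookup-join cache (incremental and full)
Supports whole-row storage for dimension-table joins that return multiple fields
Adds rate limiting for debugging Flink SQL online
Supports newer Flink versions (1.12, 1.13, 1.14+)
Unifies the expiration policy, among other improvements
Supports Flink CDC deletes and other RowKind.DELETE operations
Supports select queries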

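Installation and deployment

Referencing the artifact directly

The project depends on Lettuce (6.2.1) and netty-transport-native-epoll (4.1.82.Final). If the Flink environment already contains both packages, use flink-connector-redis-1.4.3.jar; otherwise use flink-connector-redis-1.4.3-jar-with-dependencies.jar.

Add the dependency to a Maven project: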

<dependency>
    <groupId>io.github.jeff-zou</groupId>
    <artifactId>flink-connector-redis</artifactId>
    <version>1.4.3</version>
</dependency>

Building it yourself

Build with Maven by running mvn package, then copy the generated jar into Flink's lib directory. No other setup is required.

Configuration parameters

Main parameters

Option	Default	Type	Description
connector	(none)	String	Must be set to `redis`
host	(none)	String	Redis IP address
port	6379	Integer	Redis port
password	null	String	Redis password; null if none is set
database	0	Integer	Database index; db0 is used by default
timeout	2000	Integer	Connection timeout in ms
cluster-nodes	(none)	String	Node addresses in cluster mode, formatted as ip:port,ip:port
command	(none)	String	Redis command type
redis-mode	(none)	String	Redis mode: single, cluster, or sentinel
lookup.cache.max-rows	-1	Integer	Lookup cache size (maximum rows)
lookup.cache.ttl	-1	Integer	Lookup cache expiry time in seconds
lookup.cache.load-all	false	Boolean	Whether to enable the full cache
max.retries	1	Integer	Number of retries when a write or query fails
value.data.structure	column	String	Value data structure: column or row
set.if.absent	false	Boolean	Write only when the key does not exist
io.pool.size	(none)	Integer	Size of Lettuce's I/O thread pool
event.pool.size	(none)	Integer	Size of Lettuce's event thread pool
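
Every option in the table above is passed as a quoted key/value pair in the WITH clause, so a DDL string can be assembled from a map of options. The following is a minimal sketch of that idea; `RedisDdlBuilder` and `createTable` are hypothetical helper names that are not part of the connector, and the host value is a placeholder.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class RedisDdlBuilder {
    /** Renders a CREATE TABLE statement from a column list and connector options. */
    public static String createTable(String name, String columns, Map<String, String> options) {
        StringJoiner with = new StringJoiner(", ", " with (", ")");
        for (Map.Entry<String, String> e : options.entrySet()) {
            // A single quote inside a SQL string literal is escaped by doubling it.
            with.add("'" + e.getKey() + "'='" + e.getValue().replace("'", "''") + "'");
        }
        return "create table " + name + "(" + columns + ")" + with;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order for readable output.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("connector", "redis");
        options.put("host", "127.0.0.1"); // placeholder address
        options.put("port", "6379");
        options.put("redis-mode", "single");
        options.put("command", "set");
        System.out.println(createTable("sink_redis", "username VARCHAR, passport VARCHAR", options));
    }
}
```

The resulting string can be handed to `executeSql` on a table environment in the same way as a handwritten DDL.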

Redis command mapping

command value	Write	Query	Lookup join	Delete
set	set	get	get	del
hset	hset	hget	hget	hdel
rpush	rpush	lrange
lpush	lpush	lrange
incrBy	incrBy	get	get	writes a relative value
hincrBy	hincrBy	hget	hget	writes a relative value
sadd	sadd	srandmember		srem
zadd	zadd	zscore	zscore	zrem
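
The mapping above can be read as a lookup table keyed by the `command` value. The sketch below transcribes it into a small Java class to show, for example, which commands can back a lookup join. `RedisCommandTable` and its members are hypothetical names, and the case-insensitive matching is an assumption based on the examples in this guide using both 'set' and 'SET'.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class RedisCommandTable {
    /** One row of the mapping table; null marks a cell the table leaves empty. */
    public static final class Ops {
        public final String write;
        public final String query;
        public final String lookupJoin;
        public final String delete;

        Ops(String write, String query, String lookupJoin, String delete) {
            this.write = write;
            this.query = query;
            this.lookupJoin = lookupJoin;
            this.delete = delete;
        }
    }

    private static final Map<String, Ops> ROWS = new HashMap<>();

    static {
        ROWS.put("set", new Ops("set", "get", "get", "del"));
        ROWS.put("hset", new Ops("hset", "hget", "hget", "hdel"));
        ROWS.put("rpush", new Ops("rpush", "lrange", null, null));
        ROWS.put("lpush", new Ops("lpush", "lrange", null, null));
        // For incrBy and hincrBy the table describes delete as writing a relative value,
        // so no separate delete command is recorded here.
        ROWS.put("incrby", new Ops("incrBy", "get", "get", null));
        ROWS.put("hincrby", new Ops("hincrBy", "hget", "hget", null));
        ROWS.put("sadd", new Ops("sadd", "srandmember", null, "srem"));
        ROWS.put("zadd", new Ops("zadd", "zscore", "zscore", "zrem"));
    }

    /** Returns the row for a command value, or null if the table has no such command. */
    public static Ops lookup(String command) {
        return ROWS.get(command.toLowerCase(Locale.ROOT));
    }

    /** True when the table lists a lookup-join operation for the command. */
    public static boolean supportsLookupJoin(String command) {
        Ops ops = lookup(command);
        return ops != null && ops.lookupJoin != null;
    }
}
```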

Value data structure

The WITH clauses in this section are abbreviated; connection options such as 'connector' and 'host' are omitted.

value.data.structure = column (default)

No primary key is needed to map columns to the Redis key; the mapping is determined by the order of the columns in the DDL:

-- username is the key, passport is the value
create table sink_redis(username VARCHAR, passport VARCHAR) with ('command'='set')

-- name is the key of the hash, subject is the field, score is the value
create table sink_redis(name VARCHAR, subject VARCHAR, score VARCHAR) with ('command'='hset')

value.data.structure = row

The whole row is stored in the value, with columns separated by '\01':

-- key: username, value: username\01passport
create table sink_redis(username VARCHAR, passport VARCHAR) with ('command'='set', 'value.data.structure'='row')

-- key: name, field: subject, value: name\01subject\01score
create table sink_redis(name VARCHAR, subject VARCHAR, score VARCHAR) with ('command'='hset', 'value.data.structure'='row')
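
To make the row layout concrete, the sketch below joins and splits column values on the '\01' separator (the control character U+0001). It is an illustration only: `RowValueCodec` is a hypothetical name, and how the connector itself represents null columns is not covered here.

```java
import java.util.Arrays;
import java.util.List;

public class RowValueCodec {
    // The separator described above: the control character with code point 1.
    static final String DELIMITER = "\u0001";

    /** Joins all column values of a row into a single Redis value. */
    public static String encode(List<String> columns) {
        return String.join(DELIMITER, columns);
    }

    /** Splits a stored value back into columns; limit -1 keeps trailing empty columns. */
    public static List<String> decode(String value) {
        return Arrays.asList(value.split(DELIMITER, -1));
    }

    public static void main(String[] args) {
        String stored = encode(Arrays.asList("tom", "math", "95"));
        // Prints the three columns recovered from the stored value.
        System.out.println(decode(stored));
    }
}
```

Because every column travels in the value, a lookup join on the key can return several fields from a single read.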

Data type conversion

Flink type	Redis conversion
CHAR/VARCHAR/String	String
BOOLEAN	String.valueOf(boolean)/Boolean.valueOf(String)
BINARY/VARBINARY	Base64 encoding/decoding
DECIMAL	BigDecimal.toString/DecimalData.fromBigDecimal
TINYINT/SMALLINT/INTEGER	valueOf method of the corresponding wrapper class
DATE	Days since the epoch (int)
TIME	Milliseconds since midnight (int)
BIGINT	Long.valueOf/String.valueOf
FLOAT/DOUBLE	Float.valueOf/Double.valueOf
TIMESTAMP	Milliseconds since the epoch (long)
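
A few of the less obvious rows in the table can be illustrated with plain JDK calls. This is a sketch of the conversions as described, not the connector's code: `RedisValueConversions` is a hypothetical name, and interpreting TIMESTAMP values in UTC is an assumption made for the example.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneOffset;
import java.util.Base64;

public class RedisValueConversions {
    /** BINARY/VARBINARY: bytes are stored as Base64 text. */
    public static String binaryToString(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static byte[] stringToBinary(String text) {
        return Base64.getDecoder().decode(text);
    }

    /** DATE: number of days since 1970-01-01. */
    public static int dateToInt(LocalDate date) {
        return Math.toIntExact(date.toEpochDay());
    }

    public static LocalDate intToDate(int days) {
        return LocalDate.ofEpochDay(days);
    }

    /** TIME: milliseconds since midnight. */
    public static int timeToInt(LocalTime time) {
        return (int) (time.toNanoOfDay() / 1_000_000L);
    }

    /** TIMESTAMP: milliseconds since the epoch (UTC is assumed here). */
    public static long timestampToLong(LocalDateTime timestamp) {
        return timestamp.toInstant(ZoneOffset.UTC).toEpochMilli();
    }
}
```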

Usage examples

Dimension-table lookup example

-- Create the Redis table
create table sink_redis(name varchar, level varchar, age varchar) with (
    'connector'='redis', 
    'host'='10.11.80.147',
    'port'='7001', 
    'redis-mode'='single',
    'password'='******',
    'command'='hset'
);

-- Insert data into Redis
insert into sink_redis select * from (values ('3', '3', '100'));

-- Create the dimension table
create table dim_table (name varchar, level varchar, age varchar) with (
    'connector'='redis',
    'host'='10.11.80.147',
    'port'='7001', 
    'redis-mode'='single', 
    'password'='*****',
    'command'='hget', 
    'lookup.cache.max-rows'='10', 
    'lookup.cache.ttl'='10', 
    'max.retries'='3'
);

-- Create the source table
create table source_table (username varchar, level varchar, proctime as procTime()) with (
    'connector'='datagen',  
    'rows-per-second'='1',  
    'fields.username.kind'='sequence',  
    'fields.username.start'='1',  
    'fields.username.end'='10', 
    'fields.level.kind'='sequence',  
    'fields.level.start'='1',  
    'fields.level.end'='10'
);

-- Create the result table
create table sink_table(username varchar, level varchar, age varchar) with ('connector'='print');

-- Run the lookup join
insert into sink_table
select
    s.username,
    s.level,
    d.age
from
    source_table s
left join dim_table for system_time as of s.proctime as d on
    d.name = s.username
    and d.level = s.level;
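
The dimension table in this example sets 'lookup.cache.max-rows' and 'lookup.cache.ttl', which bound the lookup cache by size and by age. The sketch below shows one way such a cache can behave. It is not the connector's implementation: `LookupCacheSketch` is a hypothetical class, least-recently-used eviction is an assumption, and the clock is passed in explicitly so the behavior is easy to follow.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LookupCacheSketch<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final long ttlMillis;
    private final LinkedHashMap<K, Entry<V>> map;

    public LookupCacheSketch(final int maxRows, long ttlSeconds) {
        this.ttlMillis = ttlSeconds * 1000L;
        // accessOrder=true orders entries by last access, so the eldest entry is the
        // least recently used one; it is dropped once the map exceeds maxRows.
        this.map = new LinkedHashMap<K, Entry<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Entry<V>> eldest) {
                return size() > maxRows;
            }
        };
    }

    public void put(K key, V value, long nowMillis) {
        map.put(key, new Entry<V>(value, nowMillis + ttlMillis));
    }

    /** Returns the cached value, or null on a miss or once the entry has expired. */
    public V get(K key, long nowMillis) {
        Entry<V> entry = map.get(key);
        if (entry == null) {
            return null;
        }
        if (nowMillis >= entry.expiresAtMillis) {
            map.remove(key);
            return null;
        }
        return entry.value;
    }

    public int size() {
        return map.size();
    }
}
```

A null result stands for a cache miss, in which case the lookup would go to Redis and the fetched row would be stored with `put`.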

Multi-field dimension-table join

-- Create the table with whole-row storage enabled
create table sink_redis(uid VARCHAR, score double, score2 double) with (
    'connector' = 'redis',
    'host' = '10.11.69.176',
    'port' = '6379',
    'redis-mode' = 'single',
    'password' = '****',
    'command' = 'SET',
    'value.data.structure' = 'row'
);

-- Write test data
insert into sink_redis select * from (values ('1', 10.3, 10.1));

-- Create the lookup table used in the join
create table join_table with ('command'='get', 'value.data.structure'='row') like sink_redis;

-- Create the result table
create table result_table(uid VARCHAR, username VARCHAR, score double, score2 double) with ('connector'='print');

-- Create the source table
create table source_table(uid VARCHAR, username VARCHAR, proc_time as procTime()) with (
    'connector'='datagen', 
    'fields.uid.kind'='sequence', 
    'fields.uid.start'='1', 
    'fields.uid.end'='2'
);

-- Run the lookup join
insert into result_table
select
    s.uid,
    s.username,
    j.score, -- from the dimension table
    j.score2 -- from the dimension table
from
    source_table as s
join join_table for system_time as of s.proc_time as j on
    j.uid = s.uid;

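Redis Cluster write example

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableResult;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

// The statements below belong inside a method that declares `throws Exception`.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// useBlinkPlanner() is deprecated since Flink 1.14, where Blink is the only planner.
EnvironmentSettings environmentSettings = EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build();
StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, environmentSettings);

// In cluster mode, 'cluster-nodes' lists the nodes instead of 'host' and 'port'.
String ddl = "create table sink_redis(username VARCHAR, passport VARCHAR) with ( 'connector'='redis', " +
              "'cluster-nodes'='10.11.80.147:7000,10.11.80.147:7001','redis-mode'='cluster','password'='******','command'='set')";

tEnv.executeSql(ddl);
String sql = " insert into sink_redis select * from (values ('test', 'test11'))";
TableResult tableResult = tEnv.executeSql(sql);
// Block until the insert job has finished.
tableResult.getJobClient().get().getJobExecutionResult().get();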
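
The 'cluster-nodes' value follows the ip:port,ip:port format from the parameter table. As a small sketch of validating such a string before it goes into a DDL, the helper below splits it into host and port pairs. `ClusterNodesParser` is a hypothetical name and is not part of the connector.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

public class ClusterNodesParser {
    /** Parses "ip:port,ip:port" into unresolved socket addresses. */
    public static List<InetSocketAddress> parse(String clusterNodes) {
        List<InetSocketAddress> nodes = new ArrayList<>();
        for (String part : clusterNodes.split(",")) {
            String node = part.trim();
            int colon = node.lastIndexOf(':');
            if (colon <= 0 || colon == node.length() - 1) {
                throw new IllegalArgumentException("Expected ip:port but got: " + node);
            }
            String host = node.substring(0, colon);
            // parseInt rejects a non-numeric port with a NumberFormatException.
            int port = Integer.parseInt(node.substring(colon + 1));
            // createUnresolved avoids a DNS lookup; only the format is checked here.
            nodes.add(InetSocketAddress.createUnresolved(host, port));
        }
        return nodes;
    }
}
```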

Development environment

IDE: IntelliJ IDEA
Code format: google-java-format + Save Actions
Flink version: 1.12/1.13/1.14+
JDK: 1.8
Lettuce: 6.2.1

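Notes

Redis does not support two-phase commit, so exactly-once semantics cannot be achieved
Make sure the Redis service is running before use
In production, configure suitable connection pool parameters and a retry policy
For dimension-table lookups, choose a cache strategy that fits the business requirements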
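
Without exactly-once guarantees, records written after the last checkpoint can be written again when a job recovers from a failure. The sketch below uses plain in-memory values as a stand-in for Redis to show why this is harmless for overwriting commands such as set but not for incremental ones such as incrBy. `ReplaySketch` and its methods are hypothetical names used only for this illustration.

```java
public class ReplaySketch {
    /** SET-style writes: each write overwrites the value, so the last one wins. */
    public static long applySets(long initial, long[] writes) {
        long current = initial;
        for (long w : writes) {
            current = w;
        }
        return current;
    }

    /** INCRBY-style writes: each write adds to the value, so duplicates accumulate. */
    public static long applyIncrBys(long initial, long[] deltas) {
        long current = initial;
        for (long d : deltas) {
            current += d;
        }
        return current;
    }

    public static void main(String[] args) {
        long[] once = {5};
        long[] replayed = {5, 5}; // the same record delivered twice after a restart
        System.out.println(applySets(0, once) == applySets(0, replayed));       // same result
        System.out.println(applyIncrBys(0, once) == applyIncrBys(0, replayed)); // results differ
    }
}
```

When duplicates matter, overwriting commands keyed by a stable identifier are the safer choice.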

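Troubleshooting

If you run into connection problems, check that:

The Redis service is running
The network connection is working
The credentials are correct
The firewall is configured correctly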
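
The first, second, and fourth checks can be approximated from the machine that runs the Flink job with a plain TCP connection attempt. The sketch below uses only the JDK; `RedisReachability` is a hypothetical helper, and it verifies only that the port accepts connections, not that the password is correct.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class RedisReachability {
    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder address; substitute the 'host' and 'port' values from the DDL.
        System.out.println(canConnect("127.0.0.1", 6379, 2000));
    }
}
```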

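For performance problems, consider tuning:

The I/O thread pool size
The event thread pool size
The lookup cache configuration
The retry count and timeout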
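
To reason about how the retry count affects latency, it helps to see what a bounded retry loop does. The sketch below is an illustration, not the connector's code: `RetrySketch` is a hypothetical name, and it assumes that a setting like max.retries counts additional attempts after the first one.

```java
import java.util.concurrent.Callable;

public class RetrySketch {
    /** Runs the action once, then up to maxRetries more times if it keeps failing. */
    public static <T> T callWithRetries(Callable<T> action, int maxRetries) throws Exception {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must not be negative");
        }
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                // Remember the failure and try again while attempts remain.
                last = e;
            }
        }
        throw last;
    }
}
```

Under this assumption the worst-case wait for one operation is roughly the timeout multiplied by the number of attempts, so a high retry count combined with a long timeout can stall a slow lookup for a long time.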

Disclosure: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.