Flink Kafka Connector Analysis
1 FlinkKafkaConsumer
The Flink Kafka consumer connector reads data from Kafka into a Flink job, acting as a source in the Flink system. Flink currently supports Kafka versions 0.8, 0.9, 0.10, 0.11, and 2.0+. Since the Kafka version we run is 0.10.0.1, the analysis below is based primarily on the 0.10 connector.
The figure above shows the inheritance hierarchy of FlinkKafkaConsumer. The class we care about, FlinkKafkaConsumer010, extends FlinkKafkaConsumer09 and is a RichFunction. Its base class is FlinkKafkaConsumerBase, which is where we begin the analysis.
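Before digging into the internals, a minimal usage sketch may help (the topic name, bootstrap servers, and group id below are placeholders):
import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

public class KafkaSourceExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("group.id", "my-group");                // placeholder

        // the consumer is added to the job like any other source
        FlinkKafkaConsumer010<String> consumer =
                new FlinkKafkaConsumer010<>("my-topic", new SimpleStringSchema(), props);
        DataStream<String> stream = env.addSource(consumer);

        stream.print();
        env.execute("kafka-source-example");
    }
}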
1.1 FlinkKafkaConsumerBase
FlinkKafkaConsumerBase is the base class of FlinkKafkaConsumer and the core of the whole connector. Besides extending RichParallelSourceFunction, it implements the CheckpointedFunction and CheckpointListener interfaces, which are used to snapshot and restore checkpoint state and to run a callback once a checkpoint completes.
public abstract class FlinkKafkaConsumerBase<T> extends RichParallelSourceFunction<T> implements
        CheckpointListener,
        ResultTypeQueryable<T>,
        CheckpointedFunction {
Let's look at the fields defined in FlinkKafkaConsumerBase:
//FlinkKafkaConsumerBase
//pendingOffsetsToCommit holds at most 100 pending checkpoints; beyond that, the oldest is dropped
public static final int MAX_NUM_PENDING_CHECKPOINTS = 100;
//sentinel value indicating partition discovery is disabled (the default)
public static final long PARTITION_DISCOVERY_DISABLED = Long.MIN_VALUE;
//metrics switch
public static final String KEY_DISABLE_METRICS = "flink.disable-metrics";
//config key for the partition discovery interval
public static final String KEY_PARTITION_DISCOVERY_INTERVAL_MILLIS = "flink.partition-discovery.interval-millis";
//name of the partition-offset state
private static final String OFFSETS_STATE_NAME = "topic-partition-offset-states";
//descriptor of the subscribed topics, i.e. the topic names
private final KafkaTopicsDescriptor topicsDescriptor;
//schema used to deserialize Kafka messages
protected final KafkaDeserializationSchema<T> deserializer;
//the topic partitions to subscribe to, with the offset each one starts from
private Map<KafkaTopicPartition, Long> subscribedPartitionsToStartOffsets;
//periodic watermark assigner
private SerializedValue<AssignerWithPeriodicWatermarks<T>> periodicWatermarkAssigner;
//punctuated watermark assigner
private SerializedValue<AssignerWithPunctuatedWatermarks<T>> punctuatedWatermarkAssigner;
//whether offsets are committed on checkpoints; enabled by default
private boolean enableCommitOnCheckpoints = true;
//whether, when restoring from a snapshot, partitions that do not match the current topics descriptor are filtered out
private boolean filterRestoredPartitionsWithCurrentTopicsDescriptor = true;
//offset commit mode: disabled / commit on checkpoints / Kafka periodic auto-commit
private OffsetCommitMode offsetCommitMode;
//partition discovery interval
private final long discoveryIntervalMillis;
//startup mode that determines the starting offsets: EARLIEST/LATEST/GROUP_OFFSETS/SPECIFIC_OFFSETS/TIMESTAMP; TIMESTAMP is not supported with Kafka 0.10.0.1
private StartupMode startupMode = StartupMode.GROUP_OFFSETS;
//specific offsets to start consuming from, per partition
private Map<KafkaTopicPartition, Long> specificStartupOffsets;
//timestamp whose corresponding offsets are used as the starting positions
private Long startupOffsetsTimestamp;
//tracks in-flight snapshots (i.e. partition-offset state awaiting commit)
private final LinkedMap pendingOffsetsToCommit = new LinkedMap();
//fetches data from Kafka
private transient volatile AbstractFetcher<T, ?> kafkaFetcher;
//discovers new partitions at runtime
private transient volatile AbstractPartitionDiscoverer partitionDiscoverer;
//the state to restore from
private transient volatile TreeMap<KafkaTopicPartition, Long> restoredState;
//the persisted partition-offset state, stored as a union state
private transient ListState<Tuple2<KafkaTopicPartition, Long>> unionOffsetStates;
//whether we restored from legacy state, i.e. state written by Flink 1.1 or 1.2
private boolean restoredFromOldState;
//thread running partition discovery
private transient volatile Thread discoveryLoopThread;
//running flag
private volatile boolean running = true;
//metrics
private final boolean useMetrics;
private transient Counter successfulCommits;
private transient Counter failedCommits;
private transient KafkaCommitCallback offsetCommitCallback;
The API that FlinkKafkaConsumerBase exposes to users includes assignTimestampsAndWatermarks, setStartFromEarliest, setStartFromLatest, disableFilterRestoredPartitionsWithSubscribedTopics, and so on, mainly for configuring watermarks and choosing where to start consuming; a configuration sketch follows.
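Continuing from the snippet above, a hedged sketch of these knobs (the watermark assigner body is a placeholder, not production logic):
//configure the consumer before adding it as a source
consumer.setStartFromEarliest();  // or setStartFromLatest(); GROUP_OFFSETS is the default

//commit offsets back to Kafka when a checkpoint completes (the enableCommitOnCheckpoints flag)
consumer.setCommitOffsetsOnCheckpoints(true);

//keep restored partitions even if they no longer match the subscribed topics
consumer.disableFilterRestoredPartitionsWithSubscribedTopics();

//periodic watermarks assigned inside the source, tracked per Kafka partition
consumer.assignTimestampsAndWatermarks(new AssignerWithPeriodicWatermarks<String>() {
    private long maxTs = Long.MIN_VALUE;

    @Override
    public long extractTimestamp(String element, long previousElementTimestamp) {
        long ts = System.currentTimeMillis(); // placeholder: derive the timestamp from the record in practice
        maxTs = Math.max(maxTs, ts);
        return ts;
    }

    @Override
    public Watermark getCurrentWatermark() {
        return new Watermark(maxTs - 1000); // 1s of bounded lateness, an arbitrary choice
    }
});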
The first method of FlinkKafkaConsumerBase to execute is initializeState, which initializes the operator state:
(1) Obtain the stateStore from the context.
(2) Fetch the state from the stateStore, both the legacy state (presumably kept for compatibility with Flink 1.2) and the new state; if legacy state exists, migrate it into unionOffsetStates.
(3) Copy the entries of unionOffsetStates into restoredState, which is used to restore the subscription.
//FlinkKafkaConsumerBase
public final void initializeState(FunctionInitializationContext context) throws Exception {
    OperatorStateStore stateStore = context.getOperatorStateStore();
    //fetch the legacy state
    ListState<Tuple2<KafkaTopicPartition, Long>> oldRoundRobinListState =
        stateStore.getSerializableListState(DefaultOperatorStateBackend.DEFAULT_OPERATOR_STATE_NAME);
    //fetch the state from the stateStore; it is created if absent
    this.unionOffsetStates = stateStore.getUnionListState(new ListStateDescriptor<>(
            OFFSETS_STATE_NAME,
            TypeInformation.of(new TypeHint<Tuple2<KafkaTopicPartition, Long>>() {
            })));
    if (context.isRestored() && !restoredFromOldState) {
        restoredState = new TreeMap<>(new KafkaTopicPartition.Comparator());
        // migrate the legacy state into unionOffsetStates
        for (Tuple2<KafkaTopicPartition, Long> kafkaOffset : oldRoundRobinListState.get()) {
            restoredFromOldState = true;
            unionOffsetStates.add(kafkaOffset);
        }
        oldRoundRobinListState.clear();
        //legacy state exists and partition discovery is not disabled
        if (restoredFromOldState && discoveryIntervalMillis != PARTITION_DISCOVERY_DISABLED) {
            //throw exception
        }
        // copy the entries of unionOffsetStates into restoredState
        for (Tuple2<KafkaTopicPartition, Long> kafkaOffset : unionOffsetStates.get()) {
            restoredState.put(kafkaOffset.f0, kafkaOffset.f1);
        }
        //log
    } else {
        //log
    }
}
The open method of FlinkKafkaConsumerBase handles initialization and runs after initializeState. Let's walk through it with the code:
(1) Initialize offsetCommitMode. With checkpointing enabled, the mode is ON_CHECKPOINTS if enableCommitOnCheckpoints is set, otherwise DISABLED. Without checkpointing, the mode is KAFKA_PERIODIC if Kafka's auto-commit is enabled, otherwise DISABLED. Finally, if offsetCommitMode ends up as ON_CHECKPOINTS or DISABLED, enable.auto.commit is forced to false; this happens in adjustAutoCommitConfig and is straightforward. Once offsetCommitMode is determined, partitionDiscoverer is initialized (the 0.10 connector creates a Kafka010PartitionDiscoverer) and AbstractPartitionDiscoverer's open method is called, which mainly initializes the Kafka consumer. The commit modes are summarized in the table below, followed by a sketch of the decision logic.
offsetCommitMode | Description |
---|---|
DISABLED | Offset committing is disabled |
ON_CHECKPOINTS | Offsets are committed to Kafka only after a checkpoint completes |
KAFKA_PERIODIC | Offsets are periodically auto-committed to Kafka |
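For reference, the decision in OffsetCommitModes.fromConfiguration boils down to roughly the following (a sketch, not the verbatim source; note checkpointing takes precedence over Kafka's auto-commit):
//OffsetCommitModes (sketch)
public static OffsetCommitMode fromConfiguration(
        boolean enableAutoCommit,
        boolean enableCommitOnCheckpoint,
        boolean enableCheckpointing) {
    if (enableCheckpointing) {
        // with checkpointing on, only the commit-on-checkpoint flag matters
        return enableCommitOnCheckpoint ? OffsetCommitMode.ON_CHECKPOINTS : OffsetCommitMode.DISABLED;
    } else {
        // otherwise fall back to Kafka's periodic auto-commit, if enabled
        return enableAutoCommit ? OffsetCommitMode.KAFKA_PERIODIC : OffsetCommitMode.DISABLED;
    }
}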
(2) Discover all partitions of the subscribed topics and initialize each partition's starting offset, saving them into subscribedPartitionsToStartOffsets. The startup modes are listed below, with a usage example after the table.
startupMode | Description |
---|---|
GROUP_OFFSETS | Start from the offsets last committed by the consumer group |
EARLIEST | Start from the earliest offset of each partition |
LATEST | Start from the latest offset of each partition |
TIMESTAMP | Start from the offsets corresponding to the given timestamp |
SPECIFIC_OFFSETS | Start from user-specified offsets |
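For example, SPECIFIC_OFFSETS is configured via setStartFromSpecificOffsets with a map keyed by KafkaTopicPartition (the partition numbers and offsets here are arbitrary):
//continuing from the consumer built earlier
Map<KafkaTopicPartition, Long> specificStartupOffsets = new HashMap<>();
specificStartupOffsets.put(new KafkaTopicPartition("my-topic", 0), 23L);
specificStartupOffsets.put(new KafkaTopicPartition("my-topic", 1), 31L);
consumer.setStartFromSpecificOffsets(specificStartupOffsets);
Partitions missing from the map fall back to GROUP_OFFSETS, which is exactly the GROUP_OFFSET sentinel branch in the code below.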
//FlinkKafkaConsumerBase
public void open(Configuration configuration) throws Exception {
    // initialize offsetCommitMode
    this.offsetCommitMode = OffsetCommitModes.fromConfiguration(
            getIsAutoCommitEnabled(),
            enableCommitOnCheckpoints,
            ((StreamingRuntimeContext) getRuntimeContext()).isCheckpointingEnabled());
    // initialize partitionDiscoverer and call its open method
    this.partitionDiscoverer = createPartitionDiscoverer(
            topicsDescriptor,
            getRuntimeContext().getIndexOfThisSubtask(),
            getRuntimeContext().getNumberOfParallelSubtasks());
    this.partitionDiscoverer.open();
    subscribedPartitionsToStartOffsets = new HashMap<>();
    // discover all partitions of the topics; covered in detail when we get to the PartitionDiscoverer
    final List<KafkaTopicPartition> allPartitions = partitionDiscoverer.discoverPartitions();
    // restoring from state
    if (restoredState != null) {
        for (KafkaTopicPartition partition : allPartitions) {
            if (!restoredState.containsKey(partition)) {
                //a partition absent from the restored state starts from EARLIEST and is added to restoredState
                restoredState.put(partition, KafkaTopicPartitionStateSentinel.EARLIEST_OFFSET);
            }
        }
        //iterate over restoredState
        for (Map.Entry<KafkaTopicPartition, Long> restoredStateEntry : restoredState.entrySet()) {
            if (!restoredFromOldState) {
                //not restoring from legacy state: assign this task its partitions and offsets and add them to subscribedPartitionsToStartOffsets
                if (KafkaTopicPartitionAssigner.assign(
                    restoredStateEntry.getKey(), getRuntimeContext().getNumberOfParallelSubtasks())
                        == getRuntimeContext().getIndexOfThisSubtask()){
                    subscribedPartitionsToStartOffsets.put(restoredStateEntry.getKey(), restoredStateEntry.getValue());
                }
            } else {
                // legacy state entries go straight into subscribedPartitionsToStartOffsets
                subscribedPartitionsToStartOffsets.put(restoredStateEntry.getKey(), restoredStateEntry.getValue());
            }
        }
        //filter out partitions that no longer match
        if (filterRestoredPartitionsWithCurrentTopicsDescriptor) {
            subscribedPartitionsToStartOffsets.entrySet().removeIf(entry -> {
                if (!topicsDescriptor.isMatchingTopic(entry.getKey().getTopic())) {
                    //log
                    return true;
                }
                return false;
            });
        }
        //log
    } else {
        // not restoring from state
        switch (startupMode) {
            //start from specific offsets
            case SPECIFIC_OFFSETS:
                if (specificStartupOffsets == null) {
                    // throw IllegalStateException
                }
                for (KafkaTopicPartition seedPartition : allPartitions) {
                    Long specificOffset = specificStartupOffsets.get(seedPartition);
                    if (specificOffset != null) {
                        //the partition has an offset in specificStartupOffsets; subscribe from that position (minus 1, since the state records the last consumed offset)
                        subscribedPartitionsToStartOffsets.put(seedPartition, specificOffset - 1);
                    } else {
                        //the partition has no offset in specificStartupOffsets; fall back to GROUP_OFFSET
                        subscribedPartitionsToStartOffsets.put(seedPartition, KafkaTopicPartitionStateSentinel.GROUP_OFFSET);
                    }
                }
                break;
            //start from the offsets for a given timestamp; requires Kafka 0.10.2+
            case TIMESTAMP:
                if (startupOffsetsTimestamp == null) {
                    //throw IllegalStateException
                }
                //look up the partition offsets for the timestamp; a partition without one falls back to LATEST_OFFSET
                for (Map.Entry<KafkaTopicPartition, Long> partitionToOffset
                        : fetchOffsetsWithTimestamp(allPartitions, startupOffsetsTimestamp).entrySet()) {
                    subscribedPartitionsToStartOffsets.put(
                            partitionToOffset.getKey(),
                            (partitionToOffset.getValue() == null)
                                    ? KafkaTopicPartitionStateSentinel.LATEST_OFFSET
                                    : partitionToOffset.getValue() - 1);
                }
                break;
            //otherwise subscribe using the mode's sentinel (e.g. GROUP_OFFSETS)
            default:
                for (KafkaTopicPartition seedPartition : allPartitions) {
                    subscribedPartitionsToStartOffsets.put(seedPartition, startupMode.getStateSentinel());
                }
        }
        //log
    }
}
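The KafkaTopicPartitionAssigner.assign call used above maps each partition deterministically to a subtask, so the same partition always lands on the same subtask index across restarts; its core logic is roughly the following sketch:
//KafkaTopicPartitionAssigner (sketch)
public static int assign(KafkaTopicPartition partition, int numParallelSubtasks) {
    // a topic-dependent start index, kept non-negative
    int startIndex = ((partition.getTopic().hashCode() * 31) & 0x7FFFFFFF) % numParallelSubtasks;
    // partitions of the same topic are then distributed round-robin across subtasks
    return (startIndex + partition.getPartition()) % numParallelSubtasks;
}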
With initialization complete, we arrive at the core run method, where the actual execution happens:
(1) Initialize the metrics and the offsetCommitCallback, and create the kafkaFetcher.
(2) If partition discovery is enabled, start the partition discovery thread alongside the fetch loop; otherwise start only the fetch loop.
//FlinkKafkaConsumerBase
public void run(SourceContext<T> sourceContext) throws Exception {
    if (subscribedPartitionsToStartOffsets == null) {
        //throw Exception
    }
    // initialize the successfulCommits and failedCommits metrics (omitted)
    //get the index of the current subtask
    final int subtaskIndex = this.getRuntimeContext().getIndexOfThisSubtask();
    //initialize offsetCommitCallback, the callback invoked when offsets are committed (omitted)