1. Overview
A Kafka client sends messages through a KafkaProducer object: it specifies the target topic and the message, and the producer delivers the message to one of the topic's partitions. To control the routing rule that picks the partition, specify a Partitioner implementation and a partition key.
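To make this concrete, here is a minimal sketch of sending a keyed record (the broker address, topic name, and serializer choices are illustrative assumptions, not taken from the source below):

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    KafkaProducer<String, String> producer = new KafkaProducer<>(props);
    // with a non-null key, the default partitioner hashes the key, so records
    // sharing a key always land in the same partition
    producer.send(new ProducerRecord<>("demo-topic", "user-42", "hello"), (metadata, exception) -> {
        if (exception != null)
            exception.printStackTrace();
        else
            System.out.printf("sent to partition %d at offset %d%n", metadata.partition(), metadata.offset());
    });
    producer.close();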
2. Key Classes
Cluster: represents a subset of the nodes, topics, and partitions in the current Kafka cluster.
NetworkClient: a network client for asynchronous request/response network I/O. It is an internal class used to implement the user-facing producer and consumer clients, and it is not thread-safe.
Sender: a background thread that sends produce requests to the Kafka cluster. It also refreshes the cluster metadata so that each ProduceRequest is sent to the correct node.
RecordAccumulator: internally maintains a ConcurrentMap<TopicPartition, Deque<ProducerBatch>> batches; records are appended into MemoryRecords instances and accumulated there until they are sent to the server.
ProducerBatch: a batch of records waiting to be sent. It wraps a MemoryRecordsBuilder, which in turn builds a MemoryRecords.
MemoryRecords: a Records implementation backed by a ByteBuffer. Both the data the producer transmits and the data the broker writes to its log files use this representation.
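Their containment relationship, as a simplified skeleton (field names batches and recordsBuilder follow the descriptions above; builtRecords is an illustrative name, and all bodies are elided, so this is a sketch rather than the actual source):

    public final class RecordAccumulator {
        // one deque of batches per topic-partition
        private final ConcurrentMap<TopicPartition, Deque<ProducerBatch>> batches;
    }
    public final class ProducerBatch {
        // builds this batch's records in memory
        private final MemoryRecordsBuilder recordsBuilder;
    }
    public class MemoryRecordsBuilder {
        // the finished batch, backed by a ByteBuffer
        private MemoryRecords builtRecords;
    }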
3. Source Code Walkthrough
3.1 Producer Instantiation
producer = new KafkaProducer<>(props);
1) new Selector, which is passed into the NetworkClient
2) new NetworkClient
3) new Sender
4) new KafkaThread, with the Sender object as the thread's Runnable; the thread is then started
5) new ProducerMetadata
6) new RecordAccumulator. Two configs matter here: batch.size, the size threshold that triggers sending a batch, and linger.ms, the maximum time a batch waits before being sent even if it is not full. A sketch of tuning them follows this list.
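Extending the Properties sketch from section 1, both configs can be set when constructing the producer (the values here are illustrative assumptions, not recommendations):

    props.put(ProducerConfig.BATCH_SIZE_CONFIG, 32 * 1024); // batch.size: send a batch once it reaches 32 KB
    props.put(ProducerConfig.LINGER_MS_CONFIG, 10);         // linger.ms: or once it has waited 10 ms, even if not full
    producer = new KafkaProducer<>(props);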
3.2 Appending a Message to the Send Queue
1) KafkaProducer.send is called to send the message; internally it delegates to doSend:
private Future<RecordMetadata> doSend(ProducerRecord<K, V> record, Callback callback) {
2) The record is assigned a partition according to the partitioning strategy; a custom Partitioner (see the sketch below) can override the default routing.
    // compute the partition number
    int partition = partition(record, serializedKey, serializedValue, cluster);
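If the default key-hash routing is not enough, a custom Partitioner can be registered via partitioner.class. A hedged sketch, with the class name and routing rule invented for illustration:

    import java.util.Map;
    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;

    public class KeyModPartitioner implements Partitioner {
        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            int numPartitions = cluster.partitionsForTopic(topic).size();
            // route by key hash; send keyless records to partition 0
            return keyBytes == null ? 0 : (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }

        @Override
        public void close() {}

        @Override
        public void configure(Map<String, ?> configs) {}
    }

    // registered via: props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, KeyModPartitioner.class.getName());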
3) The record is appended to the RecordAccumulator.
    // append the record to the RecordAccumulator
    RecordAccumulator.RecordAppendResult result = accumulator.append(tp, timestamp, serializedKey,
            serializedValue, headers, interceptCallback, remainingWaitMs, true, nowMs);
4) Inside append, a ProducerBatch is obtained based on the topic and partition, and the record is then appended to its MemoryRecords.
    // first look up (or create) the deque of ProducerBatches for this topic-partition
    Deque<ProducerBatch> dq = getOrCreateDeque(tp);
    // ProducerBatch is not thread-safe, so external synchronization is required
    synchronized (dq) {
        if (closed)
            throw new KafkaException("Producer closed while send in progress");
        // take the last ProducerBatch in the deque and try to append the record to it;
        // a null result means a new batch must be allocated
        RecordAppendResult appendResult = tryAppend(timestamp, key, value, headers, callback, dq, nowMs);
        if (appendResult != null)
            return appendResult;
    }
5) RecordAccumulator.tryAppend
private RecordAppendResult tryAppend(long timestamp, byte[] key, byte[] value, Header[] headers,
                                     Callback callback, Deque<ProducerBatch> deque, long nowMs) {
    // take the last ProducerBatch in the deque
    ProducerBatch last = deque.peekLast();
    if (last != null) {
        // try to append the record to that batch
        FutureRecordMetadata future = last.tryAppend(timestamp, key, value, headers, callback, nowMs);
        if (future == null)
            // the batch has no room left; close it for further appends
            last.closeForRecordAppends();
        else
            return new RecordAppendResult(future, deque.size() > 1 || last.isFull(), false, false);
    }
    // null tells the caller to allocate a new batch
    return null;
}
6) ProducerBatch.tryAppend
public FutureRecordMetadata tryAppend(long timestamp, byte[] key, byte[] value, Header[] headers, Callback callback, long now) {
    // first check whether there is enough room for the record
    if (!recordsBuilder.hasRoomFor(timestamp, key, value, headers)) {
        return null;
    } else {
        // append the record to the underlying MemoryRecords
        Long checksum = this.recordsBuilder.append(timestamp, key, value, headers);
        this.maxRecordSize = Math.max(this.maxRecordSize, AbstractRecords.estimateSizeInBytesUpperBound(magic(),
                recordsBuilder.compressionType(), key, value, headers));
        this.lastAppendTime = now;
        // note: a new future is created here; it completes only when the broker's response arrives
        FutureRecordMetadata future = new FutureRecordMetadata(this.produceFuture, this.recordCount,
                                                               timestamp, checksum,
                                                               key == null ? -1 : key.length,
                                                               value == null ? -1 : value.length,
                                                               Time.SYSTEM);
        // we have to keep every future returned to the users in case the batch needs to be
        // split to several new batches and resent.
        thunks.add(new Thunk(callback, future));
        this.recordCount++;
        return future;
    }
}
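The FutureRecordMetadata created here is what ultimately backs the Future<RecordMetadata> that KafkaProducer.send returns, so a caller can block until the broker's response completes the batch. A minimal usage sketch (topic and payload are assumptions):

    Future<RecordMetadata> future = producer.send(new ProducerRecord<>("demo-topic", "key", "value"));
    RecordMetadata metadata = future.get(); // completes when the produce response arrives; throws if the batch failed
    System.out.printf("partition=%d offset=%d%n", metadata.partition(), metadata.offset());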
3.3 Sending the Messages
1) The Sender thread loops, calling runOnce on every iteration:
while (running) {
    try {
        runOnce();
    } catch (Exception e) {
        log.error("Uncaught error in kafka producer I/O thread: ", e);
    }
}
2) runOnce computes the current time, sends any ready producer data, then drives the network with client.poll:
long currentTimeMs = time.milliseconds();
long pollTimeout = sendProducerData(currentTimeMs);
client.poll(pollTimeout, currentTimeMs);
3) sendProducerData in full:
private long sendProducerData(long now) {
    // fetch the current cluster metadata
    Cluster cluster = metadata.fetch();
    // get the list of partitions with data ready to send
    RecordAccumulator.ReadyCheckResult result = this.accumulator.ready(cluster, now);

    // if there are any partitions whose leaders are not known yet, force metadata update
    if (!result.unknownLeaderTopics.isEmpty()) {
        // The set of topics with unknown leader contains topics with leader election pending as well as
        // topics which may have expired. Add the topic again to metadata to ensure it is included
        // and request metadata update, since there are messages to send to the topic.
        for (String topic : result.unknownLeaderTopics)
            this.metadata.add(topic, now);

        log.debug("Requesting metadata update due to unknown leader topics from the batched records: {}",
            result.unknownLeaderTopics);
        this.metadata.requestUpdate();
    }

    // remove any nodes we aren't ready to send to
    Iterator<Node> iter = result.readyNodes.iterator();
    long notReadyTimeout = Long.MAX_VALUE;
    while (iter.hasNext()) {
        Node node = iter.next();
        if (!this.client.ready(node, now)) {
            iter.remove();
            notReadyTimeout = Math.min(notReadyTimeout, this.client.pollDelayMs(node, now));
        }
    }

    // create produce requests: drain ProducerBatches from the accumulator, grouped by node
    Map<Integer, List<ProducerBatch>> batches = this.accumulator.drain(cluster, result.readyNodes, this.maxRequestSize, now);
    addToInflightBatches(batches);
    if (guaranteeMessageOrder) {
        // Mute all the partitions drained
        for (List<ProducerBatch> batchList : batches.values()) {
            for (ProducerBatch batch : batchList)
                this.accumulator.mutePartition(batch.topicPartition);
        }
    }

    accumulator.resetNextBatchExpiryTime();
    List<ProducerBatch> expiredInflightBatches = getExpiredInflightBatches(now);
    List<ProducerBatch> expiredBatches = this.accumulator.expiredBatches(now);
    expiredBatches.addAll(expiredInflightBatches);

    if (!expiredBatches.isEmpty())
        log.trace("Expired {} batches in accumulator", expiredBatches.size());
    for (ProducerBatch expiredBatch : expiredBatches) {
        String errorMessage = "Expiring " + expiredBatch.recordCount + " record(s) for " + expiredBatch.topicPartition
            + ":" + (now - expiredBatch.createdMs) + " ms has passed since batch creation";
        failBatch(expiredBatch, -1, NO_TIMESTAMP, new TimeoutException(errorMessage), false);
        if (transactionManager != null && expiredBatch.inRetry()) {
            // This ensures that no new batches are drained until the current in flight batches are fully resolved.
            transactionManager.markSequenceUnresolved(expiredBatch);
        }
    }

    sensors.updateProduceRequestMetrics(batches);
    long pollTimeout = Math.min(result.nextReadyCheckDelayMs, notReadyTimeout);
    pollTimeout = Math.min(pollTimeout, this.accumulator.nextExpiryTimeMs() - now);
    pollTimeout = Math.max(pollTimeout, 0);
    if (!result.readyNodes.isEmpty()) {
        log.trace("Nodes with data ready to send: {}", result.readyNodes);
        // if some nodes have data ready, poll with zero timeout so the send happens immediately
        pollTimeout = 0;
    }
    // send the produce requests
    sendProduceRequests(batches, now);
    return pollTimeout;
}
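A note on the guaranteeMessageOrder branch above: to my understanding it is enabled by running the producer with max.in.flight.requests.per.connection set to 1 (an assumption worth verifying against your Kafka version). Muting a drained partition keeps a second batch for that partition from being drained until the first completes, preserving per-partition ordering:

    // assumption: a single in-flight request per connection is what turns on
    // guaranteeMessageOrder in Sender, trading throughput for strict ordering
    props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);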
4) The actual send logic lives in NetworkClient.poll, which ultimately drives org.apache.kafka.common.network.Selector#poll, the same selector logic the broker uses:
public List<ClientResponse> poll(long timeout, long now) {
    ensureActive();

    if (!abortedSends.isEmpty()) {
        // If there are aborted sends because of unsupported version exceptions or disconnects,
        // handle them immediately without waiting for Selector#poll.
        List<ClientResponse> responses = new ArrayList<>();
        handleAbortedSends(responses);
        completeResponses(responses);
        return responses;
    }

    long metadataTimeout = metadataUpdater.maybeUpdate(now);
    try {
        this.selector.poll(Utils.min(timeout, metadataTimeout, defaultRequestTimeoutMs));
    } catch (IOException e) {
        log.error("Unexpected error during I/O", e);
    }

    // process completed actions
    long updatedNow = this.time.milliseconds();
    List<ClientResponse> responses = new ArrayList<>();
    handleCompletedSends(responses, updatedNow);
    handleCompletedReceives(responses, updatedNow);
    handleDisconnections(responses, updatedNow);
    handleConnections();
    handleInitiateApiVersionRequests(updatedNow);
    handleTimedOutRequests(responses, updatedNow);
    completeResponses(responses);

    return responses;
}