Kafka: Writing Your Own Producer, Partitioner, and Consumer

This post shows how to write a simple Kafka producer, a producer with a callback, and consumers with automatic and manual offset commits. The examples demonstrate how to route messages to a specific partition and touch on the relevant Kafka configuration parameters.


1. A Simple Producer

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.junit.Test;

public class MyProducer {

	@Test
	public void testProducer() {
		Properties props = new Properties();
		props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "centos1:9092");
		props.put(ProducerConfig.ACKS_CONFIG, "all");             // wait until all in-sync replicas acknowledge
		props.put(ProducerConfig.RETRIES_CONFIG, 0);              // do not retry failed sends
		props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);       // batch size in bytes
		props.put(ProducerConfig.LINGER_MS_CONFIG, 1);            // wait up to 1 ms to fill a batch
		props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432); // 32 MB send buffer
		props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
		props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
		// plug in the custom partitioner shown below
		props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, "org.kafka.practice.MyPartitioner");

		Producer<String, String> producer = new KafkaProducer<>(props);
		for (int i = 0; i < 100; i++) {
			producer.send(new ProducerRecord<String, String>("mytopic", Integer.toString(i), "7777-" + i));
		}
		producer.close();
	}
}

 

A simple partitioner (the MyPartitioner class referenced in the producer configuration above):

package org.kafka.practice;

import java.util.Map;

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;

public class MyPartitioner implements Partitioner {

	@Override
	public void configure(Map<String, ?> configs) {
		// no configuration needed
	}

	@Override
	public int partition(String topic, Object key, byte[] keyBytes,
			Object value, byte[] valueBytes, Cluster cluster) {
		// route every record to partition 1, regardless of key
		return 1;
	}

	@Override
	public void close() {
	}
}
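
Returning a fixed partition is only useful for demonstration. A more realistic partitioner hashes the key so that records with the same key always land in the same partition. The sketch below is illustrative: the class name HashKeyPartitioner is made up for this post, and the hash scheme is a plain Java array hash, not Kafka's built-in murmur2 logic.

package org.kafka.practice;

import java.util.Arrays;
import java.util.Map;

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;

public class HashKeyPartitioner implements Partitioner {

	@Override
	public void configure(Map<String, ?> configs) {
	}

	@Override
	public int partition(String topic, Object key, byte[] keyBytes,
			Object value, byte[] valueBytes, Cluster cluster) {
		int numPartitions = cluster.partitionCountForTopic(topic);
		if (keyBytes == null) {
			return 0; // no key: illustrative fallback, not Kafka's default round-robin behavior
		}
		// mask the sign bit so the modulo result is always non-negative
		return (Arrays.hashCode(keyBytes) & 0x7fffffff) % numPartitions;
	}

	@Override
	public void close() {
	}
}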

 

Result:

All messages are written to partition 1, which can be confirmed by inspecting the log file /tmp/kafka-logs/mytopic-1/0000000000.log.
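
As a quick sanity check, a consumer can also be assigned directly to partition 1 and should see all 100 records. This is a minimal sketch assuming the same broker; PartitionCheck and check-group are illustrative names.

import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class PartitionCheck {
	public static void main(String[] args) {
		Properties props = new Properties();
		props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "centos1:9092");
		props.put(ConsumerConfig.GROUP_ID_CONFIG, "check-group");
		props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); // read from the start
		props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
		props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");

		KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
		// assign() pins the consumer to partition 1 directly, bypassing group rebalancing
		consumer.assign(Arrays.asList(new TopicPartition("mytopic", 1)));
		ConsumerRecords<String, String> records = consumer.poll(5000);
		System.out.println("records read from partition 1: " + records.count());
		consumer.close();
	}
}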

 

 

2. A Producer with a Callback

import java.util.Properties;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.junit.Test;

public class MyProducer {
	
	@Test
	public void testProducer(){
		 Properties props = new Properties();
		 props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "centos1:9092");
		 props.put(ProducerConfig.ACKS_CONFIG, "all");
		 props.put(ProducerConfig.RETRIES_CONFIG, 0);
		 props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);
		 props.put(ProducerConfig.LINGER_MS_CONFIG, 1);
		 props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432);
		 props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
		 props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
		 props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, "org.kafka.practice.MyPartitioner");

		 Producer<String, String> producer = new KafkaProducer<>(props);
		 for(int i = 0; i < 5; i++){
			 ProducerRecord<String, String> record = new ProducerRecord<String, String>("mytopic", Integer.toString(i), "222-"+i);
		     producer.send(record, new Callback(){
				@Override
				public void onCompletion(RecordMetadata metadata, Exception exception) {
					// fires once the broker acknowledges the record; on failure,
					// exception is non-null and metadata is null
					System.out.println("received ack!!!");
				}
		     });
		     System.out.println("send message!!!");
		 }
		 producer.close();
	}
}

Output:

send message!!!
send message!!!
send message!!!
send message!!!
send message!!!
17/05/18 15:23:40 INFO producer.KafkaProducer: Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
received ack!!!
received ack!!!
received ack!!!
received ack!!!
received ack!!!
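
Note that in this run all five "send message!!!" lines appear before any ack: send() is asynchronous, and the callbacks only fire once close() flushes the buffered records and the broker responds. If each record must be acknowledged before the next is sent, send() returns a Future that can be blocked on. A minimal sketch of the blocking variant, intended to replace the send() call inside the loop (requires an import of java.util.concurrent.ExecutionException):

		try {
			RecordMetadata md = producer.send(record).get(); // Future.get() waits for the broker's response
			System.out.println("acked: partition = " + md.partition() + ", offset = " + md.offset());
		} catch (InterruptedException | ExecutionException e) {
			e.printStackTrace(); // the send failed or the wait was interrupted
		}

This trades throughput for strict per-record ordering of sends and acknowledgements.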

 

 

3. A Simple Consumer with Auto-Commit

import java.util.Arrays;
import java.util.Date;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.junit.Test;

public class MyConsumer {

	@Test
	public void testConsumer() throws Exception {
		Properties props = new Properties();
		props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "centos1:9092");
		props.put(ConsumerConfig.GROUP_ID_CONFIG, "group1");
		props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true");      // auto-commit offsets
		props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000"); // every second
		props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
		props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
		KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
		consumer.subscribe(Arrays.asList("mytopic"));
		while (true) {
			ConsumerRecords<String, String> records = consumer.poll(100);
			for (ConsumerRecord<String, String> record : records) {
				Date now = new Date();
				System.out.printf(now + " offset = %d, key = %s, value = %s%n", record.offset(), record.key(), record.value());
				Thread.sleep(3000); // slow processing, to make the commit timing observable
			}
		}
	}
}
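
One thing the loop above glosses over: it never exits, so consumer.close() is never called. A common shutdown pattern, sketched below, is to call consumer.wakeup() from another thread and catch the resulting WakeupException in the poll loop; the shutdown-hook placement is an assumption for this test setting, not part of the original example.

		// Sketch: trigger a clean exit from the poll loop when the JVM shuts down.
		Runtime.getRuntime().addShutdownHook(new Thread(() -> consumer.wakeup()));
		try {
			while (true) {
				ConsumerRecords<String, String> records = consumer.poll(100);
				for (ConsumerRecord<String, String> record : records) {
					System.out.printf("offset = %d, value = %s%n", record.offset(), record.value());
				}
			}
		} catch (org.apache.kafka.common.errors.WakeupException e) {
			// expected: wakeup() was called; fall through to close
		} finally {
			consumer.close();
		}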

 

4. A Simple Consumer with Manual Commit

	public void testConsumer2() {
	     Properties props = new Properties();
	     props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
	     props.put(ConsumerConfig.GROUP_ID_CONFIG, "group1");
	     props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // offsets are committed manually
	     props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
	     props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
	     KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
	     consumer.subscribe(Arrays.asList("mytopic"));
	     final int minBatchSize = 200;
	     List<ConsumerRecord<String, String>> buffer = new ArrayList<>();
	     while (true) {
	         ConsumerRecords<String, String> records = consumer.poll(100);
	         for (ConsumerRecord<String, String> record : records) {
	             buffer.add(record);
	         }
	         if (buffer.size() >= minBatchSize) {
	             //insertIntoDb(buffer);  // persist the batch first...
	             consumer.commitSync();   // ...then commit, so nothing is lost on a crash
	             buffer.clear();
	         }
	     }
	}
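
The no-argument commitSync() commits the latest offsets returned by the last poll() for all assigned partitions. For finer-grained control there is a commitSync(Map) overload that commits per partition; note that the committed offset must be the position of the next record to read, hence the +1. A sketch meant to replace the batching logic inside the poll loop (requires imports of java.util.Collections, OffsetAndMetadata, and TopicPartition):

	         for (TopicPartition partition : records.partitions()) {
	             List<ConsumerRecord<String, String>> partitionRecords = records.records(partition);
	             for (ConsumerRecord<String, String> record : partitionRecords) {
	                 // process(record);  -- e.g. the insertIntoDb step above
	             }
	             long lastOffset = partitionRecords.get(partitionRecords.size() - 1).offset();
	             // committed offset = position of the next record to read, hence lastOffset + 1
	             consumer.commitSync(Collections.singletonMap(partition, new OffsetAndMetadata(lastOffset + 1)));
	         }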

 

 

 

References:

Kafka configuration parameters: http://www.cnblogs.com/rilley/p/5391268.html
