Introduction to Kafka
Kafka was originally developed at LinkedIn. It is a distributed, partitioned, multi-replica, multi-producer, multi-subscriber distributed log system coordinated by ZooKeeper (it can also serve as an MQ system), commonly used for web/nginx logs, access logs, messaging services, and so on.
Its main application scenarios are log collection systems and messaging systems.
Kafka's main design goals are:
- Provide message persistence with O(1) time complexity, guaranteeing constant-time access even for terabytes of data and beyond.
- High throughput: a single machine can sustain 100K messages per second even on inexpensive commodity hardware.
- Support message partitioning across Kafka servers and distributed consumption, while guaranteeing message ordering within each partition.
- Support both offline and real-time data processing.
- Support online horizontal scaling.
Messaging systems have two main delivery models: point-to-point and publish-subscribe. Most messaging systems, Kafka among them, use the publish-subscribe model.
Message middleware also divides into push and pull models. Kafka only supports pulling messages; there is no server-side push, although push-like delivery can be emulated by polling.
- Kafka runs as a cluster on one or more servers, which can span multiple data centers.
- A Kafka cluster organizes records by topic; a topic can have multiple partitions, and a partition can have multiple replica partitions.
- Each record consists of a key, a value, and a timestamp.
Kafka has four core APIs:
- Producer API: lets an application publish a stream of records to one or more Kafka topics.
- Consumer API: lets an application subscribe to one or more topics and process the stream of records produced to them.
- Streams API: lets an application act as a stream processor, consuming an input stream from one or more topics and producing an output stream to one or more output topics, effectively transforming input streams into output streams.
- Connector API: lets you build and run reusable producers or consumers that connect Kafka topics to existing applications or data systems. For example, a connector for a relational database might capture every change to a table.
Integrating Kafka with Spring Boot
Integrating Kafka with Spring Boot is quite simple: the configuration support is mature and easy to pick up. The harder part is on the operations side, configuring the Kafka cluster itself and its networking.
1. First, add the required dependencies. This demo uses a fairly recent spring-kafka version, 3.2.1, with kafka-clients 3.6.2:
<!-- spring-kafka -->
<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka</artifactId>
    <version>3.2.1</version>
</dependency>
<!-- kafka-clients -->
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>3.6.2</version>
</dependency>
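The examples that follow assume a broker is reachable at localhost:9092. How you run that broker is not covered here; for purely local testing, one convenient option (an assumption, not part of the original setup) is the official Apache Kafka Docker image, available since Kafka 3.7, which runs a single KRaft node with its default configuration: docker run -p 9092:9092 apache/kafka:3.7.0.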
2. Add the configuration to application.yml:
server:
  port: 8088
spring:
  application:
    name: xx
  datasource:
    url: jdbc:mysql://localhost:3306/xx?useUnicode=true&characterEncoding=utf-8&useSSL=false&serverTimezone=Asia/Shanghai
    driver-class-name: com.mysql.cj.jdbc.Driver
    username: root
    password: root
    type: com.alibaba.druid.pool.DruidDataSource
    druid:
      initial-size: 5
      min-idle: 5
      max-active: 20
      max-wait: 6000
  jackson:
    date-format: yyyy-MM-dd HH:mm:ss
    time-zone: GMT+8
    default-property-inclusion: non_null
  # Kafka configuration
  kafka:
    # Broker addresses used to establish the initial connection
    bootstrap-servers: localhost:9092
    # Producer configuration
    producer:
      # Serializer classes for the producer's keys and values
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
      batch-size: 16384 # default batch size, in bytes
      buffer-memory: 33554432 # total send buffer, 32 MB by default
    # Consumer configuration
    consumer:
      # Deserializer classes for the consumer's keys and values
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      # The consumer group id
      group-id: vhr-group
      # Whether to auto-commit consumer offsets
      enable-auto-commit: false
      # Commit offsets to the broker every 1000 ms (only applies when auto-commit is enabled)
      auto-commit-interval: 1000
      auto-offset-reset: earliest # if the consumer has no committed offset, start from the earliest
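Before the tests can send anything, the topics must exist. Many brokers auto-create topics on first use, but if auto-creation is disabled, one option is to declare NewTopic beans: Spring Boot's auto-configured KafkaAdmin creates any missing ones at startup. A minimal sketch (the partition and replica counts below are assumptions, not values from the original project):

import org.apache.kafka.clients.admin.NewTopic;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class KafkaTopicConfig {

    // Created at startup (if absent) by Spring Boot's auto-configured KafkaAdmin
    @Bean
    public NewTopic vhrTopic() {
        return new NewTopic("vhr_topic_01", 3, (short) 1); // 3 partitions, replication factor 1
    }
}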
3. The Kafka configuration class
With the YAML above in place, you can in theory already inject and use KafkaTemplate<String, Object> without writing any configuration class. The class below is written mainly to make the configuration values easy to read and reuse.
import com.google.common.collect.Maps;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafka;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.*;

import java.util.Map;

@Configuration
@EnableKafka
public class KafkaConfig {

    @Value("${spring.kafka.bootstrap-servers}")
    private String bootstrapServers;

    @Value("${spring.kafka.consumer.group-id}")
    private String groupId;

    @Value("${spring.kafka.consumer.enable-auto-commit}")
    private boolean autoCommitEnabled;

    @Value("${spring.kafka.consumer.auto-commit-interval}")
    private int autoCommitInterval;

    @Value("${spring.kafka.consumer.auto-offset-reset}")
    private String autoOffsetReset;

    /**
     * Creates the producer factory used to build producer clients that send
     * messages to Kafka topics.
     *
     * @return a producer factory for producers with String keys and Object values
     */
    @Bean
    public ProducerFactory<String, Object> producerFactory() {
        Map<String, Object> config = Maps.newHashMapWithExpectedSize(16);
        config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        return new DefaultKafkaProducerFactory<>(config);
    }

    @Bean
    public KafkaTemplate<String, Object> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }

    /**
     * Creates the consumer factory used to build consumer clients that read
     * messages from Kafka topics.
     *
     * @return a consumer factory configured with the required properties
     */
    @Bean
    public ConsumerFactory<String, Object> kafkaConsumerListener() {
        Map<String, Object> config = this.consumerConfigs();
        return new DefaultKafkaConsumerFactory<>(config);
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, Object> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(kafkaConsumerListener());
        return factory;
    }

    @Bean
    public <K, V> KafkaConsumer<K, V> kafkaConsumerConfig(@Qualifier("kafkaConsumerListener") ConsumerFactory<String, Object> consumerFactory) {
        return (KafkaConsumer<K, V>) consumerFactory.createConsumer();
    }

    @Bean
    public Map<String, Object> consumerConfigs() {
        Map<String, Object> config = Maps.newHashMapWithExpectedSize(16);
        // Broker addresses used to establish the initial connection
        config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        // The consumer group id
        config.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
        config.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, autoCommitInterval);
        config.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, autoCommitEnabled);
        config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        config.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, autoOffsetReset);
        return config;
    }
}
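Since the class above registers a ConcurrentKafkaListenerContainerFactory, the idiomatic Spring way to consume is a @KafkaListener method rather than hand-rolled polling. A minimal sketch of such a listener (this class is illustrative, not part of the original project):

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class VhrTopicListener {

    // The container factory name matches the @Bean method in KafkaConfig
    @KafkaListener(topics = "vhr_topic_01", containerFactory = "kafkaListenerContainerFactory")
    public void onMessage(ConsumerRecord<String, Object> record) {
        System.out.println("Received: topic=" + record.topic()
                + ", partition=" + record.partition()
                + ", offset=" + record.offset()
                + ", key=" + record.key()
                + ", value=" + record.value());
    }
}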
Write test methods and check the results:
import com.common.plugin.kafuka.KafkaConfig;
import org.apache.commons.lang3.StringUtils;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.support.SendResult;

import java.time.Duration;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

@SpringBootTest(classes = VhrServerApplication.class)
class VhrServerApplicationTests {

    @Autowired
    private KafkaTemplate<String, Object> kafkaTemplate;

    @Autowired
    private KafkaConfig config;

    /**
     * Synchronous send
     */
    @Test
    public void testKafKaSend() {
        CompletableFuture<SendResult<String, Object>> send = kafkaTemplate.send("vhr_topic_01", "test", "hello world");
        try {
            SendResult<String, Object> sendResult = send.get();
            RecordMetadata metadata = sendResult.getRecordMetadata();
            System.out.println("Kafka sync send result: topic=" + metadata.topic() + ", partition=" + metadata.partition() + ", offset=" + metadata.offset());
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }

    /**
     * Asynchronous send with a callback
     */
    @Test
    public void testKafkaAsync() {
        try (Producer<String, Object> producer = kafkaTemplate.getProducerFactory().createProducer()) {
            ProducerRecord<String, Object> record = new ProducerRecord<>("vhr_topic_01", "test", "hello world async");
            // Callback as a lambda; ifPresentOrElse is a JDK 9+ API (this project uses Java 17)
            producer.send(record, (recordMetadata, e) -> Optional.ofNullable(e).ifPresentOrElse(ex -> System.out.println("send failed"),
                    () -> System.out.println("Kafka async send: topic=" + recordMetadata.topic() + ", partition=" + recordMetadata.partition() + ", offset=" + recordMetadata.offset())));
            // The same Callback without a lambda:
            // producer.send(record, new Callback() {
            //     @Override
            //     public void onCompletion(RecordMetadata recordMetadata, Exception e) {
            //         if (Objects.nonNull(e)) {
            //             System.out.println("send failed");
            //         } else {
            //             System.out.println("Kafka async send: topic=" + recordMetadata.topic() + ", partition=" + recordMetadata.partition() + ", offset=" + recordMetadata.offset());
            //         }
            //     }
            // });
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /**
     * Kafka message polling test
     **/
    @Test
    public void testKafKaPoll() {
        try (KafkaConsumer<String, Object> consumer = new KafkaConsumer<>(config.consumerConfigs())) {
            // Subscribe to the topics before consuming; List.of is a JDK 9+ API
            consumer.subscribe(List.of("test", "vhr_topic_01"));
            ConsumerRecords<String, Object> consumerRecords = consumer.poll(Duration.ofMillis(1000L));
            StringBuilder builder = new StringBuilder();
            for (ConsumerRecord<String, Object> record : consumerRecords) {
                builder.append("topic: ").append(record.topic()).append("\t").append("partition: ").append(record.partition()).append("\t, ").append("offset: ").append(record.offset())
                        .append("\t, key: ").append(record.key()).append("\t, value: ").append(record.value()).append("\n\n");
            }
            String msg = builder.toString();
            if (StringUtils.isNotBlank(msg)) {
                System.out.println("Kafka poll result: " + msg);
            }
        }
    }
}
Synchronous send test result:
Asynchronous send test result:
Poll test result:
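One caveat worth spelling out: enable-auto-commit is false in the YAML, yet none of the examples above commit offsets manually, so with auto-offset-reset: earliest the group will re-read the same records on every fresh start. A minimal sketch of a poll that commits after processing, written as another test method in the class above (illustrative, not part of the original tests):

    /**
     * Poll, then commit the offsets manually (enable-auto-commit is false).
     */
    @Test
    public void testPollAndCommit() {
        try (KafkaConsumer<String, Object> consumer = new KafkaConsumer<>(config.consumerConfigs())) {
            consumer.subscribe(List.of("vhr_topic_01"));
            ConsumerRecords<String, Object> records = consumer.poll(Duration.ofMillis(1000L));
            for (ConsumerRecord<String, Object> record : records) {
                System.out.println("processing offset " + record.offset() + ", value=" + record.value());
            }
            // commitSync() commits the offsets of the records returned by the last poll(),
            // so the group will not re-read them on the next run
            if (!records.isEmpty()) {
                consumer.commitSync();
            }
        }
    }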
Additionally, here is a scheduled task that polls Kafka messages:
import cn.hutool.core.thread.ThreadUtil;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.lang3.StringUtils;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;

import java.time.Duration;
import java.util.List;
import java.util.concurrent.ExecutorService;

@Service
@Slf4j
public class KafKaPolTest {

    private final KafkaConfig kafkaConfig;

    public KafKaPolTest(KafkaConfig kafkaConfig) {
        this.kafkaConfig = kafkaConfig;
    }

    private static final ExecutorService executor = ThreadUtil.newExecutor(10, 20);

    // Fires at second 0 of every minute
    @Scheduled(cron = "*/60 * * * * ?")
    public void pool() {
        log.info("Scheduled task triggered");
        executor.execute(() -> {
            System.out.println("Worker thread: " + Thread.currentThread().getName());
            // Option 1: create a consumer via DefaultKafkaConsumerFactory
            // try (Consumer<String, Object> consumer = new DefaultKafkaConsumerFactory<String, Object>(kafkaConfig.consumerConfigs()).createConsumer()) {
            // Option 2: create a consumer with KafkaConsumer directly
            try (Consumer<String, Object> consumer = new KafkaConsumer<>(kafkaConfig.consumerConfigs())) {
                consumer.subscribe(List.of("test", "vhr_topic_01"));
                int maxAttempts = 5; // try at most 5 polls
                int attempts = 0;
                while (attempts < maxAttempts) {
                    StringBuilder builder = new StringBuilder();
                    ConsumerRecords<String, Object> consumerRecords = consumer.poll(Duration.ofMillis(1000L));
                    for (ConsumerRecord<String, Object> record : consumerRecords) {
                        builder.append("topic: ").append(record.topic()).append("\t").append("partition: ").append(record.partition()).append("\t, ").append("offset: ").append(record.offset())
                                .append("\t, key: ").append(record.key()).append("\t, value: ").append(record.value()).append("\n\n");
                    }
                    String msg = builder.toString();
                    if (StringUtils.isNotBlank(msg)) {
                        log.info("Polled messages: {}", msg);
                        break;
                    }
                    attempts++;
                }
            }
        });
    }
}
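For the @Scheduled method above to fire at all, scheduling must be enabled somewhere in the application. If it is not already, a minimal sketch (class name assumed):

import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableScheduling;

// Without @EnableScheduling, @Scheduled methods are never invoked
@Configuration
@EnableScheduling
public class SchedulingConfig {
}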
Testing shows that commenting out the producer and consumer bean definitions in KafkaConfig does not affect the use of KafkaTemplate<String, Object>: Spring Boot's Kafka auto-configuration builds an equivalent template from the spring.kafka.* properties. The trimmed-down KafkaConfig looks like this:
import com.google.common.collect.Maps;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafka;

import java.util.Map;

@Configuration
@EnableKafka
public class KafkaConfig {

    @Value("${spring.kafka.bootstrap-servers}")
    private String bootstrapServers;

    @Value("${spring.kafka.consumer.group-id}")
    private String groupId;

    @Value("${spring.kafka.consumer.enable-auto-commit}")
    private boolean autoCommitEnabled;

    @Value("${spring.kafka.consumer.auto-commit-interval}")
    private int autoCommitInterval;

    @Value("${spring.kafka.consumer.auto-offset-reset}")
    private String autoOffsetReset;

    // The producerFactory(), kafkaTemplate(), kafkaConsumerListener(),
    // kafkaListenerContainerFactory(), and kafkaConsumerConfig() beans from the
    // earlier version are commented out here; Spring Boot's auto-configuration
    // supplies equivalent beans from the spring.kafka.* properties.

    @Bean
    public Map<String, Object> consumerConfigs() {
        Map<String, Object> config = Maps.newHashMapWithExpectedSize(16);
        // Broker addresses used to establish the initial connection
        config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        // The consumer group id
        config.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
        config.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, autoCommitInterval);
        config.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, autoCommitEnabled);
        config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        config.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, autoOffsetReset);
        return config;
    }
}
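With the trimmed configuration, the auto-configured template can still be injected anywhere as before. A minimal sketch of a service built on it (the class and topic are illustrative):

import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class MessageSender {

    private final KafkaTemplate<String, Object> kafkaTemplate;

    public MessageSender(KafkaTemplate<String, Object> kafkaTemplate) {
        // Supplied by Spring Boot's Kafka auto-configuration
        this.kafkaTemplate = kafkaTemplate;
    }

    public void send(String key, Object payload) {
        kafkaTemplate.send("vhr_topic_01", key, payload);
    }
}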
The test results are as follows: