-
Kafka REST API Performance Test
-
REST vs. Java Client Configuration
The test server is CentOS 7 with 16 cores. The test code runs on the same machine as Confluent, uses the same configuration, and sends through an HttpClientPool:
KEY_SERIALIZER_CLASS_CONFIG | org.apache.kafka.common.serialization.StringSerializer
VALUE_SERIALIZER_CLASS_CONFIG | org.apache.kafka.common.serialization.StringSerializer
BATCH_SIZE_CONFIG | 10000
BUFFER_MEMORY_CONFIG | 33554432
MAX_REQUEST_SIZE_CONFIG | 1048576
LINGER_MS_CONFIG | 0
ACKS_CONFIG | 1
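For reference, the constants above correspond to the raw Kafka producer property names. A minimal sketch of assembling that configuration with plain `java.util.Properties` (the bootstrap address is a placeholder assumption, not part of the original setup):

```java
import java.util.Properties;

public class RestProducerProps {
    // Builds the producer configuration from the table above, using the
    // raw property names behind the ProducerConfig constants.
    public static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);   // placeholder address
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("batch.size", "10000");         // BATCH_SIZE_CONFIG
        props.put("buffer.memory", "33554432");   // BUFFER_MEMORY_CONFIG
        props.put("max.request.size", "1048576"); // MAX_REQUEST_SIZE_CONFIG
        props.put("linger.ms", "0");              // LINGER_MS_CONFIG
        props.put("acks", "1");                   // ACKS_CONFIG
        return props;
    }

    public static void main(String[] args) {
        Properties p = build("localhost:9092");
        System.out.println(p.getProperty("batch.size")); // prints 10000
    }
}
```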
-
Single-threaded test: sending one record at a time vs. sending one batch
- Basic logic
|--Initialize the data set
|
| //one record at a time
|--Initialize total time to 0
|--LOOP over the data set
| |--Start timer
| |--Send one record
| |--Stop timer
| |--Add the interval to the total time
| |--IF validation failed
| | |--Print the failed record
| |--IF END
|--LOOP END
|--Print the total time
|
| //one batch
|--Start timer
|--Send the whole data set as one batch
|--Stop timer
|--IF validation failed
| |--Print the failed record
|--IF END
|--Print the elapsed time
-
Test code
/**
 * Tests the sending throughput of Kafka REST.
 * Target topic: rest_test (partition_num: 1).
 *
 * Assumed imports: java.util.*, java.util.concurrent.*, and
 * org.apache.commons.lang3.time.StopWatch; sendMessage/sendMessageAll are
 * static imports from the project-local KafkaRestTool.
 */
public class TestingSR {
    public static String TOPIC = "rest_test";
    public static String REST_HOST = "xxx";
    public static int REST_PORT = 8085;
    public static String CLIENT_HOST = "xxxx";
    public static int CLIENT_PORT = 9095;
    public static String CONTENT_TYPE = "application/vnd.kafka.binary.v2+json";
    public static String ENCODE = "utf-8";
    public static int[] DATA_SIZE = {10, 100, 500, 1000, 2000, 5000};
    public static int[] POOL_SIZE = {10, 16, 32, 64, 100, 150};

    public static void main(String[] args) throws Exception {
        /**
         * REST: send a data set one record at a time vs. as one batch
         */
        for (int size : DATA_SIZE) {
            System.out.println("<============== Data " + size + " =============>");
            HashMap<String, String> data = new HashMap<>();
            for (int i = 0; i < size; i++) {
                data.put("key" + i, "value" + i);
            }
            ExecutorService executor = Executors.newFixedThreadPool(2);
            Future<?> f1 = executor.submit(new Send1Thread(data));
            Future<?> f2 = executor.submit(new SendAllThread(data));
            f1.get();
            f2.get();
            executor.shutdown();
        }
        /**
         * Client: send a data set one record at a time
         */
        KafkaClientTool<String, String> producer = new KafkaClientTool<>(CLIENT_HOST, CLIENT_PORT);
        for (int size : DATA_SIZE) {
            System.out.println("<============== Data " + size + " =============>");
            HashMap<String, String> data = new HashMap<>();
            for (int i = 0; i < size; i++) {
                data.put("key" + i, "value" + i);
            }
            ExecutorService executor = Executors.newFixedThreadPool(1);
            Future<?> f1 = executor.submit(new ClientSend1Thread(producer, data));
            f1.get();
            executor.shutdown();
        }
        /**
         * Concurrent single-record sending of 10,000 records
         */
        HashMap<Integer, String> data = new HashMap<>();
        for (int i = 0; i < 10000; i++) {
            data.put(i, "aaaaaaaaaa");
        }
        StopWatch stopWatch = new StopWatch();
        for (int size : POOL_SIZE) {
            stopWatch.reset();
            stopWatch.start();
            ExecutorService executor = Executors.newFixedThreadPool(size);
            LinkedList<Future<?>> futures = new LinkedList<>();
            for (int key : data.keySet()) {
                futures.add(
                        executor.submit(
                                new SendThread(String.valueOf(key), data.get(key))
                        )
                );
            }
            for (Future<?> f : futures) {
                f.get();
            }
            executor.shutdown();
            stopWatch.stop();
            System.out.println("Send 1 for " + data.size() + " with " + size + " threads use : " + stopWatch.getTime());
        }
        HttpClientPoolTool.closeConnectionPool();
    }

    /**
     * Kafka-rest thread: sends one record at a time
     */
    public static class Send1Thread implements Runnable {
        private HashMap<String, String> data;

        public Send1Thread(HashMap<String, String> data) {
            this.data = data;
        }

        @Override
        public void run() {
            StopWatch stopWatch = new StopWatch();
            long time = 0L;
            for (Map.Entry<String, String> entry : this.data.entrySet()) {
                stopWatch.reset();
                stopWatch.start();
                String rel = sendMessage(
                        REST_HOST,
                        REST_PORT,
                        TOPIC,
                        0,
                        entry.getKey(),
                        entry.getValue(),
                        CONTENT_TYPE,
                        ENCODE);
                stopWatch.stop();
                time = time + stopWatch.getTime();
                if (!KafkaRestTool.checkSuccess(rel)) { // flag records that failed validation
                    System.out.print("$FAIL$");
                }
            }
            System.out.println("Send 1 for " + data.size() + " use : " + time);
        }
    }

    /**
     * Kafka-rest thread: sends the whole data set as one batch
     */
    public static class SendAllThread implements Runnable {
        private HashMap<String, String> data;

        public SendAllThread(HashMap<String, String> data) {
            this.data = data;
        }

        @Override
        public void run() {
            StopWatch stopWatch = new StopWatch();
            stopWatch.start();
            String rel = sendMessageAll(
                    REST_HOST,
                    REST_PORT,
                    TOPIC,
                    0,
                    this.data,
                    CONTENT_TYPE,
                    ENCODE);
            stopWatch.stop(); // stop before the check, so timing matches Send1Thread
            if (!KafkaRestTool.checkSuccess(rel)) {
                System.out.print("$FAIL$");
            }
            System.out.println("Send all for " + data.size() + " use : " + stopWatch.getTime());
        }
    }

    /**
     * Kafka-rest thread: sends a single record
     */
    public static class SendThread implements Runnable {
        private String key;
        private String value;

        public SendThread(String key, String value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public void run() {
            StopWatch stopWatch = new StopWatch();
            stopWatch.start();
            String rel = sendMessage(
                    REST_HOST,
                    REST_PORT,
                    TOPIC,
                    0,
                    this.key,
                    this.value,
                    CONTENT_TYPE,
                    ENCODE);
            stopWatch.stop();
            if (!KafkaRestTool.checkSuccess(rel)) {
                System.out.print("$FAIL$");
            }
        }
    }

    /**
     * Kafka-client thread: sends one record at a time
     */
    public static class ClientSend1Thread implements Runnable {
        private KafkaClientTool<String, String> producer;
        private HashMap<String, String> data;

        public ClientSend1Thread(KafkaClientTool<String, String> producer, HashMap<String, String> data) {
            this.producer = producer;
            this.data = data;
        }

        @Override
        public void run() {
            StopWatch stopWatch = new StopWatch();
            long time = 0L;
            for (Map.Entry<String, String> entry : this.data.entrySet()) {
                stopWatch.reset();
                stopWatch.start();
                this.producer.sendMessage(
                        TOPIC,
                        0,
                        entry.getKey(),
                        entry.getValue());
                stopWatch.stop();
                time = time + stopWatch.getTime();
            }
            System.out.println("Send 1 for " + data.size() + " use : " + time);
        }
    }

    /**
     * Splits the data set round-robin across poolsize threads: thread threadId
     * gets the keys where key % poolsize == threadId. (Unused in main above.)
     */
    public static HashMap<String, String> getDataUse(int threadId, int poolsize, HashMap<Integer, String> data) {
        HashMap<String, String> rel = new HashMap<>();
        for (int i : data.keySet()) {
            if (i % poolsize == threadId) {
                rel.put(String.valueOf(i), data.get(i));
            }
        }
        return rel;
    }
}
-
Results
One record at a time vs. one batch

Records | REST one at a time (ms) | REST one batch (ms) | Client one at a time (ms)
10 | 79 | 41 | 10
100 | 775 | 42 | 11
500 | 3324 | 68 | 14
1000 | 6366 | 63 | 25
2000 | 12869 | 190 | 35
5000 | 31436 | 781 | 72

Concurrent single-record sending of 10,000 records

Threads | Time (ms)
10 | 9360
16 | 6292
32 | 6646
64 | 5340
100 | 4181
150 | 3879
Every REST request has to build an HTTP header and an entity, and the cost of building the entity grows with the amount of data. A simple way to model the relationship:
Let the total number of records be N, the records per request be n, the header-building time be H, and the entity-building time be E(n). Sending one record at a time (n = 1) then costs:
N * (H + E(1)), which can be treated as N * K for some constant K;
while sending everything as one batch (n = N) costs:
H + E(N), where for large N the H term is negligible, leaving roughly E(N).
Fitting the measurements against a logarithmic curve shows that E(n) does not grow linearly in n, while the one-at-a-time cost is strictly linear in N. So as the total number of records grows, the one-at-a-time cost grows faster than the batch cost, which the measured numbers confirm.
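The linearity claim can be sanity-checked directly against the measured table: dividing each one-at-a-time total by N gives a roughly constant per-record cost K (about 6-8 ms here), while the batch totals stay far below N * K. A small sketch over the table's numbers:

```java
public class CostModelCheck {
    public static void main(String[] args) {
        int[] n = {10, 100, 500, 1000, 2000, 5000};            // records per run
        long[] oneByOne = {79, 775, 3324, 6366, 12869, 31436}; // REST one-at-a-time totals (ms)
        long[] batch = {41, 42, 68, 63, 190, 781};             // REST batch totals (ms)
        for (int i = 0; i < n.length; i++) {
            // Per-record cost of one-at-a-time sending stays in a narrow band,
            // i.e. roughly the constant K = H + E(1) from the model above.
            double k = (double) oneByOne[i] / n[i];
            System.out.printf("N=%-5d K≈%.2f ms/record, batch total=%d ms%n", n[i], k, batch[i]);
        }
    }
}
```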
Hence conclusion one:
"Batch sending over REST beats one-at-a-time sending over REST, because the one-at-a-time path wastes most of its time repeatedly rebuilding the request body."
In both modes, REST is clearly worse than the Client. REST keeps a producer pool: each incoming request is assigned a ProducerTask, which ultimately sends through a KafkaProducer. So when the test code and Confluent run on the same machine, REST loses to the Client because of the extra HTTP request-building time. But when the sending program and Confluent are on different machines, measurements put the ordering at "REST batch >> Client one at a time >> REST one at a time".
Conclusion two:
"When the sending program and Confluent are on the same machine, using the Client directly is best. When they are on different machines, batch sending over REST is best. REST mainly avoids the problems caused by version mismatches between the Client and Kafka, and even people who have no Client, or do not want to use one, can still send messages through plain HTTP."
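For illustration, "plain HTTP" here means POSTing a JSON body to the REST Proxy's /topics/{topic} endpoint with the application/vnd.kafka.binary.v2+json content type used above, in which keys and values are base64-encoded. A minimal sketch of building such a batch body (the helper name is mine; sendMessageAll in the test code presumably produces something similar):

```java
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class RestBatchBody {
    // Builds a v2 binary produce body:
    // {"records":[{"key":"<base64>","value":"<base64>"}, ...]}
    public static String build(Map<String, String> records) {
        Base64.Encoder b64 = Base64.getEncoder();
        StringJoiner recs = new StringJoiner(",", "{\"records\":[", "]}");
        for (Map.Entry<String, String> e : records.entrySet()) {
            recs.add("{\"key\":\"" + b64.encodeToString(e.getKey().getBytes())
                    + "\",\"value\":\"" + b64.encodeToString(e.getValue().getBytes()) + "\"}");
        }
        return recs.toString();
    }

    public static void main(String[] args) {
        Map<String, String> data = new LinkedHashMap<>();
        data.put("key0", "value0");
        System.out.println(build(data));
        // → {"records":[{"key":"a2V5MA==","value":"dmFsdWUw"}]}
    }
}
```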