1. Configure the Kafka ingestion task in Druid. You can refer to my related article for the details.
2. Send data to Kafka. (Note: when producing records to Kafka, pay close attention to the timestampSpec configuration in dataSchema. If a timestamp field is configured, it must be assigned a value in every record you write; otherwise Druid will not load the data. A sketch of the relevant spec section follows the code below.)
package com;

import com.alibaba.fastjson.JSONObject;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.UUID;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

/**
 * Sends test JSON messages to a Kafka topic for Druid to ingest.
 *
 * @author panqingcui
 */
public class VisitorStatisticsProducer {

    private static final Logger LOGGER = LogManager.getLogger();

    public static void main(String[] args) throws ExecutionException, InterruptedException {
        // Standard producer configuration; adjust bootstrap.servers to your own cluster.
        Properties props = new Properties();
        props.put("bootstrap.servers", "192.168.2.176:9092");
        props.put("acks", "all");
        props.put("retries", 0);
        props.put("batch.size", 16384);
        props.put("linger.ms", 10);
        props.put("buffer.memory", 33554432);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // Topic consumed by the Druid Kafka ingestion task (e.g. wikiticker or visitor_statistics).
        String topic = "visitor_statistics";
        Producer<String, String> producer = new KafkaProducer<>(props);

        List<String> types = new ArrayList<>();
        types.add("Android");
        types.add("Apple Phone");
        types.add("Windows Phone");

        // Sample wikiticker record for reference:
        // {"isRobot":false,"channel":"#de.wikipedia","cityName":"Soest","timestamp":"2016-06-27T21:30:50.804Z","flags":"","isUnpatrolled":false,"page":"Fußball-Europameisterschaft 2016/England","countryName":"Germany","regionIsoCode":"NW","diffUrl":"https://de.wikipedia.org/w/index.php?diff=155683760&oldid=155683358","added":93,"metroCode":null,"comment":"alles bei allen Artikeln vereinheitlichen","commentLength":41,"isNew":false,"isMinor":false,"delta":93,"countryIsoCode":"DE","isAnonymous":true,"user":"80.187.112.223","regionName":"North Rhine-Westphalia","deltaBucket":0.0,"deleted":0,"namespace":"Main"}
        for (int i = 0; i < 1; i++) {
            JSONObject jsonObject = new JSONObject();
            String id = UUID.randomUUID().toString();
            jsonObject.put("new_user_id", id);
            if (i % 2 == 0) {
                jsonObject.put("user_id", id);
            } else {
                jsonObject.put("click_userid", id);
            }
            if (i % 4 == 0) {
                jsonObject.put("is_new", true);
            } else {
                jsonObject.put("is_new", false);
            }
            // jsonObject.put("device_type", types.get(i % types.size()));
            // jsonObject.put("isRobot", false);
            // jsonObject.put("channel", "#de.wikipedia");
            // jsonObject.put("cityName", "Soest");

            // The timestamp field must be set, otherwise Druid will not load the record.
            jsonObject.put("timestamp", "2017-06-27T21:30:50.804Z");

            Future<RecordMetadata> future =
                    producer.send(new ProducerRecord<>(topic, "test", jsonObject.toJSONString()));
            LOGGER.info(future.get().toString());
        }
        producer.close();
    }
}
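For reference, here is a minimal sketch of the timestampSpec portion of a Kafka ingestion spec mentioned in step 2. The dataSource and topic names (visitor_statistics) match the producer above; the surrounding fields and the dimension list are illustrative assumptions, and the exact layout depends on your Druid version, so check it against your own spec.

{
  "type": "kafka",
  "dataSchema": {
    "dataSource": "visitor_statistics",
    "parser": {
      "type": "string",
      "parseSpec": {
        "format": "json",
        "timestampSpec": {
          "column": "timestamp",
          "format": "iso"
        },
        "dimensionsSpec": {
          "dimensions": ["user_id", "new_user_id", "click_userid", "is_new"]
        }
      }
    }
  },
  "ioConfig": {
    "topic": "visitor_statistics"
  }
}

The column value ("timestamp") must match the field name written by the producer; since the producer always sets it to an ISO 8601 string, "format": "iso" fits here.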
3. If you encounter exceptions when sending data to Kafka, see my article on Java Kafka write exceptions.
This article describes one way to send data to Kafka from Java, with a concrete code example showing how to configure a KafkaProducer, build a JSON message body, and make sure the timestamp field is set correctly so that Druid can load the data.
