This article describes how to integrate Kafka with a Storm topology in code.
I. Implementation Model
Data flow:
1. A Kafka producer publishes messages to the topic1 topic.
2. A Storm topology contains three components: KafkaSpout, SenqueceBolt, and KafkaBolt. KafkaSpout subscribes to topic1 and hands each message to SenqueceBolt for processing; KafkaBolt then publishes the processed data back to Kafka as topic2 messages.
3. A Kafka consumer consumes the topic2 messages.
II. Topology Implementation
1. Create a Maven project and configure pom.xml
The project depends on three artifacts: storm-core, kafka_2.10, and storm-kafka.
<dependencies>
    <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-core</artifactId>
        <version>0.9.2-incubating</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.10</artifactId>
        <version>0.8.1.1</version>
        <exclusions>
            <exclusion>
                <groupId>org.apache.zookeeper</groupId>
                <artifactId>zookeeper</artifactId>
            </exclusion>
            <exclusion>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-kafka</artifactId>
        <version>0.9.2-incubating</version>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <version>2.4</version>
            <configuration>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id>
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
2. KafkaSpout
KafkaSpout ships with Storm; its source lives at https://github.com/apache/incubator-storm/tree/master/external
To use KafkaSpout you must implement the Scheme interface yourself; it is responsible for parsing the fields you need out of the raw message stream.
import java.io.UnsupportedEncodingException;
import java.util.List;

import backtype.storm.spout.Scheme;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;

public class MessageScheme implements Scheme {

    /* (non-Javadoc)
     * @see backtype.storm.spout.Scheme#deserialize(byte[])
     */
    public List<Object> deserialize(byte[] ser) {
        try {
            // Decode the raw Kafka payload as a UTF-8 string
            String msg = new String(ser, "UTF-8");
            return new Values(msg);
        } catch (UnsupportedEncodingException e) {
        }
        return null;
    }

    /* (non-Javadoc)
     * @see backtype.storm.spout.Scheme#getOutputFields()
     */
    public Fields getOutputFields() {
        return new Fields("msg");
    }
}
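Stripped of the Storm types, the work `deserialize` does is just a UTF-8 decode of the raw Kafka payload. A standalone sketch of that step (plain Java, no Storm dependency; the class name is only illustrative):

```java
import java.nio.charset.StandardCharsets;

public class MessageDecode {
    // Mirrors MessageScheme.deserialize: raw Kafka bytes -> the "msg" field value.
    // StandardCharsets.UTF_8 avoids the checked UnsupportedEncodingException
    // that the original code has to swallow (returning null on failure).
    public static String decode(byte[] ser) {
        return new String(ser, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(payload)); // prints: hello
    }
}
```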
3. SenqueceBolt
SenqueceBolt is trivial: it prepends "I'm " to every message it receives from the spout.
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

public class SenqueceBolt extends BaseBasicBolt {

    /* (non-Javadoc)
     * @see backtype.storm.topology.IBasicBolt#execute(backtype.storm.tuple.Tuple, backtype.storm.topology.BasicOutputCollector)
     */
    public void execute(Tuple input, BasicOutputCollector collector) {
        String word = (String) input.getValue(0);
        String out = "I'm " + word + "!";
        System.out.println("out=" + out);
        collector.emit(new Values(out));
    }

    /* (non-Javadoc)
     * @see backtype.storm.topology.IComponent#declareOutputFields(backtype.storm.topology.OutputFieldsDeclarer)
     */
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("message"));
    }
}
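Isolated from the Storm plumbing, the bolt's transform is a single string concatenation; a self-contained sketch (illustrative class name) you can run to check the formatting rule:

```java
public class SequenceTransform {
    // Same formatting rule as SenqueceBolt.execute: prefix "I'm " and append "!".
    public static String transform(String word) {
        return "I'm " + word + "!";
    }

    public static void main(String[] args) {
        System.out.println(transform("Tom")); // prints: I'm Tom!
    }
}
```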
4. KafkaBolt
KafkaBolt also ships with Storm; it publishes the tuples it receives to Kafka as messages on the configured topic.
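KafkaBolt takes its producer settings and target topic from the topology configuration rather than from its constructor. The snippet below just restates, in isolation, the broker-properties map that the Topology code in the next step stores under "kafka.broker.properties" (key names follow the storm-kafka 0.9.x contract; the broker address is the same placeholder host used there):

```java
import java.util.HashMap;
import java.util.Map;

public class KafkaBoltConfig {
    // Builds the map stored under "kafka.broker.properties" in the Storm Config.
    public static Map<String, String> brokerProperties() {
        Map<String, String> map = new HashMap<String, String>();
        map.put("metadata.broker.list", "node04:9092");                // Kafka broker list
        map.put("serializer.class", "kafka.serializer.StringEncoder"); // message serializer
        return map;
    }

    public static void main(String[] args) {
        System.out.println(brokerProperties());
    }
}
```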
5. Topology
import java.util.HashMap;
import java.util.Map;

import storm.kafka.BrokerHosts;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.ZkHosts;
import storm.kafka.bolt.KafkaBolt;
import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.StormSubmitter;
import backtype.storm.spout.SchemeAsMultiScheme;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.utils.Utils;

public class StormKafkaTopo {

    public static void main(String[] args) throws Exception {
        // ZooKeeper ensemble used by the Kafka cluster
        BrokerHosts brokerHosts = new ZkHosts("node04:2181,node05:2181,node06:2181");
        // Topic KafkaSpout subscribes to, plus the ZooKeeper root path and node id
        // under which the spout stores its offsets
        SpoutConfig spoutConfig = new SpoutConfig(brokerHosts, "topic1", "/zkkafkaspout", "kafkaspout");

        // Configure kafka.broker.properties for KafkaBolt
        Config conf = new Config();
        Map<String, String> map = new HashMap<String, String>();
        // Kafka broker list
        map.put("metadata.broker.list", "node04:9092");
        // serializer.class is the message serializer
        map.put("serializer.class", "kafka.serializer.StringEncoder");
        conf.put("kafka.broker.properties", map);
        // Topic that KafkaBolt writes to
        conf.put("topic", "topic2");

        spoutConfig.scheme = new SchemeAsMultiScheme(new MessageScheme());

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new KafkaSpout(spoutConfig));
        builder.setBolt("bolt", new SenqueceBolt()).shuffleGrouping("spout");
        builder.setBolt("kafkabolt", new KafkaBolt<String, Integer>()).shuffleGrouping("bolt");

        if (args != null && args.length > 0) {
            // Submit to a real cluster
            conf.setNumWorkers(3);
            StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
        } else {
            // Run in local mode for testing
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("Topo", conf, builder.createTopology());
            Utils.sleep(100000);
            cluster.killTopology("Topo");
            cluster.shutdown();
        }
    }
}
III. Testing
1. Use the Kafka console client as the producer, writing to topic1:
bin/kafka-console-producer.sh --broker-list node04:9092 --topic topic1
2. Use the Kafka console client as the consumer, subscribing to topic2:
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic topic2 --from-beginning
3. Run the Storm topology:
bin/storm jar storm-kafka-0.0.1-SNAPSHOT-jar-with-dependencies.jar StormKafkaTopo KafkaStorm
4. Results
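If everything is wired correctly, typing `hello` into the console producer on topic1 should make `I'm hello!` appear on the topic2 consumer. Minus Storm and Kafka themselves, that data path reduces to the decode-then-transform chain below (a plain-Java illustration, not the actual runtime path):

```java
import java.nio.charset.StandardCharsets;

public class PipelineSketch {
    // topic1 payload bytes -> MessageScheme decode -> SenqueceBolt transform -> topic2 payload
    public static String process(byte[] topic1Payload) {
        String msg = new String(topic1Payload, StandardCharsets.UTF_8); // KafkaSpout + MessageScheme
        return "I'm " + msg + "!";                                      // SenqueceBolt
    }

    public static void main(String[] args) {
        byte[] in = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(process(in)); // prints: I'm hello!
    }
}
```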
Source: http://www.tuicool.com/articles/f6RVvq
http://www.sxt.cn/u/756/blog/4584