Storm Integration with Kafka

This article describes how to integrate Storm with Kafka in your topology code.

  I. Implementation Model

   Data flow:

    1. A Kafka producer publishes messages to the topic1 topic.

    2. A Storm topology contains three components: KafkaSpout, SenqueceBolt, and KafkaBolt. KafkaSpout subscribes to the topic1 messages and passes

      them to SenqueceBolt for processing; finally, KafkaBolt publishes the processed data back to Kafka as topic2 messages.

    3. A Kafka consumer consumes the messages on topic2.
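
The three steps above boil down to a single per-message string transformation. As a minimal, dependency-free sketch (plain Java, no Storm or Kafka classes; the class name is made up for illustration), each string consumed from topic1 ends up on topic2 like this:

```java
// Dependency-free sketch of the per-message transformation this pipeline
// performs: each string read from topic1 is prefixed with "I'm " and
// suffixed with "!" before being published to topic2.
public class SequenceTransform {
    // Mirrors what SenqueceBolt.execute() does to each tuple
    public static String transform(String word) {
        return "I'm " + word + "!";
    }

    public static void main(String[] args) {
        System.out.println(transform("storm"));  // prints: I'm storm!
    }
}
```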

    


  II. Topology Implementation

    1. Create a Maven project and configure pom.xml

      The project depends on three artifacts: storm-core, kafka_2.10, and storm-kafka.

<dependencies>
    <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-core</artifactId>
        <version>0.9.2-incubating</version>
        <scope>provided</scope>
    </dependency>

    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.10</artifactId>
        <version>0.8.1.1</version>
        <exclusions>
            <exclusion>
                <groupId>org.apache.zookeeper</groupId>
                <artifactId>zookeeper</artifactId>
            </exclusion>
            <exclusion>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
            </exclusion>
        </exclusions>
    </dependency>

    <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-kafka</artifactId>
        <version>0.9.2-incubating</version>
    </dependency>
</dependencies>

<build>
    <plugins>
        <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <version>2.4</version>
            <configuration>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id>
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>

 

    2. KafkaSpout

      KafkaSpout ships with Storm; its source code is at https://github.com/apache/incubator-storm/tree/master/external

      When using KafkaSpout you must implement the Scheme interface yourself; it is responsible for parsing the needed data out of the raw message stream.

public class MessageScheme implements Scheme {

    /* (non-Javadoc)
     * @see backtype.storm.spout.Scheme#deserialize(byte[])
     */
    public List<Object> deserialize(byte[] ser) {
        try {
            // Decode the raw Kafka message bytes as a UTF-8 string
            String msg = new String(ser, "UTF-8");
            return new Values(msg);
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported; fall through and drop the tuple
        }
        return null;
    }

    /* (non-Javadoc)
     * @see backtype.storm.spout.Scheme#getOutputFields()
     */
    public Fields getOutputFields() {
        return new Fields("msg");
    }
}
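
Outside of Storm, the decoding step can be sketched in plain Java (no Storm classes; Storm's Values is essentially an ArrayList&lt;Object&gt;, so a plain List stands in for it here, and the class name is made up for illustration):

```java
import java.io.UnsupportedEncodingException;
import java.util.Arrays;
import java.util.List;

// Dependency-free sketch of MessageScheme.deserialize: decode the raw Kafka
// message bytes as UTF-8 and wrap the resulting String in a one-element list.
public class SchemeDemo {
    public static List<Object> deserialize(byte[] ser) {
        try {
            return Arrays.<Object>asList(new String(ser, "UTF-8"));
        } catch (UnsupportedEncodingException e) {
            return null; // unreachable in practice: UTF-8 is always supported
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(deserialize("hello".getBytes("UTF-8")));  // [hello]
    }
}
```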

    3. SenqueceBolt

       SenqueceBolt is trivial: it prefixes each message received from the spout with "I'm" and appends "!".

public class SenqueceBolt extends BaseBasicBolt {

    /* (non-Javadoc)
     * @see backtype.storm.topology.IBasicBolt#execute(backtype.storm.tuple.Tuple, backtype.storm.topology.BasicOutputCollector)
     */
    public void execute(Tuple input, BasicOutputCollector collector) {
        String word = (String) input.getValue(0);
        String out = "I'm " + word + "!";
        System.out.println("out=" + out);
        collector.emit(new Values(out));
    }

    /* (non-Javadoc)
     * @see backtype.storm.topology.IComponent#declareOutputFields(backtype.storm.topology.OutputFieldsDeclarer)
     */
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("message"));
    }
}

    4. KafkaBolt

      KafkaBolt also ships with Storm; it publishes the tuples it receives to Kafka as messages on the configured topic.

    5. Topology

public class StormKafkaTopo {
    public static void main(String[] args) throws Exception {
        // Configure the ZooKeeper addresses
        BrokerHosts brokerHosts = new ZkHosts("node04:2181,node05:2181,node06:2181");
        // Configure the Kafka topic to subscribe to, plus the ZooKeeper root path and id used for offset storage
        SpoutConfig spoutConfig = new SpoutConfig(brokerHosts, "topic1", "/zkkafkaspout", "kafkaspout");

        // Configure kafka.broker.properties for KafkaBolt
        Config conf = new Config();
        Map<String, String> map = new HashMap<String, String>();
        // Kafka broker address
        map.put("metadata.broker.list", "node04:9092");
        // serializer.class is the message serializer class
        map.put("serializer.class", "kafka.serializer.StringEncoder");
        conf.put("kafka.broker.properties", map);
        // The topic that KafkaBolt publishes to
        conf.put("topic", "topic2");

        spoutConfig.scheme = new SchemeAsMultiScheme(new MessageScheme());
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new KafkaSpout(spoutConfig));
        builder.setBolt("bolt", new SenqueceBolt()).shuffleGrouping("spout");
        builder.setBolt("kafkabolt", new KafkaBolt<String, Integer>()).shuffleGrouping("bolt");

        if (args != null && args.length > 0) {
            // Submit to the cluster, using args[0] as the topology name
            conf.setNumWorkers(3);
            StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
        } else {
            // Run in local mode for about 100 seconds, then shut down
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("Topo", conf, builder.createTopology());
            Utils.sleep(100000);
            cluster.killTopology("Topo");
            cluster.shutdown();
        }
    }
}

 


  III. Testing and Verification

    1. Use the Kafka console client as a producer to publish messages on topic1:

      bin/kafka-console-producer.sh --broker-list node04:9092 --topic topic1

    2. Use the Kafka console client as a consumer subscribed to topic2:

      bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic topic2 --from-beginning

    3. Run the Storm topology:

      bin/storm jar storm-kafka-0.0.1-SNAPSHOT-jar-with-dependencies.jar StormKafkaTopo KafkaStorm

    4. Results: each line typed into the producer console should appear in the consumer console prefixed with "I'm " and suffixed with "!".

Source: http://www.tuicool.com/articles/f6RVvq

http://www.sxt.cn/u/756/blog/4584
