A kafkaSpout example with Storm


[The hand of time, what has it turned over and over]


The previous post covered installing flume-ng + kafka and testing them on a single node. Building on that, this post adds a standalone Storm installation plus a kafkaSpout example, forming a complete single-machine real-time processing pipeline.


1. Installation and startup

(1) Download version 0.10.0 from the official mirror and unpack it:
wget http://mirror.bit.edu.cn/apache/storm/apache-storm-0.10.0/apache-storm-0.10.0.tar.gz

tar -xvzf apache-storm-0.10.0.tar.gz
(2) Edit the configuration and add environment variables
The configuration file apache-storm-0.10.0/conf/storm.yaml needs no changes for a single-machine test.
Add the following lines to ~/.bash_profile:
export STORM_HOME="/home/XX/apache-storm-0.10.0"
PATH=$PATH:${STORM_HOME}/bin
(3) Start the cluster:
storm nimbus &
storm supervisor &
storm ui &

UI address: localhost:8080/index.html

Run jps; if the nimbus and supervisor processes are listed, the cluster started successfully.

2. Code

The topology:
package cn.realtime;

import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.StormSubmitter;
import backtype.storm.generated.StormTopology;
import backtype.storm.spout.SchemeAsMultiScheme;
import backtype.storm.topology.TopologyBuilder;
import storm.kafka.*;

import java.util.Map;

/**
 * Created by maixiaohai on 16/4/21.
 */
public class KafkaTopology {
    public static int NUM_WORKERS = 1;
    public static int NUM_ACKERS = 1;
    public static int MSG_TIMEOUT = 180;
    public static int SPOUT_PARALLELISM_HINT = 1;
    public static int PARSE_BOLT_PARALLELISM_HINT = 1;

    public StormTopology buildTopology(Map map) {
        // the ZooKeeper address comes from realtime.xml
        String zkServer = map.get("zookeeper").toString();
        System.out.println("zkServer: " + zkServer);
        final BrokerHosts zkHosts = new ZkHosts(zkServer);
        // zkRoot "/test" and id "single-point-test" determine where the spout stores its offsets in ZooKeeper
        SpoutConfig kafkaConfig = new SpoutConfig(zkHosts, "YOUR_KAFKA_TOPIC", "/test", "single-point-test");
        kafkaConfig.scheme = new SchemeAsMultiScheme(new StringScheme()); // deserialize each Kafka message as a plain string
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafkaSpout", new KafkaSpout(kafkaConfig), SPOUT_PARALLELISM_HINT);
        builder.setBolt("parseBolt", new ParseBolt(), PARSE_BOLT_PARALLELISM_HINT).shuffleGrouping("kafkaSpout");
        return builder.createTopology();
    }
    public static void main(String[] args) throws Exception {
        System.out.println("===========start===========");
        Map map = XmlHelper.Dom2Map("realtime.xml");
        KafkaTopology kafkaTopology = new KafkaTopology();
        StormTopology stormTopology = kafkaTopology.buildTopology(map);
        Config config = new Config();
        config.setNumWorkers(NUM_WORKERS);
        config.setNumAckers(NUM_ACKERS);
        config.setMessageTimeoutSecs(MSG_TIMEOUT);
        config.setMaxSpoutPending(5000);
//        // for in-process testing, swap StormSubmitter for LocalCluster:
//        LocalCluster cluster = new LocalCluster();
//        cluster.submitTopology("single-point-test", config, stormTopology);
        StormSubmitter.submitTopology("single-point-test", config, stormTopology);
    }
}
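One tuning knob worth mentioning (not used above): by default the spout resumes from the offset it last committed to ZooKeeper under zkRoot + id ("/test" + "single-point-test" here). For a clean re-run you can start from the earliest offset Kafka still holds by adding two lines right after the scheme assignment in buildTopology(). This is only a sketch against storm-kafka 0.10.0; verify the field names in your version:

        kafkaConfig.startOffsetTime = kafka.api.OffsetRequest.EarliestTime(); // consume from the beginning of the topic
        kafkaConfig.ignoreZkOffsets = true; // ignore offsets already stored in ZooKeeper (called forceFromStart in older releases)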
The realtime.xml configuration:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <nimbus>
        <host>localhost</host>
        <thriftPort>6627</thriftPort>
    </nimbus>
    <zookeeper>localhost:2181</zookeeper>
    <kafka>localhost:9092</kafka>
</configuration>
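The XmlHelper used in main() is not shown in the post. Below is a minimal hypothetical sketch that satisfies the usage above, assuming only flat top-level elements such as <zookeeper> are read (nested ones like <nimbus> would need real recursion):

package cn.realtime;

import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch, not from the original post: maps each top-level
 * element of realtime.xml to its text content, which is all
 * buildTopology() needs (it only reads "zookeeper").
 */
public class XmlHelper {
    public static Map<String, String> Dom2Map(String resource) throws Exception {
        InputStream in = XmlHelper.class.getClassLoader().getResourceAsStream(resource);
        NodeList children = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(in).getDocumentElement().getChildNodes();
        Map<String, String> map = new HashMap<String, String>();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node instanceof Element) {
                // element name -> trimmed text content, e.g. "zookeeper" -> "localhost:2181"
                map.put(node.getNodeName(), node.getTextContent().trim());
            }
        }
        return map;
    }
}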
ParseBolt simply prints the incoming tuple value:
package cn.realtime;

import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Tuple;

/**
 * Created by maixiaohai on 16/4/21.
 */
public class ParseBolt extends BaseBasicBolt {
    @Override
    public void execute(Tuple tuple, BasicOutputCollector basicOutputCollector) {
        String word = tuple.getString(0); // StringScheme yields a single string field per message
        System.out.println(word);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {
        // nothing is emitted downstream; this bolt only prints
    }
}
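ParseBolt declares no output fields, so nothing can be chained after it. If a downstream bolt is wanted later, the bolt has to declare and emit fields. A hypothetical variant (this class is not part of the original post):

package cn.realtime;

import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

/**
 * Hypothetical variant of ParseBolt: forwards the raw line so a
 * downstream bolt can consume it via a "line" field.
 */
public class ForwardingParseBolt extends BaseBasicBolt {
    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        String line = tuple.getString(0); // StringScheme emits a single string field
        System.out.println(line);         // keep the original print for debugging
        collector.emit(new Values(line)); // pass the line downstream
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("line")); // field name downstream bolts read
    }
}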
Maven dependencies:
<dependencies>
        <dependency>
            <groupId>org.apache.storm</groupId>
            <artifactId>storm-core</artifactId>
            <version>0.10.0</version>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>log4j-over-slf4j</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-log4j12</artifactId>
                </exclusion>

            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.storm</groupId>
            <artifactId>storm-kafka</artifactId>
            <version>0.10.0</version>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-log4j12</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka_2.10</artifactId>
            <version>0.8.2.1</version>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-log4j12</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <!--<dependency>-->
            <!--<groupId>com.googlecode.json-simple</groupId>-->
            <!--<artifactId>json-simple</artifactId>-->
            <!--<version>1.1.1</version>-->
        <!--</dependency>-->
    </dependencies>
After packaging, run storm jar kafkaTopology.jar cn.realtime.KafkaTopology.
The newly submitted topology shows up in the UI, and the corresponding logs appear under apache-storm-0.10.0/logs/.
Combined with the previous post's flume setup that watches a directory (i.e. agent.sources.source1.type = spooldir),
dropping a file into that directory and saving it makes the matching lines appear in the Storm log, which means the whole pipeline works end to end.
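To verify the Storm side without going through flume, a small producer can push a line straight into the topic using the kafka_2.10 0.8.2.1 client already on the classpath. This class is hypothetical (not in the original post), and the topic name is assumed to match the one passed to SpoutConfig:

package cn.realtime;

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

/**
 * Hypothetical test helper: sends one line into the topic so the
 * topology's output can be checked in the Storm worker log.
 */
public class TestProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "localhost:9092"); // matches <kafka> in realtime.xml
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        Producer<String, String> producer = new Producer<String, String>(new ProducerConfig(props));
        producer.send(new KeyedMessage<String, String>("YOUR_KAFKA_TOPIC", "hello storm"));
        producer.close();
    }
}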

3. Notes

(1) Errors hit along the way
Error 1:
java.lang.NoSuchMethodError: org.apache.zookeeper.ZooKeeper.<init>(Ljava/lang/String;ILorg/apache/zookeeper/Watcher;Z)V at org.apache.curator.utils.DefaultZookeeperFactory.newZooKeeper(DefaultZookeep
Error 2:
java.lang.NoClassDefFoundError: org/json/simple/JSONValue at storm.kafka.DynamicBrokersReader

Errors like these almost always come from pulling in unsuitable versions of Maven dependencies, so the wrong jar ends up on the classpath.
Error 1, for example, means the zookeeper version is wrong. Tracing which dependency brings in zookeeper (mvn dependency:tree is useful here) showed that the kafka_2.10 version was too low;
after bumping it to the current version, error 1 went away.
(2) When packaging, storm-core must be left out of the jar (e.g. give it <scope>provided</scope> in the pom); otherwise it conflicts with the jars Storm already ships.
(3) The kafkaSpout and Storm configuration parameters will be covered in a later post.



