storm-bolt (3)

2021SC@SDUSC

1. LocalCluster vs. StormSubmitter

Submitting your topology to a running Storm cluster: Storm can run your topology on a real cluster. To do so, replace LocalCluster with StormSubmitter and use its submitTopology method, which sends the topology to the cluster.

The code is as follows:

//LocalCluster cluster = new LocalCluster();
//cluster.submitTopology("Count-Word-Topology-With-Refresh-Cache", conf,
//        builder.createTopology());
StormSubmitter.submitTopology("Count-Word-Topology-With-Refresh-Cache", conf,
        builder.createTopology());
//Thread.sleep(1000);
//cluster.shutdown();

Note: when using StormSubmitter, you cannot control the cluster from your code the way you can with LocalCluster.
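Because the code gives up programmatic control, the topology's runtime behavior is instead configured up front through the Config object before submitting. A minimal sketch, assuming the same builder as above (setNumWorkers is a standard Config option; three workers is just an example value):

Config conf = new Config();
conf.setNumWorkers(3); // example: request three worker processes from the cluster
StormSubmitter.submitTopology("Count-Word-Topology-With-Refresh-Cache", conf,
        builder.createTopology());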

Next, pack the source code into a jar and run the Storm client command to submit the topology to the cluster. If you are already using Maven, just go to the source directory on the command line and run mvn package. Then submit the resulting jar with the Storm client:

storm jar target/Topologies-0.0.1-SNAPSHOT.jar countword.TopologyMain src/main/resources/words.txt

The topology is now published to the cluster.

To stop or kill it, run:

storm kill Count-Word-Topology-With-Refresh-Cache

Note: topology names must be unique.

2. DRPC topologies

There is a special type of topology called a distributed remote procedure call (DRPC) topology, which exploits Storm's distributed nature to execute remote procedure calls (RPC). Storm provides some tools for implementing DRPC. The first is the DRPC server, which acts as a connector between the client and the topology: it receives a function invocation, assigns it a request ID, and feeds it into the topology. When the topology executes its last bolt, that bolt must emit the RPC request ID together with the result, so that the DRPC server can return the result to the right client.

Note: a single DRPC server instance can serve many functions. Each function is identified by a unique name.

The second tool Storm provides is LinearDRPCTopologyBuilder, an abstraction that helps build DRPC topologies. The topology it generates creates DRPCSpouts, which connect to the DRPC server and emit data to the rest of the topology, and it wraps the bolts so that the result is returned from the last bolt. All bolts added to a LinearDRPCTopologyBuilder object are executed in sequence.

The bolt declares its output as follows:

public void declareOutputFields(OutputFieldsDeclarer declarer) {
    declarer.declare(new Fields("id", "result"));
}

This is the only bolt in the topology, so it must emit the RPC ID along with the result. The execute method performs the addition:

public void execute(Tuple input) {
    String[] numbers = input.getString(1).split("\\+");
    Integer added = 0;
    if (numbers.length < 2) {
        throw new InvalidParameterException("Should be at least 2 numbers");
    }
    for (String num : numbers) {
        added += Integer.parseInt(num);
    }
    collector.emit(new Values(input.getValue(0), added));
}
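The execute method above uses a collector field that the listing does not show; it is captured in prepare. A minimal sketch of the surrounding class, assuming it extends BaseRichBolt (a BaseBasicBolt variant would instead receive the collector as an execute argument) and using Storm 2.x package names:

import java.util.Map;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.base.BaseRichBolt;

public class AdderBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map<String, Object> topoConf, TopologyContext context,
                        OutputCollector collector) {
        // Keep the collector so execute() can emit the id/result pair later.
        this.collector = collector;
    }

    // declareOutputFields(...) and execute(...) are exactly the two methods
    // shown above; adding them here completes the class.
}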

The topology containing the adder bolt is defined as follows:

public static void main(String[] args) {
    LocalDRPC drpc = new LocalDRPC();

    LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("add");
    builder.addBolt(new AdderBolt(), 2);

    Config conf = new Config();
    conf.setDebug(true);

    LocalCluster cluster = new LocalCluster();
    cluster.submitTopology("drpc-adder-topology", conf,
            builder.createLocalTopology(drpc));

    String result = drpc.execute("add", "1+-1");
    checkResult(result, 0);
    result = drpc.execute("add", "1+1+5+10");
    checkResult(result, 17);

    cluster.shutdown();
    drpc.shutdown();
}

We create a topology builder, add the bolt to the topology, and test the topology by calling the execute method of the DRPC object (here a LocalDRPC instance).
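For comparison, a minimal sketch of running the same function against a real cluster, assuming a DRPC server reachable at a hypothetical host "drpc-server" on the default port 3772 (DRPCClient's constructor arguments vary across Storm versions):

// Submit with the remote variant of the builder instead of createLocalTopology.
StormSubmitter.submitTopology("drpc-adder-topology", conf,
        builder.createRemoteTopology());

// From any client process, invoke the "add" function through the DRPC server.
DRPCClient client = new DRPCClient(conf, "drpc-server", 3772);
String result = client.execute("add", "1+2"); // expected "3"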

3. Summary of executor error testing

Executor transfer multithreading test:

Some topologies may spawn extra threads in their components to do the real processing work and emit the processed results.

This unit test simulates those scenarios at the ExecutorTransfer level and makes sure that tuples sent to the workerTransferQueue from multiple threads are properly handled and received by the remote worker (the consumer).

The topology simulated by this test is:
{worker1: taskId=1, component="1"} -> {worker2: taskId=2, component="2"}.
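To make the simulated scenario concrete, below is a hypothetical bolt of the kind this test stands in for: a component that emits from a thread it spawns itself. The class name AsyncEmitBolt and its processing logic are invented for illustration, assuming a Storm version (such as 2.x, where this test lives) that supports emitting from component-spawned threads:

import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

// Hypothetical multithreaded component: tuples are processed and emitted
// from a thread the bolt spawns itself, not from the executor thread.
public class AsyncEmitBolt extends BaseRichBolt {
    private OutputCollector collector;
    private ExecutorService background;

    @Override
    public void prepare(Map<String, Object> topoConf, TopologyContext context,
                        OutputCollector collector) {
        this.collector = collector;
        this.background = Executors.newSingleThreadExecutor();
    }

    @Override
    public void execute(Tuple input) {
        // The emit happens on the background thread; this is the situation
        // ExecutorTransferMultiThreadingTest is designed to cover.
        background.submit(() ->
                collector.emit(new Values(input.getString(0).toUpperCase())));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}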

public class ExecutorTransferMultiThreadingTest {
// Field definitions
private WorkerState workerState;
private Map<String, Object> topoConf;
private JCQueue transferQueue;
private GeneralTopologyContext generalTopologyContext;
private int selfTaskId = 1;
private String sourceComp = "1";
private int remoteTaskId = 2;
private String destComp = "2";
private static String value1 = "string-value";
private static int value2 = 1234;

@Before
public void setup() throws NoSuchFieldException {
    topoConf = Utils.readStormConfig();
    String topologyId = "multi-threaded-topo-test";
    StormTopology stormTopology = createStormTopology();

    WorkerTopologyContext workerTopologyContext = mock(WorkerTopologyContext.class);
    when(workerTopologyContext.getRawTopology()).thenReturn(stormTopology);
    when(workerTopologyContext.getComponentId(selfTaskId)).thenReturn(sourceComp);
    when(workerTopologyContext.getComponentId(remoteTaskId)).thenReturn(destComp);

    workerState = mock(WorkerState.class);
    when(workerState.getWorkerTopologyContext()).thenReturn(workerTopologyContext);
    Map<Integer, JCQueue> receiveQMap = new HashMap<>();

    // The local receive queues don't matter in this test; mock them simply.
    receiveQMap.put(selfTaskId, mock(JCQueue.class));
    when(workerState.getLocalReceiveQueues()).thenReturn(receiveQMap);
    when(workerState.getTopologyId()).thenReturn(topologyId);
    when(workerState.getPort()).thenReturn(6701);
    when(workerState.getMetricRegistry()).thenReturn(new StormMetricRegistry());
    when(workerState.tryTransferRemote(any(), any(), any())).thenCallRealMethod();

    // This is the actual worker transfer queue used in this test. The transfer
    // queue's taskId should be -1, but a worker transfer queue with taskId=-1
    // has already been initialized by the WorkerTransfer class. The taskId is
    // only used for metrics and doesn't matter here, so set it to -100 to
    // avoid a collision.
    transferQueue = new JCQueue("worker-transfer-queue", "worker-transfer-queue",
        1024, 0, 1, new WaitStrategyPark(100),
        workerState.getTopologyId(), Constants.SYSTEM_COMPONENT_ID,
        Collections.singletonList(-100), workerState.getPort(),
        workerState.getMetricRegistry());

    // Replace the transferQueue inside WorkerTransfer (inside WorkerState)
    // with the customized transferQueue used in this test.
    WorkerTransfer workerTransfer = new WorkerTransfer(workerState, topoConf, 2);
    FieldSetter.setField(workerTransfer, workerTransfer.getClass().getDeclaredField("transferQueue"), transferQueue);
    FieldSetter.setField(workerState, workerState.getClass().getDeclaredField("workerTransfer"), workerTransfer);

    generalTopologyContext = mock(GeneralTopologyContext.class);
}

@Test
public void testExecutorTransfer() throws InterruptedException {
    // There is one ExecutorTransfer per executor.
    ExecutorTransfer executorTransfer = new ExecutorTransfer(workerState, topoConf);
    executorTransfer.initLocalRecvQueues();
    ExecutorService executorService = Executors.newFixedThreadPool(5);

    // Each executor may have multiple producer threads sending tuples. This
    // mimics a multithreaded component that spawns extra threads to emit tuples.
    int producerTaskNum = 10;
    Runnable[] producerTasks = new Runnable[producerTaskNum];
    for (int i = 0; i < producerTaskNum; i++) {
        producerTasks[i] = createProducerTask(executorTransfer);
    }
    for (Runnable task : producerTasks) {
        executorService.submit(task);
    }

    // Give the producers enough time to insert their messages into the queue.
    executorService.awaitTermination(1000, TimeUnit.MILLISECONDS);

    // Consume all the tuples in the queue and deserialize them one by one,
    // simulating the remote worker.
    KryoTupleDeserializer deserializer = new KryoTupleDeserializer(topoConf, workerState.getWorkerTopologyContext());
    SingleThreadedConsumer consumer = new SingleThreadedConsumer(deserializer, producerTaskNum);
    transferQueue.consume(consumer);
    consumer.finalCheck();
    executorService.shutdown();
}

private Runnable createProducerTask(ExecutorTransfer executorTransfer) {
    return new Runnable() {
        Tuple tuple = new TupleImpl(generalTopologyContext, new Values(value1, value2), sourceComp, selfTaskId, "default");
        AddressedTuple addressedTuple = new AddressedTuple(remoteTaskId, tuple);

        @Override
        public void run() {
            executorTransfer.tryTransfer(addressedTuple, null);
        }
    };
}

private StormTopology createStormTopology() {
    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout(sourceComp, new TestWordSpout(true), 1);
    builder.setBolt(destComp, new TestWordCounter(), 1).fieldsGrouping(sourceComp, new Fields("word"));
    return builder.createTopology();
}

private static class SingleThreadedConsumer implements JCQueue.Consumer {
    KryoTupleDeserializer deserializer;
    int numMessages;
    int msgCount = 0;

    public SingleThreadedConsumer(KryoTupleDeserializer deserializer, int numMessages) {
        this.deserializer = deserializer;
        this.numMessages = numMessages;
    }

    /**
     * Multiple producers send messages to the queue concurrently, and the
     * consumer receives the messages one by one and tries to deserialize
     * them. Any problem or exception in the process essentially means data
     * corruption happened.
     *
     * @param o the received object
     */
    @Override
    public void accept(Object o) {
        TaskMessage taskMessage = (TaskMessage) o;
        TupleImpl receivedTuple = deserializer.deserialize(taskMessage.message());
        Assert.assertEquals(receivedTuple.getValue(0), value1);
        Assert.assertEquals(receivedTuple.getValue(1), value2);
        msgCount++;
    }

    // This makes sure every message sent by a producer is received by the consumer.
    public void finalCheck() {
        Assert.assertEquals(numMessages, msgCount);
    }

    @Override
    public void flush() {
    }
}
}

Reference: https://ifeve.com/getting-started-with-storm-3/
