2121SC@SDUSC
1. LocalCluster vs. StormSubmitter
Submitting a topology to a running Storm cluster: Storm can run your topology on a real cluster. To do so, replace LocalCluster with StormSubmitter and call its submitTopology method, which sends the topology to the cluster.
The code looks like this:
//LocalCluster cluster = new LocalCluster();
//cluster.submitTopology("Count-Word-Topology-With-Refresh-Cache", conf,
//        builder.createTopology());
StormSubmitter.submitTopology("Count-Word-Topology-With-Refresh-Cache", conf,
        builder.createTopology());
//Thread.sleep(1000);
//cluster.shutdown();
Note: when you use StormSubmitter, you cannot control the cluster from your code the way you can with LocalCluster.
Next, package the source into a jar and run the Storm client command to submit the topology to the cluster. If you use Maven, just open a shell in the source directory and run mvn package. Then submit the topology:
storm jar target/Topologies-0.0.1-SNAPSHOT.jar
countword.TopologyMain src/main/resources/words.txt
The topology is now deployed to the cluster.
To stop or kill it, run:
storm kill Count-Word-Topology-With-Refresh-Cache
Note: topology names must be unique.
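Because only the submission call differs between the two modes, a common pattern is to pick the mode from the command line. A minimal sketch, assuming the standard org.apache.storm imports; the "local" argument and the worker count of 3 are illustrative choices, not part of the original code:
public static void main(String[] args) throws Exception {
    TopologyBuilder builder = new TopologyBuilder();
    // ... wire up the spouts and bolts here ...
    Config conf = new Config();
    if (args.length > 0 && "local".equals(args[0])) {
        // In-process cluster for development and tests
        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("Count-Word-Topology-With-Refresh-Cache", conf,
                builder.createTopology());
        Thread.sleep(10000);
        cluster.shutdown();
    } else {
        // Real cluster: request worker processes, then hand the topology to Nimbus
        conf.setNumWorkers(3); // illustrative value
        StormSubmitter.submitTopology("Count-Word-Topology-With-Refresh-Cache", conf,
                builder.createTopology());
    }
}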
2. DRPC topologies
There is a special type of topology called a distributed remote procedure call (DRPC) topology, which uses Storm's distributed nature to execute remote procedure calls (RPC). Storm provides some tools for implementing DRPC. The first is the DRPC server, which sits between clients and the topology: when the topology executes its last bolt, that bolt must emit the RPC request ID together with the result, so that the DRPC server can return the result to the correct client.
Note: a single DRPC server instance can execute many functions. Each function is identified by a unique name.
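For functions to be callable, one or more DRPC servers must be running and known to the cluster. A hedged sketch (the hostnames are placeholders): each server is started with the Storm CLI, and the machines are listed under drpc.servers in storm.yaml so the workers can find them.
storm drpc

drpc.servers:
  - "drpc1.example.com"
  - "drpc2.example.com"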
The second tool Storm provides is LinearDRPCTopologyBuilder, an abstraction that helps build DRPC topologies. The generated topology creates DRPCSpouts, which connect to the DRPC server and emit data to the rest of the topology, and it wraps the bolts so that the result is returned from the last one. All bolts added to a LinearDRPCTopologyBuilder object are executed in order.
The bolt declares its output as follows:
public void declareOutputFields(OutputFieldsDeclarer declarer) {
    declarer.declare(new Fields("id", "result"));
}
Since this is the only bolt in the topology, it must emit the RPC ID and the result.
The execute method performs the addition:
public void execute(Tuple input) {
    // The second field holds the expression to evaluate, e.g. "1+2+3"
    String[] numbers = input.getString(1).split("\\+");
    if (numbers.length < 2) {
        throw new InvalidParameterException("Should be at least 2 numbers");
    }
    Integer added = 0;
    for (String num : numbers) {
        added += Integer.parseInt(num);
    }
    // Emit the RPC request ID (the first field) together with the result
    collector.emit(new Values(input.getValue(0), added));
}
The topology definition that uses the adder bolt looks like this:
public static void main(String[] args) {
    LocalDRPC drpc = new LocalDRPC();
    LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("add");
    builder.addBolt(new AdderBolt(), 2);
    Config conf = new Config();
    conf.setDebug(true);
    LocalCluster cluster = new LocalCluster();
    cluster.submitTopology("drpc-adder-topology", conf,
            builder.createLocalTopology(drpc));
    String result = drpc.execute("add", "1+-1");
    checkResult(result, 0);
    result = drpc.execute("add", "1+1+5+10");
    checkResult(result, 17);
    cluster.shutdown();
    drpc.shutdown();
}
Create the topology builder, add the bolt to the topology, and then test the topology by calling the execute method of the DRPC object (here a LocalDRPC object).
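On a real cluster you would not use LocalDRPC; the client connects to a running DRPC server instead. A hedged sketch using the classic client API: the hostname is a placeholder, 3772 is Storm's default DRPC port, and newer Storm versions also take a config map in the constructor.
// Sketch only: "drpc-server-host" is a placeholder
DRPCClient client = new DRPCClient("drpc-server-host", 3772);
String result = client.execute("add", "1+1+5+10"); // should return "17"
client.close();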
3. Summary of error testing in the executor
Executor transfer multi-threading test:
Some topologies may spawn extra threads inside a component to do the actual processing work and emit the processed results.
This unit test simulates those scenarios at the ExecutorTransfer level and makes sure that tuples sent to the workerTransferQueue from multiple threads are correctly handled and received by the remote worker (the consumer).
The topology simulated by this test is:
{worker1: taskId=1, component="1"} --> {worker2: taskId=2, component="2"}.
public class ExecutorTransferMultiThreadingTest {

    // Test fixtures
    private WorkerState workerState;
    private Map<String, Object> topoConf;
    private JCQueue transferQueue;
    private GeneralTopologyContext generalTopologyContext;
    private int selfTaskId = 1;
    private String sourceComp = "1";
    private int remoteTaskId = 2;
    private String destComp = "2";
    private static String value1 = "string-value";
    private static int value2 = 1234;
    @Before
    public void setup() throws NoSuchFieldException {
        topoConf = Utils.readStormConfig();
        String topologyId = "multi-threaded-topo-test";
        StormTopology stormTopology = createStormTopology();

        WorkerTopologyContext workerTopologyContext = mock(WorkerTopologyContext.class);
        when(workerTopologyContext.getRawTopology()).thenReturn(stormTopology);
        when(workerTopologyContext.getComponentId(selfTaskId)).thenReturn(sourceComp);
        when(workerTopologyContext.getComponentId(remoteTaskId)).thenReturn(destComp);

        workerState = mock(WorkerState.class);
        when(workerState.getWorkerTopologyContext()).thenReturn(workerTopologyContext);
        Map<Integer, JCQueue> receiveQMap = new HashMap<>();
        // The local receive queue does not matter in this test; use a simple mock
        receiveQMap.put(selfTaskId, mock(JCQueue.class));
        when(workerState.getLocalReceiveQueues()).thenReturn(receiveQMap);
        when(workerState.getTopologyId()).thenReturn(topologyId);
        when(workerState.getPort()).thenReturn(6701);
        when(workerState.getMetricRegistry()).thenReturn(new StormMetricRegistry());
        when(workerState.tryTransferRemote(any(), any(), any())).thenCallRealMethod();

        // The actual worker transfer queue used in this test. Its taskId should be -1,
        // but a worker transfer queue with taskId=-1 has already been created by the
        // WorkerTransfer class. The taskId is only used for metrics and does not
        // matter here, so set it to -100 to avoid a collision.
        transferQueue = new JCQueue("worker-transfer-queue", "worker-transfer-queue",
            1024, 0, 1, new WaitStrategyPark(100),
            workerState.getTopologyId(), Constants.SYSTEM_COMPONENT_ID,
            Collections.singletonList(-100), workerState.getPort(),
            workerState.getMetricRegistry());

        // Replace the transferQueue inside WorkerTransfer (inside WorkerState)
        // with the customized transferQueue to be used in this test
        WorkerTransfer workerTransfer = new WorkerTransfer(workerState, topoConf, 2);
        FieldSetter.setField(workerTransfer, workerTransfer.getClass().getDeclaredField("transferQueue"), transferQueue);
        FieldSetter.setField(workerState, workerState.getClass().getDeclaredField("workerTransfer"), workerTransfer);

        generalTopologyContext = mock(GeneralTopologyContext.class);
    }
    @Test
    public void testExecutorTransfer() throws InterruptedException {
        // Every executor has exactly one ExecutorTransfer
        ExecutorTransfer executorTransfer = new ExecutorTransfer(workerState, topoConf);
        executorTransfer.initLocalRecvQueues();
        ExecutorService executorService = Executors.newFixedThreadPool(5);

        // There can be multiple producer threads sending tuples inside one executor.
        // This mimics a multi-threaded component, where a component spawns extra
        // threads to emit tuples.
        int producerTaskNum = 10;
        Runnable[] producerTasks = new Runnable[producerTaskNum];
        for (int i = 0; i < producerTaskNum; i++) {
            producerTasks[i] = createProducerTask(executorTransfer);
        }
        for (Runnable task : producerTasks) {
            executorService.submit(task);
        }

        // Give the producers enough time to insert their messages into the queue
        executorService.awaitTermination(1000, TimeUnit.MILLISECONDS);

        // Consume every tuple in the queue and deserialize them one by one,
        // simulating the remote worker
        KryoTupleDeserializer deserializer = new KryoTupleDeserializer(topoConf, workerState.getWorkerTopologyContext());
        SingleThreadedConsumer consumer = new SingleThreadedConsumer(deserializer, producerTaskNum);
        transferQueue.consume(consumer);
        consumer.finalCheck();
        executorService.shutdown();
    }
    private Runnable createProducerTask(ExecutorTransfer executorTransfer) {
        return new Runnable() {
            Tuple tuple = new TupleImpl(generalTopologyContext, new Values(value1, value2), sourceComp, selfTaskId, "default");
            AddressedTuple addressedTuple = new AddressedTuple(remoteTaskId, tuple);

            @Override
            public void run() {
                executorTransfer.tryTransfer(addressedTuple, null);
            }
        };
    }

    private StormTopology createStormTopology() {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout(sourceComp, new TestWordSpout(true), 1);
        builder.setBolt(destComp, new TestWordCounter(), 1).fieldsGrouping(sourceComp, new Fields("word"));
        return builder.createTopology();
    }
    private static class SingleThreadedConsumer implements JCQueue.Consumer {
        KryoTupleDeserializer deserializer;
        int numMessages;
        int msgCount = 0;

        public SingleThreadedConsumer(KryoTupleDeserializer deserializer, int numMessages) {
            this.deserializer = deserializer;
            this.numMessages = numMessages;
        }

        /**
         * Multiple producers send messages to the queue concurrently; this consumer
         * receives them one by one and tries to deserialize them. Any problem or
         * exception in the process essentially means data corruption has occurred.
         * @param o the received object
         */
        @Override
        public void accept(Object o) {
            TaskMessage taskMessage = (TaskMessage) o;
            TupleImpl receivedTuple = deserializer.deserialize(taskMessage.message());
            Assert.assertEquals(receivedTuple.getValue(0), value1);
            Assert.assertEquals(receivedTuple.getValue(1), value2);
            msgCount++;
        }

        // Makes sure every message sent by the producers was received by the consumer
        public void finalCheck() {
            Assert.assertEquals(numMessages, msgCount);
        }

        @Override
        public void flush() {
        }
    }
}
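To run this test yourself, note that it lives in Apache Storm's storm-client module; assuming the standard Maven Surefire setup (a hedged assumption about the module layout), a single test class can be run with:
mvn test -Dtest=ExecutorTransferMultiThreadingTest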
Reference: https://ifeve.com/getting-started-with-storm-3/