// Imports assume Storm 1.x or later (org.apache.storm.*);
// releases before 1.0 use backtype.storm.* instead.
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.topology.TopologyBuilder;

public class MessageTopology {
    public static void main(String[] args) throws Exception {
        // Declare the components
        MessageSpout messageSpout = new MessageSpout();
        SpliterBolt splitBolt = new SpliterBolt();
        PrintBolt printBolt = new PrintBolt();
        // Build the topology
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout_id", messageSpout);
        builder.setBolt("splitBolt_id", splitBolt, 1).shuffleGrouping("spout_id");
        builder.setBolt("printBolt_id", printBolt, 1).shuffleGrouping("splitBolt_id");
        Config cfg = new Config();
        cfg.setDebug(true);
        cfg.setNumWorkers(2);
        // Submit to an in-process local cluster, let it run for 20 seconds, then stop
        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("topo1", cfg, builder.createTopology());
        Thread.sleep(20000);
        cluster.shutdown();
    }
}
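
The MessageTopology class does three things:
1. Declares the data source MessageSpout, the string-splitting SpliterBolt, and the persistence bolt PrintBolt
2. Wires the topology in the order spout --> splitBolt --> printBolt
3. Submits the topology to a local cluster and runs it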
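
// Imports assume Storm 1.x or later (org.apache.storm.*).
import java.util.Map;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

public class MessageSpout extends BaseRichSpout {
    private static final long serialVersionUID = 1L;
    private SpoutOutputCollector conllector;
    private boolean flag = false;
    private String[] subjects = new String[]{
        "groovy,oeacnbase",
        "openfire,restful",
        "flume,activiti",
        "hadoop,hbase",
        "spark,sqoop"
    };
    private int index = 0;

    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.conllector = collector;
    }

    public void nextTuple() {
        if (index < subjects.length) {
            String sub = subjects[index];
            // The array index doubles as the message ID, which enables ack/fail callbacks
            conllector.emit(new Values(sub), index);
            index++;
        }
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("subjects"));
    }

    @Override
    public void ack(Object msgId) {
        //super.ack(msgId);
        System.out.println("[SUCCESS] msgId=" + msgId);
    }

    @Override
    public void fail(Object msgId) {
        //super.fail(msgId);
        System.out.println("[FAILED] msgId=" + msgId);
        System.out.println("Resending message");
        // Look the payload up again by its message ID and re-emit it
        this.conllector.emit(new Values(subjects[(Integer) msgId]), msgId);
        System.out.println("Resend complete");
    }
}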
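
The MessageSpout class supplies the data for the whole guarantee mechanism.
The overridden open(), nextTuple(), and declareOutputFields() methods need no further explanation.
The methods to focus on are ack() and fail().
Storm calls the spout's ack() once the tuple, and every tuple that downstream bolts derived from it, has been processed successfully. If any bolt fails one of them, or processing times out, Storm calls fail() instead. Both callbacks fire only because nextTuple() emits each tuple with a message ID.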

@Override
public void ack(Object msgId) {
    //super.ack(msgId);
    System.out.println("[SUCCESS] msgId=" + msgId);
}

@Override
public void fail(Object msgId) {
    //super.fail(msgId);
    System.out.println("[FAILED] msgId=" + msgId);
    System.out.println("Resending message");
    this.conllector.emit(new Values(subjects[(Integer) msgId]), msgId);
    System.out.println("Resend complete");
}
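
One thing the fail() method above does not guard against is a message that fails every time: it would be re-emitted forever. The following is a minimal sketch of one way to cap replays, written in plain Java so it runs without Storm. RetryTracker, maxRetries, shouldReplay(), and succeeded() are hypothetical names introduced here for illustration; they are not part of Storm's API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Storm: counts failures per msgId so that
// a spout's fail() can stop replaying after maxRetries attempts.
class RetryTracker {
    private final int maxRetries;
    private final Map<Object, Integer> failures = new HashMap<>();

    RetryTracker(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    // Intended to be called from fail(msgId).
    // Returns true if the message should be re-emitted.
    boolean shouldReplay(Object msgId) {
        int count = failures.merge(msgId, 1, Integer::sum);
        if (count > maxRetries) {
            failures.remove(msgId); // give up and forget this message
            return false;
        }
        return true;
    }

    // Intended to be called from ack(msgId): drop the failure count.
    void succeeded(Object msgId) {
        failures.remove(msgId);
    }

    public static void main(String[] args) {
        RetryTracker tracker = new RetryTracker(2);
        System.out.println(tracker.shouldReplay(1)); // true  (1st failure)
        System.out.println(tracker.shouldReplay(1)); // true  (2nd failure)
        System.out.println(tracker.shouldReplay(1)); // false (limit exceeded)
    }
}
```

In the spout, the re-emit inside fail() would then be wrapped in `if (tracker.shouldReplay(msgId)) { ... }`, and ack() would call `tracker.succeeded(msgId)`. What to do with a message that exhausts its retries (log it, park it somewhere) is left open here.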
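
// Imports assume Storm 1.x or later (org.apache.storm.*).
import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.IRichBolt;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class SpliterBolt implements IRichBolt {
    private static final long serialVersionUID = 1L;
    private boolean flag = false;
    private OutputCollector collector;

    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    public void execute(Tuple input) {
        try {
            String subjects = input.getStringByField("subjects");
            // Uncomment to simulate a one-time failure on "openfire,restful"
            /*if(!flag && subjects.equals("openfire,restful")) {
                this.flag = true;
                int i = 10/0;
            }*/
            String[] sub = subjects.split(",");
            for (String word : sub) {
                // Anchor each emitted word to the input tuple
                this.collector.emit(input, new Values(word));
            }
            this.collector.ack(input);
        } catch (Exception e) {
            e.printStackTrace();
            this.collector.fail(input);
        }
    }

    public void cleanup() { }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }

    public Map<String, Object> getComponentConfiguration() {
        return null;
    }
}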
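
In SpliterBolt, the statement int i = 10/0; in execute() simulates a processing failure (it is commented out in the listing above; uncomment the block to try it). Because flag starts as false, the statement runs exactly once, the first time subjects.equals("openfire,restful") is true. The resulting exception is caught by the try/catch, which calls this.collector.fail(input); to tell the spout that the tuple failed, so the spout sends the message again. Note that every emit must carry the input tuple as its anchor, this.collector.emit(input, new Values(word));, and that every input tuple must be marked explicitly:
Success: this.collector.ack(input);
Failure: this.collector.fail(input);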
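
Why anchoring matters can be illustrated with a small model. Storm's documentation describes the acker as keeping, per spout message, a 64-bit value that is the XOR of the IDs of all tuples emitted into and acked out of that message's tuple tree; the tree is complete when the value returns to 0. The sketch below is a simplified plain-Java model of that idea, not Storm's actual code. AckerSketch and its methods are hypothetical names, and the tuple IDs are fixed constants here to keep the demo deterministic, whereas Storm uses random 64-bit IDs.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of XOR-based tuple-tree tracking; NOT Storm's implementation.
class AckerSketch {
    // One checksum per spout message: XOR of all tuple ids emitted or acked so far.
    private final Map<Object, Long> ackVals = new HashMap<>();

    // Record that a tuple with this id was emitted into the tree of msgId.
    void emitted(Object msgId, long tupleId) {
        xor(msgId, tupleId);
    }

    // Record an ack. XOR-ing the same id a second time cancels it out.
    // Returns true once the checksum is 0, i.e. every emitted tuple was acked.
    boolean acked(Object msgId, long tupleId) {
        return xor(msgId, tupleId) == 0L;
    }

    private long xor(Object msgId, long tupleId) {
        return ackVals.merge(msgId, tupleId, (a, b) -> a ^ b);
    }

    public static void main(String[] args) {
        AckerSketch acker = new AckerSketch();
        acker.emitted(0, 0x1L);                       // spout emits "groovy,oeacnbase"
        acker.emitted(0, 0x2L);                       // splitter emits first word, anchored
        acker.emitted(0, 0x4L);                       // splitter emits second word, anchored
        System.out.println(acker.acked(0, 0x1L));     // false: two words still pending
        System.out.println(acker.acked(0, 0x2L));     // false: one word still pending
        System.out.println(acker.acked(0, 0x4L));     // true: tree complete

        // Without anchoring, the words are never registered in the tree,
        // so the tree looks complete as soon as the splitter acks its input.
        AckerSketch unanchored = new AckerSketch();
        unanchored.emitted(1, 0x8L);
        System.out.println(unanchored.acked(1, 0x8L)); // true, too early
    }
}
```

The second half of the demo is the point: if a bolt emits without passing the input tuple, a failure further downstream cannot reach the spout, because the spout's message was already considered fully processed.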
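
// Imports assume Storm 1.x or later (org.apache.storm.*).
import java.io.FileWriter;
import java.io.IOException;
import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.IRichBolt;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class PrintBolt implements IRichBolt {
    private static final long serialVersionUID = 1L;
    private boolean flag = false;
    private OutputCollector collector;
    private FileWriter fileWriter;

    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        try {
            this.collector = collector;
            this.fileWriter = new FileWriter("d:/message/message.txt");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public void execute(Tuple input) {
        String word = input.getStringByField("word");
        try {
            // Simulate a one-time failure on the word "restful"
            if (!flag && word.equals("restful")) {
                flag = true;
                int i = 10 / 0;
            }
            this.fileWriter.write(word);
            this.fileWriter.write("\r\n");
            this.fileWriter.flush();
            // Emit and ack only after the write succeeded, so that a failed
            // tuple is not acked as well
            collector.emit(input, new Values(word));
            collector.ack(input);
        } catch (Exception e) {
            e.printStackTrace();
            collector.fail(input);
        }
    }

    public void cleanup() {
        try {
            this.fileWriter.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }

    public Map<String, Object> getComponentConfiguration() {
        return null;
    }
}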
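
PrintBolt works on the same principle as SpliterBolt, so it is not described again.
Summary: when a message fails during processing, Storm's guarantee mechanism has the spout send the failed message through the topology again. This basic mechanism goes some way toward preventing data loss, but it introduces another problem: duplicate consumption. In this example, replaying "openfire,restful" after PrintBolt fails on "restful" causes "openfire" to be written to the file a second time. The remedy is to deduplicate by message ID.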
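
Deduplicating by ID can be sketched as follows. This is a minimal plain-Java illustration, not code from the topology above: DedupSink and its methods are hypothetical, and it assumes each word carries an ID that stays the same across replays (for example the spout's msgId plus the word's position, such as "1:0"), which the listings above do not yet pass downstream.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sink, not part of Storm: writes each record at most once per id.
class DedupSink {
    private final Set<String> seenIds = new HashSet<>();
    private final List<String> written = new ArrayList<>();

    // Returns true if the word was written, false if its id was seen before.
    boolean write(String id, String word) {
        if (!seenIds.add(id)) {
            return false; // duplicate caused by a replay: skip it
        }
        written.add(word);
        return true;
    }

    List<String> written() {
        return written;
    }

    public static void main(String[] args) {
        DedupSink sink = new DedupSink();
        // First attempt at message 1: "openfire" is written, "restful" fails
        // before reaching the sink.
        sink.write("1:0", "openfire");
        // Replay of message 1: "openfire" is skipped, "restful" is written.
        sink.write("1:0", "openfire");
        sink.write("1:1", "restful");
        System.out.println(sink.written()); // [openfire, restful]
    }
}
```

An in-memory set like this only works within a single bolt instance and grows without bound. A real deployment would more likely keep the seen IDs in an external store with expiry, or make the write itself idempotent; those choices are outside what this post covers.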