Hama: BSPMaster

http://wiki.apache.org/hama/BSPMaster

 

Introduction

 

The main responsibilities of the BSPMaster are described on the Architecture page.

 

Services

 

The BSPMaster is a collection of services performing different tasks, including:

  • masterServer: An RPC server.
  • instructor: Asynchronous message dispatcher.
  • taskScheduler: A task scheduling service.
  • infoServer: An HTTP server.
  • supervisor: (TODO: move to Monitor?)
  • systemDirCleaner: Cleans up the system directory on HDFS (default: /tmp/hadoop/bsp/system).
  • syncClient: BSPMaster ZooKeeper client (TODO:curator?)

  • timer service: TODO
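
As an illustration of what "a collection of services" can mean in practice, the sketch below starts a group of services in registration order and stops them in reverse. The Service interface, ServiceGroup and NoOpService are hypothetical names invented for this example; they are not taken from BSPMaster.java, and the real classes may be organized differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical lifecycle contract; not Hama's actual API.
interface Service {
  String name();
  void start();
  void stop();
}

// Hypothetical placeholder service that does nothing when started or stopped.
final class NoOpService implements Service {
  private final String name;
  NoOpService(String name) { this.name = name; }
  public String name() { return name; }
  public void start() {}
  public void stop() {}
}

// Hypothetical container: starts services in order, stops them in reverse order.
final class ServiceGroup {
  private final List<Service> services = new ArrayList<>();
  private final List<String> log = new ArrayList<>();

  ServiceGroup add(Service service) {
    services.add(service);
    return this;
  }

  void startAll() {
    for (Service s : services) {
      s.start();
      log.add("start " + s.name());
    }
  }

  void stopAll() {
    for (int i = services.size() - 1; i >= 0; i--) {
      Service s = services.get(i);
      s.stop();
      log.add("stop " + s.name());
    }
  }

  // Returns a copy so callers cannot modify the internal log.
  List<String> log() {
    return new ArrayList<>(log);
  }
}
```

For example, new ServiceGroup().add(new NoOpService("masterServer")).add(new NoOpService("taskScheduler")) followed by startAll() and stopAll() records the two starts in order and the two stops in reverse. Stopping in reverse is a common convention so that later services can still rely on earlier ones while shutting down.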


 

 

State

 

A BSPMaster node can be in one of the following states:

  • INITIALIZING
  • RUNNING
  • FAILED
  • SHUTTING DOWN
  • STOPPED

BSPMaster State
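
The states listed above could be modeled as a Java enum, as in the minimal sketch below. Only the five state names come from this page (SHUTTING DOWN is written SHUTTING_DOWN because Java identifiers cannot contain spaces). The MasterState name and the transition table are assumptions made for illustration; the actual transitions in BSPMaster.java may differ.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical enum; state names follow the list above.
enum MasterState {
  INITIALIZING, RUNNING, FAILED, SHUTTING_DOWN, STOPPED;

  // Assumed transition table, not confirmed by the source.
  Set<MasterState> next() {
    switch (this) {
      case INITIALIZING:
        return EnumSet.of(RUNNING, FAILED);
      case RUNNING:
        return EnumSet.of(SHUTTING_DOWN, FAILED);
      case SHUTTING_DOWN:
        return EnumSet.of(STOPPED, FAILED);
      default:
        // FAILED and STOPPED are treated as terminal in this sketch.
        return EnumSet.noneOf(MasterState.class);
    }
  }

  boolean canMoveTo(MasterState target) {
    return next().contains(target);
  }
}
```

Under these assumptions, MasterState.INITIALIZING.canMoveTo(MasterState.RUNNING) is true, while MasterState.STOPPED.canMoveTo(MasterState.RUNNING) is false.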

 

Scenario

 

  • Restart
    • When a task is reported as failed on a groom server, restart the job by re-running all tasks from the latest checkpoint that is universally available, i.e. available for every task. Re-running only the failed task is not enough, because the latest universally available checkpoint may be more than one superstep behind the current superstep, which can deadlock the restarted task against the live tasks during the sync phase. For example, suppose the latest universally available checkpoint is from the 6th superstep and the computation is currently moving from the 7th to the 8th superstep. If one task fails and the system migrates it to another machine and resumes it from the 6th superstep checkpoint, the other tasks keep running until they reach the barrier sync at the 8th superstep. The resumed task then blocks at the barrier sync of the 7th superstep, because no other task is at the 7th superstep, and the job deadlocks. Restarting all tasks from the checkpoint is one proposed solution to task failure. More sophisticated logic is possible, but for now only this simpler approach may be implemented.
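
The restart scenario can be made concrete with a small sketch. RestartSketch and its methods are hypothetical helpers written for this page, not part of Hama. The sketch assumes that each task's checkpoints are known as a set of superstep numbers and that a barrier releases only when every task waits at the same superstep.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; not part of Hama's BSPMaster API.
final class RestartSketch {

  // Highest superstep that every task has checkpointed, or -1 if there is none.
  static int latestCommonCheckpoint(Collection<Set<Integer>> checkpointsPerTask) {
    TreeSet<Integer> common = null;
    for (Set<Integer> checkpoints : checkpointsPerTask) {
      if (common == null) {
        common = new TreeSet<>(checkpoints);
      } else {
        common.retainAll(checkpoints);
      }
    }
    return (common == null || common.isEmpty()) ? -1 : common.last();
  }

  // Assumed barrier rule: all tasks must be waiting at the same superstep.
  static boolean barrierCanRelease(Collection<Integer> superstepPerTask) {
    return superstepPerTask.stream().distinct().count() == 1;
  }

  // Policy described above: every task resumes from the common checkpoint.
  static Map<String, Integer> restartAll(Map<String, Set<Integer>> checkpoints) {
    int resumeFrom = latestCommonCheckpoint(checkpoints.values());
    Map<String, Integer> plan = new HashMap<>();
    for (String task : checkpoints.keySet()) {
      plan.put(task, resumeFrom);
    }
    return plan;
  }
}
```

With one task holding checkpoints {5, 6, 7} and another holding {5, 6}, latestCommonCheckpoint yields 6, and restartAll sends both tasks back to superstep 6. By contrast, barrierCanRelease on supersteps (7, 8, 8) is false, which mirrors the deadlock in the example: the resumed task waits at superstep 7 while the others wait at superstep 8.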

 

Source

BSPMaster.java
