1. $SPARK_HOME/sbin/start-all.sh calls sbin/start-slaves.sh. When --with-tachyon is not passed, start-slaves.sh is invoked directly with no arguments. (See spark-core_05: analysis of the $SPARK_HOME/sbin/start-all.sh and start-master.sh scripts.)
2. $SPARK_HOME/sbin/start-slaves.sh calls $SPARK_HOME/sbin/slaves.sh with the following arguments:
start-slaves.sh: cd /data/spark-1.6.0-bin-hadoop2.6 ; /data/spark-1.6.0-bin-hadoop2.6/sbin/start-slave.sh spark://luyl152:7077
(See spark-core_07: analysis of the $SPARK_HOME/sbin/start-slaves.sh script.)
3. $SPARK_HOME/sbin/slaves.sh then uses ssh to run the following command on every slave machine:
ssh -o StrictHostKeyChecking=no $slave "cd $SPARK_HOME ; $SPARK_HOME/sbin/start-slave.sh spark://luyl152:7077"
==> start-slave.sh in turn calls spark-daemon.sh with the following arguments:
spark-daemon.sh start org.apache.spark.deploy.worker.Worker 1 --webui-port 8081 spark://luyl152:7077
(See spark-core_08: analysis of the $SPARK_HOME/sbin/slaves.sh and start-slave.sh scripts.)
4. spark-daemon.sh calls spark-class with the following arguments:
$SPARK_HOME/bin/spark-class org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://luyl152:7077
(See spark-core_06: analysis of the $SPARK_HOME/sbin/spark-daemon.sh script.)
5. Finally, spark-class uses launcher.Main to build the command line and then runs exec "${CMD[@]}" to launch Worker,
so the args passed to Worker are: --webui-port 8081 spark://masterHost:7077
(See spark-core_02: analysis of the spark-submit and spark-class scripts.)
I. Worker's main method: it mainly initializes SparkConf, WorkerArguments, and the Worker's own RpcEnv.
The Master is responsible for resource scheduling across the whole cluster, while each Worker schedules the resources of its own node.
private[deploy] object Worker extends Logging {
  val SYSTEM_NAME = "sparkWorker"
  val ENDPOINT_NAME = "Worker"

  // The arguments passed to each Worker started by start-all.sh via slaves.sh are:
  // --webui-port 8081 spark://luyl152:7077
  def main(argStrings: Array[String]) {
    SignalLogger.register(log)
    // SparkConf is initialized exactly the same way as in Master
    // (spark-core_09: org.apache.spark.deploy.master.Master source analysis 1)
    val conf = new SparkConf
    // Extra properties can reach this SparkConf even though spark-class does not pass
    // --properties-file when starting the worker: just add them to conf/spark-defaults.conf
    val args = new WorkerArguments(argStrings, conf)
1. A look at WorkerArguments: its main job is to copy the Spark environment variables and the arguments passed to the worker into its own fields.
/**
 * Command-line parser for the worker.
 * Here args is: --webui-port 8081 spark://luyl152:7077
 */
private[worker] class WorkerArguments(args: Array[String], conf: SparkConf) {
  var host = Utils.localHostName()
  var port = 0
  var webUiPort = 8081
  var cores = inferDefaultCores()    // number of processors on the current machine
  var memory = inferDefaultMemory()  // leave 1 GB for the OS, use the rest
  var masters: Array[String] = null
  var workDir: String = null
  var propertiesFile: String = null
  // Check for settings in environment variables
  // SPARK_WORKER_PORT, SPARK_WORKER_CORES, SPARK_WORKER_MEMORY and friends can all be set in spark-env.sh
  if (System.getenv("SPARK_WORKER_PORT") != null) {
    port = System.getenv("SPARK_WORKER_PORT").toInt
  }
  if (System.getenv("SPARK_WORKER_CORES") != null) {
    cores = System.getenv("SPARK_WORKER_CORES").toInt
  }
  if (conf.getenv("SPARK_WORKER_MEMORY") != null) {
    memory = Utils.memoryStringToMb(conf.getenv("SPARK_WORKER_MEMORY"))
  }
  if (System.getenv("SPARK_WORKER_WEBUI_PORT") != null) {
    // Not set in spark-env.sh by default; parse() will set the port to 8081 from the --webui-port argument
    webUiPort = System.getenv("SPARK_WORKER_WEBUI_PORT").toInt
  }
  if (System.getenv("SPARK_WORKER_DIR") != null) {
    // If SPARK_WORKER_DIR is not set, a "work" directory is created under spark_home when the worker starts
    workDir = System.getenv("SPARK_WORKER_DIR")
  }
  parse(args.toList)

  // This mutates the SparkConf, so all accesses to it must be made after this line.
  // The SparkConf is mutable, so this has to run first. spark-class does not pass --properties-file
  // when starting the worker, so propertiesFile is null here.
  /** Loads default Spark properties from the given filePath. If no file is given, the common
   * default file spark-defaults.conf is used.
   *
   * This mutates state in the given SparkConf and in this JVM's system properties, and returns
   * the path of the properties file that was used.
   * Since spark-class does not pass --properties-file when starting the Worker, propertiesFile is
   * null and the default file is used, e.g.:
   *
   * # spark.master                    spark://master:7077
   * # spark.eventLog.enabled          true
   * # spark.eventLog.dir              hdfs://namenode:8021/directory
   * # spark.serializer                org.apache.spark.serializer.KryoSerializer
   * # spark.driver.memory             5g
   * # spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
   */
  propertiesFile = Utils.loadDefaultSparkProperties(conf, propertiesFile)
  if (conf.contains("spark.worker.ui.port")) {
    webUiPort = conf.get("spark.worker.ui.port").toInt
  }
  checkWorkerMemory()
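To make the Utils.loadDefaultSparkProperties call above more concrete, here is a rough, simplified sketch of what it does, not the actual Spark implementation: it resolves the file (assuming SPARK_HOME is set when no --properties-file was given), keeps only spark.* keys, and applies them without overriding values already set.

import java.io.FileInputStream
import java.util.Properties
import scala.collection.JavaConverters._
import org.apache.spark.SparkConf

// Sketch only: load spark-defaults.conf-style properties into a SparkConf
def loadDefaultsSketch(conf: SparkConf, filePath: String = null): String = {
  // Fall back to $SPARK_HOME/conf/spark-defaults.conf when no --properties-file was given
  // (assumes SPARK_HOME is set in the environment)
  val path = Option(filePath).getOrElse(sys.env("SPARK_HOME") + "/conf/spark-defaults.conf")
  val props = new Properties()
  val in = new FileInputStream(path)
  try props.load(in) finally in.close()
  // Only keys starting with "spark." are applied, and entries already in the conf win
  props.asScala.filterKeys(_.startsWith("spark.")).foreach { case (k, v) =>
    conf.setIfMissing(k, v.trim)
    sys.props.getOrElseUpdate(k, v.trim)
  }
  path
}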
  // --webui-port 8081 spark://luyl152:7077
  private def parse(args: List[String]): Unit = args match {
    ...
    case ("--webui-port") :: IntParam(value) :: tail =>
      webUiPort = value
      parse(tail)

    case ("--properties-file") :: value :: tail =>
      propertiesFile = value
      parse(tail)

    case ("--help") :: tail =>
      printUsageAndExit(0)

    case value :: tail =>
      /** start-slave.sh contains the snippet below. By default SPARK_WORKER_PORT is not set, so
       * PORT_FLAG and PORT_NUM are empty and the submitted command ends up with two consecutive spaces:
       *   "${SPARK_HOME}/sbin"/spark-daemon.sh start $CLASS $WORKER_NUM \
       *     --webui-port "$WEBUI_PORT" $PORT_FLAG $PORT_NUM $MASTER "$@"
       *   if [ "$SPARK_WORKER_PORT" = "" ]; then
       *     PORT_FLAG=
       *     PORT_NUM=
       *   else
       *     ...
       */
      if (masters != null) {  // Two positional arguments were given
        printUsageAndExit(1)
      }
      // The last positional argument sets masters to spark://luyl152:7077; if several masters are
      // given, each of them ends up in the spark://host:7077 form
      masters = Utils.parseStandaloneMasterUrls(value)
      parse(tail)

    case Nil =>
      if (masters == null) {  // No positional argument was given
        printUsageAndExit(1)
      }

    case _ =>
      printUsageAndExit(1)
  }
  ...
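The positional-argument branch above hands the value to Utils.parseStandaloneMasterUrls. A minimal sketch of that normalization, assuming it only strips a leading spark:// prefix and splits a comma-separated list (a simplified stand-in, not necessarily the exact Spark code):

// Simplified stand-in for Utils.parseStandaloneMasterUrls
def parseStandaloneMasterUrlsSketch(masterUrls: String): Array[String] = {
  masterUrls.stripPrefix("spark://").split(",").map("spark://" + _)
}

// Example:
// parseStandaloneMasterUrlsSketch("spark://luyl152:7077,luyl153:7077")
//   => Array("spark://luyl152:7077", "spark://luyl153:7077")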
  def inferDefaultCores(): Int = {
    // The number of processors on the current machine
    Runtime.getRuntime.availableProcessors()
  }
  // An interesting bit: the JDK can be used to read the node's total and free memory, including swap
  def inferDefaultMemory(): Int = {
    // "java.vendor" is the JVM vendor; on my machine it is Oracle Corporation, so ibmVendor is false
    val ibmVendor = System.getProperty("java.vendor").contains("IBM")
    var totalMb = 0
    try {
      // scalastyle:off classforname
      // Returns the managed bean for the operating system on which the JVM is running
      val bean = ManagementFactory.getOperatingSystemMXBean()
      if (ibmVendor) {
        val beanClass = Class.forName("com.ibm.lang.management.OperatingSystemMXBean")
        val method = beanClass.getDeclaredMethod("getTotalPhysicalMemory")
        totalMb = (method.invoke(bean).asInstanceOf[Long] / 1024 / 1024).toInt
      } else {
        // This interface also has getFreePhysicalMemorySize for free memory and
        // getTotalSwapSpaceSize for total swap space
        // http://www.cnblogs.com/davidwang456/p/6182453.html
        val beanClass = Class.forName("com.sun.management.OperatingSystemMXBean")
        val method = beanClass.getDeclaredMethod("getTotalPhysicalMemorySize") // returns bytes
        totalMb = (method.invoke(bean).asInstanceOf[Long] / 1024 / 1024).toInt
      }
      // scalastyle:on classforname
    } catch {
      case e: Exception => {
        totalMb = 2 * 1024
        // scalastyle:off println
        System.out.println("Failed to get total physical memory. Using " + totalMb + " MB")
        // scalastyle:on println
      }
    }
    // Leave out 1 GB for the operating system, but don't return a negative memory size
    math.max(totalMb - 1024, Utils.DEFAULT_DRIVER_MEM_MB)
  }
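On a HotSpot-based JVM the same number can be read without reflection by casting to the com.sun.management interface; Spark presumably goes through reflection to avoid a compile-time dependency on vendor-specific classes (the IBM JDK exposes a different class). A small sketch of the direct route, assuming an Oracle/OpenJDK runtime:

import java.lang.management.ManagementFactory

// Total physical memory in MB on an Oracle/OpenJDK JVM; sketch of the non-reflective route
def totalPhysicalMemoryMbSketch(): Int = {
  val bean = ManagementFactory.getOperatingSystemMXBean
    .asInstanceOf[com.sun.management.OperatingSystemMXBean]
  (bean.getTotalPhysicalMemorySize / 1024 / 1024).toInt
}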
===> Back in Worker's main method:
    // Register the worker with the RpcEnv. host: the worker's hostname; port: defaults to 0 when
    // SPARK_WORKER_PORT is not set; webUiPort: 8081
    // cores: the number of processors on this worker node; memory: everything except the 1 GB
    // reserved for the OS; masters: Array(spark://luyl152:7077)
    // workDir defaults to null
    val rpcEnv = startRpcEnvAndEndpoint(args.host, args.port, args.webUiPort, args.cores,
      args.memory, args.masters, args.workDir, conf = conf)
    rpcEnv.awaitTermination()
  }
2. A look at startRpcEnvAndEndpoint(): its job is to create the RpcEnv and register the Worker RpcEndpoint with it.
  // host: the worker's hostname; port: defaults to 0 when SPARK_WORKER_PORT is not set; webUiPort: 8081
  // cores: the number of processors on this worker node; memory: everything except the 1 GB reserved for the OS
  // masters: Array(spark://luyl152:7077); workDir defaults to null
  def startRpcEnvAndEndpoint(
      host: String,
      port: Int,
      webUiPort: Int,
      cores: Int,
      memory: Int,
      masterUrls: Array[String], // Array(spark://luyl152:7077)
      workDir: String,
      workerNumber: Option[Int] = None,
      conf: SparkConf = new SparkConf): RpcEnv = {

    // The LocalSparkCluster runs multiple local sparkWorkerX RPC Environments
    // workerNumber is not passed in from main here; when a worker runs multiple instances,
    // start-slave.sh calls this entry point once per instance
    val systemName = SYSTEM_NAME + workerNumber.map(_.toString).getOrElse("")
    val securityMgr = new SecurityManager(conf)
    // Creates the RpcEnv container on this worker node, identified as sparkWorker
    val rpcEnv = RpcEnv.create(systemName, host, port, conf, securityMgr)
    // Turns each spark://host:port into an RpcAddress: Array(RpcAddress(luyl152, 7077))
    val masterAddresses = masterUrls.map(RpcAddress.fromSparkURL(_))
    // Registers the Worker endpoint with the RpcEnv and hands its identity to Worker's primary constructor
    rpcEnv.setupEndpoint(ENDPOINT_NAME, new Worker(rpcEnv, webUiPort, cores, memory,
      masterAddresses, systemName, ENDPOINT_NAME, workDir, conf, securityMgr))
    rpcEnv
  }
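RpcAddress.fromSparkURL only has to pull the host and port out of a spark:// URL. A minimal sketch of that parsing using java.net.URI (a simplified stand-in for the real helper):

import java.net.URI

// Simplified stand-in: "spark://luyl152:7077" => ("luyl152", 7077)
def hostPortFromSparkUrlSketch(sparkUrl: String): (String, Int) = {
  val uri = new URI(sparkUrl)
  require(uri.getScheme == "spark" && uri.getHost != null && uri.getPort != -1,
    s"Invalid master URL: $sparkUrl")
  (uri.getHost, uri.getPort)
}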
II. The Worker instantiation process
/**
 * Worker is also a subclass of RpcEndpoint, so the lifecycle methods to look at next are:
 * onStart -> receive(receiveAndReply)* -> onStop
 *
 * host: the worker's hostname; port: defaults to 0 when SPARK_WORKER_PORT is not set; webUiPort: 8081
 * cores: the number of processors on this worker node; memory: everything except the 1 GB reserved for the OS
 * masterRpcAddresses: Array(RpcAddress(luyl152, 7077), RpcAddress(luyl153, 7077), RpcAddress(luyl154, 7077))
 * systemName: sparkWorker; endpointName: Worker; workDirPath defaults to null
 */
private[deploy] class Worker(
    override val rpcEnv: RpcEnv,
    webUiPort: Int,
    cores: Int,
    memory: Int,
    masterRpcAddresses: Array[RpcAddress],
    systemName: String,
    endpointName: String,
    workDirPath: String = null,
    val conf: SparkConf,
    val securityMgr: SecurityManager)
  extends ThreadSafeRpcEndpoint with Logging {

  // The worker's host and port
  private val host = rpcEnv.address.host
  private val port = rpcEnv.address.port

  Utils.checkHost(host, "Expected hostname")
  assert (port > 0)

  // A scheduled executor used to send messages at the specified time.
  private val forwordMessageScheduler =
    ThreadUtils.newDaemonSingleThreadScheduledExecutor("worker-forward-message-scheduler")

  // A separated thread to clean up the workDir. Used to provide the implicit parameter of
  // `Future` methods.
  private val cleanupThreadExecutor = ExecutionContext.fromExecutorService(
    ThreadUtils.newDaemonSingleThreadExecutor("worker-cleanup-thread"))

  // For worker and executor IDs
  private def createDateFormat = new SimpleDateFormat("yyyyMMddHHmmss")

  // Send a heartbeat every (heartbeat timeout) / 4 milliseconds; with the default that is every 15 seconds
  private val HEARTBEAT_MILLIS = conf.getLong("spark.worker.timeout", 60) * 1000 / 4

  // Model retries to connect to the master, after Hadoop's model.
  // The first six attempts to reconnect are in shorter intervals (between 5 and 15 seconds).
  // Afterwards, the next 10 attempts are between 30 and 90 seconds.
  // A bit of randomness is introduced so that not all of the workers attempt to reconnect at
  // the same time.
  private val INITIAL_REGISTRATION_RETRIES = 6
  private val TOTAL_REGISTRATION_RETRIES = INITIAL_REGISTRATION_RETRIES + 10
  private val FUZZ_MULTIPLIER_INTERVAL_LOWER_BOUND = 0.500
  private val REGISTRATION_RETRY_FUZZ_MULTIPLIER = {
    // Seed Random from UUID.randomUUID.getMostSignificantBits so every run gets a different value
    val randomNumberGenerator = new Random(UUID.randomUUID.getMostSignificantBits)
    // nextDouble is in [0, 1), so the multiplier lies in [0.5, 1.5)
    randomNumberGenerator.nextDouble + FUZZ_MULTIPLIER_INTERVAL_LOWER_BOUND
  }
  // round returns the long closest to the argument: a value between 5 and 15
  private val INITIAL_REGISTRATION_RETRY_INTERVAL_SECONDS =
    (math.round(10 * REGISTRATION_RETRY_FUZZ_MULTIPLIER))
  // a value between 30 and 90
  private val PROLONGED_REGISTRATION_RETRY_INTERVAL_SECONDS =
    (math.round(60 * REGISTRATION_RETRY_FUZZ_MULTIPLIER))

  // Cleanup is off by default
  private val CLEANUP_ENABLED = conf.getBoolean("spark.worker.cleanup.enabled", false)
  // How often worker will clean up old app folders
  private val CLEANUP_INTERVAL_MILLIS =
    conf.getLong("spark.worker.cleanup.interval", 60 * 30) * 1000
  // TTL for app folders/data; after TTL expires it will be cleaned up
  private val APP_DATA_RETENTION_SECONDS =
    conf.getLong("spark.worker.cleanup.appDataTtl", 7 * 24 * 3600)

  private val testing: Boolean = sys.props.contains("spark.testing")
  private var master: Option[RpcEndpointRef] = None
  private var activeMasterUrl: String = ""
  private[worker] var activeMasterWebUiUrl : String = ""
  // systemName: sparkWorker; endpointName: Worker
  // Returns spark://sparkWorker@luyl153:<RpcAddress.port>; internally calls RpcEndpointAddress.toString
  private val workerUri = rpcEnv.uriOf(systemName, rpcEnv.address, endpointName)
  private var registered = false
  private var connected = false
  // e.g. worker-20180321165947-luyl153-<RpcAddress.port>
  private val workerId = generateWorkerId()
  private val sparkHome =
    if (testing) {
      assert(sys.props.contains("spark.test.home"), "spark.test.home is not set!")
      new File(sys.props("spark.test.home"))
    } else {
      new File(sys.env.get("SPARK_HOME").getOrElse("."))
    }

  var workDir: File = null
  // finishedExecutors tracks finished ExecutorBackend processes
  val finishedExecutors = new LinkedHashMap[String, ExecutorRunner]
  val drivers = new HashMap[String, DriverRunner]
  val executors = new HashMap[String, ExecutorRunner]
  val finishedDrivers = new LinkedHashMap[String, DriverRunner]
  val appDirectories = new HashMap[String, Seq[String]]
  val finishedApps = new HashSet[String]

  // spark.worker.ui.retainedExecutors: how many finished executors the Spark UI and status APIs
  // remember before garbage collecting; default 1000
  val retainedExecutors = conf.getInt("spark.worker.ui.retainedExecutors",
    WorkerWebUI.DEFAULT_RETAINED_EXECUTORS)
  // spark.worker.ui.retainedDrivers: how many finished drivers the Spark UI and status APIs
  // remember before garbage collecting; default 1000
  val retainedDrivers = conf.getInt("spark.worker.ui.retainedDrivers",
    WorkerWebUI.DEFAULT_RETAINED_DRIVERS)

  // The shuffle service is not actually started unless configured. It is especially useful when
  // multiple applications share the resources of a Spark cluster.
  // In standalone mode it only requires spark.shuffle.service.enabled=true on every worker,
  // which can be passed via --properties-file or added to conf/spark-defaults.conf
  // (spark-class does not pass --properties-file when starting the worker).
  // http://spark.apache.org/docs/1.6.0/job-scheduling.html#configuration-and-setup
  private val shuffleService = new ExternalShuffleService(conf, securityMgr)
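The heartbeat and retry intervals above are just arithmetic on the configured worker timeout and the random fuzz multiplier. A tiny worked sketch of the resulting ranges, assuming the default spark.worker.timeout of 60 seconds:

import java.util.UUID
import scala.util.Random

// With the default spark.worker.timeout of 60 seconds, a heartbeat is sent every
// 60 * 1000 / 4 = 15000 ms, i.e. every 15 seconds.
val heartbeatMillis = 60L * 1000 / 4  // 15000

// The fuzz multiplier lies in [0.5, 1.5), so:
//   initial retry interval   = round(10 * fuzz) -> 5 to 15 seconds
//   prolonged retry interval = round(60 * fuzz) -> 30 to 90 seconds
val fuzz = new Random(UUID.randomUUID.getMostSignificantBits).nextDouble + 0.5
val initialRetrySeconds = math.round(10 * fuzz)    // somewhere in [5, 15]
val prolongedRetrySeconds = math.round(60 * fuzz)  // somewhere in [30, 90]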
III. A look at new ExternalShuffleService(conf, securityMgr), which is useful when multiple applications share the Spark cluster's resources.
/**
 * Provides a server from which Executors can read shuffle files (rather than reading directly from
 * each other), to provide uninterrupted access to the files in the face of executors being turned
 * off or killed.
 *
 * Optionally requires SASL authentication in order to read. See [[SecurityManager]].
 */
private[deploy]
class ExternalShuffleService(sparkConf: SparkConf, securityManager: SecurityManager)
extends Logging {
  /**
   * The external shuffle service can be enabled when spark.dynamicAllocation.enabled is "true".
   * The service keeps the shuffle files written by executors, so executors can be removed safely.
   *
   * a. It is especially useful when multiple applications share the resources of a Spark cluster.
   * b. It can be used in standalone, YARN and Mesos coarse-grained mode:
   *    http://spark.apache.org/docs/1.6.0/job-scheduling.html#configuration-and-setup
   * c. To enable it in standalone mode, every worker node only needs spark.shuffle.service.enabled
   *    set to true. How? Pass --conf PROP=VALUE or --properties-file when launching start-all.sh,
   *    or add the entry to spark_home/conf/spark-defaults.conf.
   *
   * spark.dynamicAllocation.enabled: whether to use dynamic resource allocation, which scales the
   * number of executors registered with an application up and down based on the workload (on YARN
   * in particular). It requires spark.shuffle.service.enabled to be set. The related settings are
   * spark.dynamicAllocation.minExecutors, spark.dynamicAllocation.maxExecutors
   * and spark.dynamicAllocation.initialExecutors.
   */
  private val enabled = sparkConf.getBoolean("spark.shuffle.service.enabled", false)
  private val port = sparkConf.getInt("spark.shuffle.service.port", 7337)
  private val useSasl: Boolean = securityManager.isAuthenticationEnabled()

  private val transportConf = SparkTransportConf.fromSparkConf(sparkConf, "shuffle", numUsableCores = 0)
  private val blockHandler = newShuffleBlockHandler(transportConf)
  private val transportContext: TransportContext =
    new TransportContext(transportConf, blockHandler, true)

  private var server: TransportServer = _

  /** Create a new shuffle block handler. Factored out for subclasses to override. */
  protected def newShuffleBlockHandler(conf: TransportConf): ExternalShuffleBlockHandler = {
    new ExternalShuffleBlockHandler(conf, null)
  }

  /**
   * Starts the external shuffle service if the user has configured us to.
   * This method is called by Worker in its onStart method, so in standalone mode it is enough to
   * set spark.shuffle.service.enabled to true on every worker.
   */
  def startIfEnabled() {
    if (enabled) {
      start()
    }
  }

  /** Start the external shuffle service */
  def start() {
    require(server == null, "Shuffle server already started")
    logInfo(s"Starting shuffle service on port $port with useSasl = $useSasl")
    val bootstraps: Seq[TransportServerBootstrap] =
      if (useSasl) {
        Seq(new SaslServerBootstrap(transportConf, securityManager))
      } else {
        Nil
      }
    // The server is built on Netty
    server = transportContext.createServer(port, bootstraps.asJava)
  }
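To actually turn the service on in standalone mode, the flag only has to reach each worker's SparkConf. A sketch of the two usual routes; the property names come straight from the code above, and the port line just restates the default for clarity:

import org.apache.spark.SparkConf

// Route 1: set it programmatically on the conf that reaches ExternalShuffleService
val conf = new SparkConf()
  .set("spark.shuffle.service.enabled", "true")  // read by startIfEnabled()
  .set("spark.shuffle.service.port", "7337")     // optional: 7337 is already the default

// Route 2 (equivalent): add the same entries to $SPARK_HOME/conf/spark-defaults.conf on every worker:
//   spark.shuffle.service.enabled  true
//   spark.shuffle.service.port     7337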
===> Back in the Worker class
  // The public hostname of this worker
  private val publicAddress = {
    val envVar = conf.getenv("SPARK_PUBLIC_DNS")
    if (envVar != null) envVar else host
  }
  private var webUi: WorkerWebUI = null

  private var connectionAttemptCount = 0

  // MetricsSystem was already covered in the Master analysis
  // (spark-core_10: org.apache.spark.deploy.master.Master source analysis 2 -- how the Master RpcEndpoint initializes Master)
  private val metricsSystem = MetricsSystem.createMetricsSystem("worker", conf, securityMgr)
  private val workerSource = new WorkerSource(this)

  private var registerMasterFutures: Array[JFuture[_]] = null
  private var registrationRetryTimer: Option[JScheduledFuture[_]] = None

  // A thread pool for registering with masters. Because registering with a master is a blocking
  // action, this thread pool must be able to create "masterRpcAddresses.size" threads at the same
  // time so that we can register with all masters.
  private val registerMasterThreadPool = ThreadUtils.newDaemonCachedThreadPool(
    "worker-register-master-threadpool",
    masterRpcAddresses.size // Make sure we can register with all masters at the same time
  )

  var coresUsed = 0
  var memoryUsed = 0

  def coresFree: Int = cores - coresUsed
  def memoryFree: Int = memory - memoryUsed

  private def createWorkDir() {
    // If SPARK_WORKER_DIR is not set, a "work" directory is created under spark_home when the worker starts
    workDir = Option(workDirPath).map(new File(_)).getOrElse(new File(sparkHome, "work"))
    try {
      // This sporadically fails - not sure why ... !workDir.exists() && !workDir.mkdirs()
      // So attempting to create and then check if directory was created or not.
      workDir.mkdirs()
      if ( !workDir.exists() || !workDir.isDirectory) {
        logError("Failed to create work directory " + workDir)
        System.exit(1)
      }
      assert (workDir.isDirectory)
    } catch {
      case e: Exception =>
        logError("Failed to create work directory " + workDir, e)
        System.exit(1)
    }
  }

  // An RpcEndpoint first runs onStart, then keeps receiving messages in receive (receiveAndReply)*,
  // and finally runs onStop
  override def onStart() {
    assert(!registered)
    // Log the worker startup information
    logInfo("Starting Spark worker %s:%d with %d cores, %s RAM".format(
      host, port, cores, Utils.megabytesToString(memory)))
    logInfo(s"Running Spark version ${org.apache.spark.SPARK_VERSION}")
    logInfo("Spark home: " + sparkHome)
    createWorkDir()
    // This is the entry point that lets multiple applications share the resources of one Spark cluster
    shuffleService.startIfEnabled()
    // Backed by a Jetty server; compare with MasterWebUI
    webUi = new WorkerWebUI(this, workDir, webUiPort)
    webUi.bind()
    // Register this worker with the master(s) first
    registerWithMaster()

    metricsSystem.registerSource(workerSource)
    metricsSystem.start()
    // Attach the worker metrics servlet handler to the web ui after the metrics system is started.
    metricsSystem.getServletHandlers.foreach(webUi.attachHandler)
  }
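The onStart above is the first step of the endpoint lifecycle mentioned at the top of Part II (onStart -> receive/receiveAndReply -> onStop). Purely as an illustration of that contract, here is a minimal endpoint-like skeleton; the trait below mirrors the lifecycle description and is not the actual org.apache.spark.rpc.RpcEndpoint API, which is private[spark] and richer than this:

// Illustrative only: a minimal lifecycle trait in the spirit of RpcEndpoint
trait LifecycleEndpoint {
  def onStart(): Unit = {}                                         // called once after registration
  def receive: PartialFunction[Any, Unit] = PartialFunction.empty  // fire-and-forget messages
  def onStop(): Unit = {}                                          // called once when the endpoint stops
}

class WorkerLikeEndpoint extends LifecycleEndpoint {
  override def onStart(): Unit = println("register with masters, start web UI ...")
  override def receive: PartialFunction[Any, Unit] = {
    case "SendHeartbeat" => println("heartbeat to master")
    case msg             => println(s"unhandled: $msg")
  }
  override def onStop(): Unit = println("clean up executors and work dir ...")
}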

This article has walked through the startup flow of a Worker node in a Spark cluster, from the start-all.sh script to the instantiation and initialization of the Worker itself, covering the launch scripts, environment variable configuration, argument parsing, creation of the RPC environment, and the communication with the Master node.