Table of Contents
1. SparkConf
- The SparkConf object is Spark's configuration object: it describes a Spark application's configuration, carried mainly as key-value pairs.
- As soon as the object is instantiated via new SparkConf(), all spark.* Java system properties are loaded by default.
class SparkConf(loadDefaults: Boolean) {
def this() = this(true)
}
Notes
- Instantiating a SparkContext requires a SparkConf object as a constructor parameter.
- Inside SparkContext, that SparkConf is cloned, yielding an object whose property values are all identical but which is not the same object as the one passed in.
- All of SparkContext's subsequent operations use this cloned SparkConf.
- Note: after the SparkConf has been passed to the SparkContext, modifying the original SparkConf has no effect!
/** Copy this object */
override def clone: SparkConf = {
val cloned = new SparkConf(false)
settings.entrySet().asScala.foreach { e =>
cloned.set(e.getKey(), e.getValue(), true)
}
cloned
}
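A minimal sketch (not from the post's source) illustrating the caveat above: settings applied to the SparkConf after the SparkContext has been created never reach the context, because the context works on a clone.
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()                 // loadDefaults = true: picks up spark.* system properties
  .setAppName("conf-clone-demo")
  .setMaster("local[*]")
val sc = new SparkContext(conf)            // SparkContext clones `conf` internally

conf.set("spark.executor.memory", "4g")    // too late: only the original object changes
println(sc.getConf.contains("spark.executor.memory"))   // prints false -- the clone is unaffected

sc.stop()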
2. SparkContext
The SparkContext initialization process:
1. A SparkConf object is created; it reads the default configuration and additional settings can be applied to it.
2. The SparkConf is passed into the SparkContext, which uses it to initialize its configuration properties.
3. The createTaskScheduler method instantiates the SchedulerBackend and TaskScheduler; the DAGScheduler is created right afterwards, holding a reference to this TaskScheduler.
// Create the SchedulerBackend and TaskScheduler according to the master URL passed in
// SparkContext.scala, line 2692
private def createTaskScheduler(
sc: SparkContext,
master: String,
deployMode: String): (SchedulerBackend, TaskScheduler) = {
import SparkMasterRegex._
// When running locally, don't try to re-execute tasks on failure.
val MAX_LOCAL_TASK_FAILURES = 1
master match {
// setMaster("local"),local模式
case "local" =>
val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
val backend = new LocalSchedulerBackend(sc.getConf, scheduler, 1)
scheduler.initialize(backend)
(backend, scheduler)
// setMaster("local[2]") || setMaster("local[*]"),local模式
case LOCAL_N_REGEX(threads) =>
def localCpuCount: Int = Runtime.getRuntime.availableProcessors()
// local[*] estimates the number of cores on the machine; local[N] uses exactly N threads.
val threadCount = if (threads == "*") localCpuCount else threads.toInt
if (threadCount <= 0) {
throw new SparkException(s"Asked to run locally with $threadCount threads")
}
val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
val backend = new LocalSchedulerBackend(sc.getConf, scheduler, threadCount)
scheduler.initialize(backend)
(backend, scheduler)
// Standalone mode: setMaster("spark://host:port")
case SPARK_REGEX(sparkUrl) =>
val scheduler = new TaskSchedulerImpl(sc)
val masterUrls = sparkUrl.split(",").map("spark://" + _)
val backend = new StandaloneSchedulerBackend(scheduler, sc, masterUrls)
scheduler.initialize(backend)
(backend, scheduler)
// local-cluster mode: setMaster("local-cluster[numSlaves,coresPerSlave,memoryPerSlave]"), an in-process test cluster (not Mesos/YARN)
case LOCAL_CLUSTER_REGEX(numSlaves, coresPerSlave, memoryPerSlave) =>
// Check to make sure memory requested <= memoryPerSlave. Otherwise Spark will just hang.
val memoryPerSlaveInt = memoryPerSlave.toInt
if (sc.executorMemory > memoryPerSlaveInt) {
throw new SparkException(
"Asked to launch cluster with %d MB RAM / worker but requested %d MB/worker".format(
memoryPerSlaveInt, sc.executorMemory))
}
val scheduler = new TaskSchedulerImpl(sc)
val localCluster = new LocalSparkCluster(
numSlaves.toInt, coresPerSlave.toInt, memoryPerSlaveInt, sc.conf)
val masterUrls = localCluster.start()
val backend = new StandaloneSchedulerBackend(scheduler, sc, masterUrls)
scheduler.initialize(backend)
backend.shutdownCallback = (backend: StandaloneSchedulerBackend) => {
localCluster.stop()
}
(backend, scheduler)
}
}
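For reference, a few setMaster values and the branch of createTaskScheduler they fall into (host names are placeholders):
import org.apache.spark.SparkConf

new SparkConf().setMaster("local")                          // "local": single thread
new SparkConf().setMaster("local[4]")                       // LOCAL_N_REGEX: exactly 4 threads
new SparkConf().setMaster("local[*]")                       // LOCAL_N_REGEX: one thread per CPU core
new SparkConf().setMaster("spark://host1:7077,host2:7077")  // SPARK_REGEX: StandaloneSchedulerBackend
new SparkConf().setMaster("local-cluster[2,1,1024]")        // LOCAL_CLUSTER_REGEX: in-process test cluster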
3. TaskScheduler
/**
* Low-level task scheduler interface, currently implemented exclusively by
* [[org.apache.spark.scheduler.TaskSchedulerImpl]].
* This interface allows plugging in different task schedulers. Each TaskScheduler schedules tasks
* for a single SparkContext. These schedulers get sets of tasks submitted to them from the
* DAGScheduler for each stage, and are responsible for sending the tasks to the cluster, running
* them, retrying if there are failures, and mitigating stragglers. They return events to the
* DAGScheduler.
*/
TaskScheduler is a low-level task scheduling interface; its only current implementation is TaskSchedulerImpl. A TaskScheduler can be plugged into different scheduler backends (i.e. SchedulerBackend implementations).
Each TaskScheduler schedules tasks for a single SparkContext and handles only that application's tasks. If a new Spark application is submitted, the current TaskScheduler is torn down and a new one is created for the new application.
The TaskScheduler receives a set of tasks (a TaskSet) for each stage from the DAGScheduler, sends those tasks to the cluster, runs them, retries them on failure, mitigates stragglers, and reports events and results back to the DAGScheduler.
(Stragglers: tasks submitted to the cluster may fall behind; such tasks have to be dealt with, for example by launching speculative copies, so that one or two slow tasks do not drag down the whole job.)
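Straggler mitigation is done through speculative execution. Below is a minimal configuration sketch using Spark's standard keys; the values shown are just examples, not taken from this post.
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.speculation", "true")             // enable speculative re-launch of slow tasks
  .set("spark.speculation.interval", "100ms")   // how often speculatable tasks are checked for
  .set("spark.speculation.multiplier", "1.5")   // a task is "slow" if it runs 1.5x longer than the median
  .set("spark.speculation.quantile", "0.75")    // start checking once 75% of a stage's tasks have finished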
3.1. TaskSchedulerImpl
Clients must first call initialize() and start(); only after that can TaskSets be submitted, via the submitTasks method.
// line81
// How often to check for speculatable (straggling) tasks, default 100ms
val SPECULATION_INTERVAL_MS = conf.getTimeAsMs("spark.speculation.interval", "100ms")
// line92
// How long to wait before warning that the initial TaskSet may be starved, default 15s
val STARVATION_TIMEOUT_MS = conf.getTimeAsMs("spark.starvation.timeout", "15s")
// line95
// Number of CPU cores allocated to each task
val CPUS_PER_TASK = conf.getInt("spark.task.cpus", 1)
// line136
// Scheduling mode, FIFO by default
private val schedulingModeConf = conf.get(SCHEDULER_MODE_PROPERTY, SchedulingMode.FIFO.toString)
// Coarse-grained scheduler backend (CoarseGrainedSchedulerBackend):
//   Executors are held for the whole lifetime of the application.
//   When a task finishes, its executor is not released immediately,
//   and a newly arriving task reuses an existing executor instead of
//   spawning a new one -- i.e. executors are reused.
//
// Fine-grained mode (Mesos only, via MesosFineGrainedSchedulerBackend):
//   Resources are released as soon as a task finishes,
//   and a newly arriving task acquires resources again.
//
// Standalone and YARN support only the coarse-grained backend;
// only Mesos also offers a fine-grained mode.
//
// Scheduling modes (how jobs/TaskSets within one application share resources):
//   FIFO : first in, first out -- TaskSets are served in submission order,
//          so an earlier long-running job can delay later short jobs.
//   FAIR : fair scheduling -- TaskSets are grouped into pools that share
//          resources fairly, so short jobs submitted later are not starved.
def initialize(backend: SchedulerBackend) {
this.backend = backend
schedulableBuilder = {
schedulingMode match {
case SchedulingMode.FIFO =>
new FIFOSchedulableBuilder(rootPool)
case SchedulingMode.FAIR =>
new FairSchedulableBuilder(rootPool, conf)
case _ =>
throw new IllegalArgumentException(s"Unsupported $SCHEDULER_MODE_PROPERTY: " +
s"$schedulingMode")
}
}
schedulableBuilder.buildPools()
}
override def start() {
backend.start()
if (!isLocal && conf.getBoolean("spark.speculation", false)) {
logInfo("Starting speculative execution thread")
speculationScheduler.scheduleWithFixedDelay(new Runnable {
override def run(): Unit = Utils.tryOrStopSparkContext(sc) {
checkSpeculatableTasks()
}
}, SPECULATION_INTERVAL_MS, SPECULATION_INTERVAL_MS, TimeUnit.MILLISECONDS)
}
}
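The scheduling mode matched in initialize() above is driven by configuration. A minimal sketch using standard keys; the allocation-file path and pool name are hypothetical placeholders.
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.scheduler.mode", "FAIR")                                    // switch from the default FIFO to FAIR
  .set("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml")   // optional pool definitions (placeholder path)

// At runtime a job can be routed into a specific pool (pool name is hypothetical):
// sc.setLocalProperty("spark.scheduler.pool", "reporting_pool")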
3.2. StandaloneSchedulerBackend
override def start() {
// Calls the parent class CoarseGrainedSchedulerBackend's start(), which runs
// driverEndpoint = createDriverEndpointRef(properties)
// i.e. it instantiates the driver's RPC endpoint (DriverEndpoint)
super.start()
// ...
// Build an ApplicationDescription carrying the resources the application requires (app name, max cores, executor memory, ...)
val appDesc = ApplicationDescription(sc.appName, maxCores, sc.executorMemory, command,
webUrl, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor, initialExecutorLimit)
// Create the client object that holds this application description
// and talks to the cluster manager (the standalone Master)
client = new StandaloneAppClient(sc.env.rpcEnv, masters, appDesc, this, conf)
client.start()
launcherBackend.setState(SparkAppHandle.State.SUBMITTED)
// Wait for registration to finish; the actual registration is carried out by StandaloneAppClient
waitForRegistration()
launcherBackend.setState(SparkAppHandle.State.RUNNING)
}
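A small sketch of the standalone-mode settings that feed the ApplicationDescription built above (the master address is a placeholder):
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setMaster("spark://master-host:7077")   // placeholder master address
  .set("spark.executor.memory", "2g")      // -> sc.executorMemory
  .set("spark.cores.max", "8")             // -> maxCores
  .set("spark.executor.cores", "2")        // -> coresPerExecutor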
4. DriverEndpoint
An inner class of CoarseGrainedSchedulerBackend; it is the driver side's RPC communication endpoint.
override def onStart() {
// Periodically revive offers to allow delay scheduling to work
val reviveIntervalMs = conf.getTimeAsMs("spark.scheduler.revive.interval", "1s")
reviveThread.scheduleAtFixedRate(new Runnable {
override def run(): Unit = Utils.tryLogNonFatalError {
// Send a ReviveOffers message to itself
Option(self).foreach(_.send(ReviveOffers))
}
}, 0, reviveIntervalMs, TimeUnit.MILLISECONDS)
}
override def receive: PartialFunction[Any, Unit] = {
case StatusUpdate(executorId, taskId, state, data) =>
scheduler.statusUpdate(taskId, state, data.value)
if (TaskState.isFinished(state)) {
executorDataMap.get(executorId) match {
case Some(executorInfo) =>
executorInfo.freeCores += scheduler.CPUS_PER_TASK
makeOffers(executorId)
case None =>
// Ignoring the update since we don't know about the executor.
logWarning(s"Ignored task status update ($taskId state $state) " +
s"from unknown executor with ID $executorId")
}
}
case ReviveOffers =>
makeOffers()
}
// Make resource offers describing the free resources on all alive executors
private def makeOffers() {
// Make sure no executor is killed while some task is launching on it
val taskDescs = CoarseGrainedSchedulerBackend.this.synchronized {
// Filter out executors under killing
val activeExecutors = executorDataMap.filterKeys(executorIsAlive)
val workOffers = activeExecutors.map { case (id, executorData) =>
new WorkerOffer(id, executorData.executorHost, executorData.freeCores)
}.toIndexedSeq
scheduler.resourceOffers(workOffers)
}
if (!taskDescs.isEmpty) {
launchTasks(taskDescs)
}
}
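To make the offer model concrete, here is a deliberately simplified, hypothetical sketch -- this is not Spark's resourceOffers logic. Each alive executor's free capacity is described as an offer, and the scheduler decides how many tasks fit into each offer.
// Hypothetical, simplified illustration of offer-based scheduling -- not Spark's actual algorithm.
case class Offer(executorId: String, host: String, freeCores: Int)

def assign(offers: Seq[Offer], cpusPerTask: Int, pendingTasks: Int): Seq[(String, Int)] = {
  var remaining = pendingTasks
  offers.map { o =>
    val fit = math.min(o.freeCores / cpusPerTask, remaining) // tasks this executor can accept
    remaining -= fit
    o.executorId -> fit
  }
}

// Example: two executors with 4 free cores each, 1 core per task, 6 pending tasks
// assign(Seq(Offer("exec-1", "hostA", 4), Offer("exec-2", "hostB", 4)), 1, 6)
//   -> Seq(("exec-1", 4), ("exec-2", 2))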
5. StandaloneAppClient
override def onStart(): Unit = {
try {
// Send the registration request to the Master
// The argument 1 means this is the first attempt; if registration fails, the counter is incremented and registerWithMaster is called again
// Once the attempt count reaches 3 (REGISTRATION_RETRIES), registration is abandoned
registerWithMaster(1)
} catch {
case e: Exception =>
logWarning("Failed to connect to master", e)
markDisconnected()
stop()
}
}
private def registerWithMaster(nthRetry: Int) {
registerMasterFutures.set(tryRegisterAllMasters())
registrationRetryTimer.set(registrationRetryThread.schedule(new Runnable {
override def run(): Unit = {
if (registered.get) {
registerMasterFutures.get.foreach(_.cancel(true))
registerMasterThreadPool.shutdownNow()
} else if (nthRetry >= REGISTRATION_RETRIES) {
markDead("All masters are unresponsive! Giving up.")
} else {
registerMasterFutures.get.foreach(_.cancel(true))
registerWithMaster(nthRetry + 1)
}
}
}, REGISTRATION_TIMEOUT_SECONDS, TimeUnit.SECONDS))
}
6. Master
// line 258
// In Master.receive, messages sent from the driver side are received and pattern-matched
case RegisterApplication(description, driver) =>
// TODO Prevent repeated registrations from some driver
if (state == RecoveryState.STANDBY) {
// ignore, don't send response
} else {
logInfo("Registering app " + description.name)
// Build the ApplicationInfo, wrapping the application description and the driver's RPC endpoint
val app = createApplication(description, driver)
// Register the application in the Master's internal bookkeeping
registerApplication(app)
logInfo("Registered app " + description.name + " with ID " + app.id)
// Persist the application's metadata with the persistence engine so it can be recovered after a Master failover
persistenceEngine.addApplication(app)
// Tell the driver side that registration is complete
driver.send(RegisteredApplication(app.id, self))
schedule()
}
Back in StandaloneAppClient, the RegisteredApplication reply from the Master is handled in receive:
override def receive: PartialFunction[Any, Unit] = {
case RegisteredApplication(appId_, masterRef) =>
// FIXME How to handle the following cases?
// 1. A master receives multiple registrations and sends back multiple
// RegisteredApplications due to an unstable network.
// 2. Receive multiple RegisteredApplication from different masters because the master is
// changing.
appId.set(appId_)
registered.set(true)
master = Some(masterRef)
listener.connected(appId.get)
// ... the remaining cases (e.g. ApplicationRemoved, ExecutorAdded) are omitted here
}