Spark Scheduler: Part 3, The DAGScheduler Implementation

This article takes a deep dive into how a Job executes in Apache Spark, focusing on how the DAGScheduler divides the work into Stages and schedules Task execution. Starting from SparkContext's runJob method, it analyzes in detail how the DAGScheduler and TaskScheduler are created, and how a JobWaiter monitors the Job's status.

Preface:

From the earlier parts of this series, we know that the DAGScheduler divides the DAG into Stages according to the RDD's compute logic. Each Stage can run a set of logically identical Tasks in parallel; the Tasks differ only in which partition of the data they operate on.

 

Now let's take a simple RDD count as an example and look at Spark's internal implementation.
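Before diving in, here is a minimal runnable driver for the example (a hypothetical demo, assuming a standard Spark dependency; the local[2] master and app name are made up for illustration):

import org.apache.spark.{SparkConf, SparkContext}

// A hypothetical, self-contained driver for the count example below.
object CountDemo extends App {
  val sc = new SparkContext(new SparkConf().setAppName("count-demo").setMaster("local[2]"))
  val rdd = sc.parallelize(1 to 100, 4)   // 4 partitions => 4 Tasks in the single Stage
  println(rdd.count())                    // action: triggers SparkContext#runJob => 100
  // Had the lineage contained a shuffle (e.g. rdd.map(x => (x % 10, 1)).reduceByKey(_ + _)),
  // the DAGScheduler would have split the DAG into two Stages at that boundary.
  sc.stop()
}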

 

1. SparkContext#runJob

def count(): Long = sc.runJob(this, Utils.getIteratorSize _).sum
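Here Utils.getIteratorSize simply walks a partition's iterator and counts its elements; runJob evaluates it on every partition and returns one Long per partition, which the driver then sums. A toy, driver-side model of that per-partition logic (not Spark code):

// Count each "partition" independently, then sum on the driver,
// mirroring what count() computes across the cluster.
def getIteratorSize[T](it: Iterator[T]): Long = {
  var count = 0L
  while (it.hasNext) { it.next(); count += 1 }
  count
}

val partitions = Seq(Seq(1, 2, 3), Seq(4, 5))                    // two toy partitions
val perPartitionCounts = partitions.map(p => getIteratorSize(p.iterator))
println(perPartitionCounts.sum)                                  // 5, like rdd.count()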

SparkContext actually defines many runJob overloads. They differ only in their parameters, but all of them ultimately call DAGScheduler#runJob:

dagScheduler.runJob(rdd, cleanedFunc, partitions, callSite, allowLocal, resultHandler)

DAGScheduler#runJob then starts processing the user-submitted Job, including dividing it into Stages and generating Tasks.

The TaskScheduler is created via SparkContext#createTaskScheduler, whereas the DAGScheduler is created by calling its constructor directly. Since the DAGScheduler keeps a reference to the TaskScheduler, it must be created after the TaskScheduler.
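Put together, the relevant lines of SparkContext's initialization look roughly like this (a paraphrased sketch of the Spark 1.x source, not verbatim):

// Paraphrased creation order inside SparkContext (Spark 1.x era):
// the TaskScheduler must exist before new DAGScheduler(this) runs,
// because DAGScheduler's constructor dereferences sc.taskScheduler.
private[spark] var taskScheduler = SparkContext.createTaskScheduler(this, master)
dagScheduler = new DAGScheduler(this)
taskScheduler.start()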

Zooming in on the DAGScheduler line in SparkContext:

dagScheduler = new DAGScheduler(this)

This constructor is implemented as:

def this(sc: SparkContext) = this(sc, sc.taskScheduler)

Following this(sc, sc.taskScheduler) into its implementation:

def this(sc: SparkContext, taskScheduler: TaskScheduler) = {
  this(
    sc,
    taskScheduler,
    sc.listenerBus,
    sc.env.mapOutputTracker.asInstanceOf[MapOutputTrackerMaster],
    sc.env.blockManager.master,
    sc.env
  )
}
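Both def this overloads above are ordinary Scala auxiliary constructors, each delegating toward the primary constructor. A self-contained toy illustration of the pattern (not Spark code):

// Every Scala auxiliary constructor must begin by calling another
// constructor of the same class, exactly as DAGScheduler does above.
class Scheduler(val name: String, val maxRetries: Int) {
  // Auxiliary constructor supplying a default, like this(sc, sc.taskScheduler).
  def this(name: String) = this(name, 3)
}

object SchedulerDemo extends App {
  val s = new Scheduler("dag")
  println(s"${s.name} retries=${s.maxRetries}")   // prints: dag retries=3
}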

this(sc, sc.taskScheduler) in turn calls the primary constructor below, which completes the creation of the DAGScheduler:

private[spark]
class DAGScheduler(
    private[scheduler] val sc: SparkContext,
    private[scheduler] val taskScheduler: TaskScheduler,
    listenerBus: LiveListenerBus,
    mapOutputTracker: MapOutputTrackerMaster, // output of shuffle map tasks; downstream tasks use it to locate shuffle data
    blockManagerMaster: BlockManagerMaster,   // manages Block information for the whole Job on the Driver side
    env: SparkEnv,
    clock: Clock = SystemClock)
  extends Logging {}

Notice that everything the DAGScheduler is initialized with is a data structure describing cluster state. It also creates an Actor, org.apache.spark.scheduler.DAGSchedulerEventProcessActor, stored in the variable eventProcessActor; this Actor's main responsibility is to handle the various messages the DAGScheduler sends to it. The Actor is built under a supervisor, DAGSchedulerActorSupervisor, whose receive simply instantiates whatever Props it is sent:

def receive = {
  case p: Props => sender ! context.actorOf(p)
  case _ => logWarning("received unknown message in DAGSchedulerActorSupervisor")
}

// How DAGSchedulerEventProcessActor is created
implicit val timeout = Timeout(30 seconds)
val initEventActorReply =
  dagSchedulerActorSupervisor ? Props(new DAGSchedulerEventProcessActor(this))
eventProcessActor = Await.result(initEventActorReply, timeout.duration).asInstanceOf[ActorRef]

It is thus the DAGSchedulerActorSupervisor that completes the creation of eventProcessActor. If the Actor fails, the supervisor cancels all of the DAGScheduler's jobs, stops the SparkContext, and finally exits the process.
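For readers unfamiliar with this pattern, here is a minimal, self-contained classic-Akka sketch of the same supervisor-creates-child handshake (a toy, assuming an Akka 2.x classic dependency; the EventProcessor name is made up for illustration):

import akka.actor._
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

// Toy child actor standing in for DAGSchedulerEventProcessActor.
class EventProcessor extends Actor {
  def receive = {
    case msg => println(s"processing event: $msg")
  }
}

// Toy supervisor mirroring DAGSchedulerActorSupervisor#receive:
// build whatever Props the caller sends and reply with the child's ActorRef.
class Supervisor extends Actor {
  def receive = {
    case p: Props => sender() ! context.actorOf(p)
    case _        => println("received unknown message")
  }
}

object SupervisorDemo extends App {
  implicit val timeout: Timeout = Timeout(30.seconds)
  val system     = ActorSystem("demo")
  val supervisor = system.actorOf(Props[Supervisor], "supervisor")

  // The same ask-then-Await handshake used to create eventProcessActor.
  val reply = supervisor ? Props[EventProcessor]
  val child = Await.result(reply, timeout.duration).asInstanceOf[ActorRef]

  child ! "JobSubmitted"   // the child now handles messages
  system.terminate()
}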

2. The Concrete Execution Flow of count

Back to the main topic: the sections above covered the DAGScheduler's internal creation mechanism, how SparkContext is driven when we call count, and how the DAGScheduler and TaskScheduler are created.

Setting the per-method details aside, the overall execution flow of count is as follows:

(Figure: the end-to-end execution flow of count, from RDD#count through SparkContext#runJob to DAGScheduler#runJob and submitJob.)

The figure above gives the big picture of how count executes. Steps 3 to 4 are implemented like this:

val waiter = submitJob(rdd, func, partitions, callSite, allowLocal, resultHandler)
waiter.awaitResult() match {
  case JobSucceeded =>
    logInfo("Job %d finished: %s, took %f s".format(
      waiter.jobId, callSite.shortForm, (System.nanoTime - start) / 1e9))
  case JobFailed(exception: Exception) =>
    logInfo("Job %d failed: %s, took %f s".format(
      waiter.jobId, callSite.shortForm, (System.nanoTime - start) / 1e9))
}

The submitJob call above first generates a Job ID for this Job and creates a JobWaiter instance to monitor how the Job executes:

val waiter = new JobWaiter(this, jobId, partitions.size, resultHandler)

The JobWaiter tracks the Job's execution status. A Job consists of many Tasks, so the Job is marked as succeeded only when every one of its Tasks has completed successfully; as soon as any Task fails, the whole Job is marked as failed, which the DAGScheduler does by calling org.apache.spark.scheduler.JobWaiter#jobFailed.
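That contract can be modeled in a few lines. The following is a toy sketch of the semantics only (not the Spark implementation), using Spark-like names:

// Toy model of JobWaiter's contract: success only after ALL tasks
// finish; any single failure fails the whole Job.
sealed trait ToyJobResult
case object ToyJobSucceeded extends ToyJobResult
case class ToyJobFailed(e: Exception) extends ToyJobResult

class ToyJobWaiter(totalTasks: Int) {
  private var finished = 0
  private var failure: Option[Exception] = None

  def taskSucceeded(): Unit = synchronized {
    finished += 1
    if (finished == totalTasks) notifyAll()   // last task completes the Job
  }

  def jobFailed(e: Exception): Unit = synchronized {
    failure = Some(e)                         // one failure fails the Job
    notifyAll()
  }

  // Blocks the caller (as DAGScheduler#runJob does) until the Job resolves.
  def awaitResult(): ToyJobResult = synchronized {
    while (finished < totalTasks && failure.isEmpty) wait()
    failure.map(ToyJobFailed(_)).getOrElse(ToyJobSucceeded)
  }
}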

Finally, the DAGScheduler submits the Job to eventProcessActor:

eventProcessActor ! JobSubmitted()
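On the receiving side, the actor's receive loop dispatches the event back into the DAGScheduler, which then begins Stage division. Paraphrased from the Spark 1.x source (simplified, argument list abbreviated, not verbatim):

// Inside DAGSchedulerEventProcessActor (paraphrased sketch):
def receive = {
  case JobSubmitted(jobId, rdd, func, partitions, allowLocal, callSite, listener, properties) =>
    dagScheduler.handleJobSubmitted(jobId, rdd, func, partitions, allowLocal, callSite, listener, properties)
  // ... other DAGSchedulerEvents (task completion, failures, cancellation) ...
}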

 
