When an action is triggered on an RDD (count is used as the example here), the call chain is as follows:
- org.apache.spark.rdd.RDD#count
- org.apache.spark.SparkContext#runJob
- org.apache.spark.scheduler.DAGScheduler#runJob
- org.apache.spark.scheduler.DAGScheduler#submitJob
- org.apache.spark.scheduler.DAGSchedulerEventProcessActor#receive(JobSubmitted)
- org.apache.spark.scheduler.DAGScheduler#handleJobSubmitted
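
The hand-off in this chain, where the caller posts a JobSubmitted event and a separate event processor picks it up and calls handleJobSubmitted, can be sketched roughly as below. This is a minimal illustration and not Spark's actual code: `MiniDagScheduler`, `JobWaiter`, and `Stop` are hypothetical names, a plain blocking queue and thread stand in for the actor, and the "job" simply counts the elements of a local `Seq`.

```scala
import java.util.concurrent.{CountDownLatch, LinkedBlockingQueue}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical, simplified stand-ins; these are not Spark's real classes.
sealed trait SchedulerEvent
final case class JobSubmitted(jobId: Int, data: Seq[Int], waiter: JobWaiter) extends SchedulerEvent
case object Stop extends SchedulerEvent

// Lets the submitting thread block until the event loop has finished the job.
final class JobWaiter {
  private val done = new CountDownLatch(1)
  @volatile private var result: Long = 0L
  def complete(value: Long): Unit = { result = value; done.countDown() }
  def await(): Long = { done.await(); result }
}

final class MiniDagScheduler {
  private val events = new LinkedBlockingQueue[SchedulerEvent]()
  private val nextJobId = new AtomicInteger(0)

  // Name of the thread that handled the most recent job (for illustration only).
  @volatile var lastHandlerThread: String = ""

  // Plays the role of the event-processing actor: a single thread drains the queue.
  private val loop = new Thread(new Runnable {
    def run(): Unit = {
      var running = true
      while (running) {
        events.take() match {
          case JobSubmitted(id, data, waiter) => handleJobSubmitted(id, data, waiter)
          case Stop                           => running = false
        }
      }
    }
  }, "mini-dag-event-loop")
  loop.setDaemon(true)
  loop.start()

  // Entry point, analogous to RDD#count delegating to runJob.
  def count(data: Seq[Int]): Long = runJob(data)

  // Submits the job and blocks until it completes.
  def runJob(data: Seq[Int]): Long = submitJob(data).await()

  // Only enqueues an event; the work itself happens on the event-loop thread.
  def submitJob(data: Seq[Int]): JobWaiter = {
    val waiter = new JobWaiter
    events.put(JobSubmitted(nextJobId.getAndIncrement(), data, waiter))
    waiter
  }

  // Runs on the event-loop thread. Here it just counts the elements.
  private def handleJobSubmitted(jobId: Int, data: Seq[Int], waiter: JobWaiter): Unit = {
    lastHandlerThread = Thread.currentThread().getName
    waiter.complete(data.size.toLong)
  }

  def stop(): Unit = events.put(Stop)
}
```

Calling `new MiniDagScheduler().count(Seq(1, 2, 3))` returns 3, and `lastHandlerThread` shows that the job was handled on the event-loop thread instead of the caller's thread, which is the point of routing the submission through an event.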
The DAGSchedulerEventProcessActor in step five is the interface proxy through which DAGScheduler interacts with the outside world, DA