1. prepareForExecution
The prepareForExecution value of SQLContext is an instance of an anonymous subclass of RuleExecutor[SparkPlan], used to transform the physical plan before execution. Its definition is:
/**
 * Prepares a planned SparkPlan for execution by inserting shuffle operations and internal
 * row format conversions as needed.
 */
@transient
protected[sql] val prepareForExecution = new RuleExecutor[SparkPlan] {
  val batches = Seq(
    Batch("Add exchange", Once, EnsureRequirements(self)),
    Batch("Add row converters", Once, EnsureRowFormats)
  )
}
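For orientation, a RuleExecutor runs its batches in order, and each batch applies its rules either Once or repeatedly until a fixed point. Below is a minimal sketch of that idea; it is a hypothetical simplification, not Spark's actual RuleExecutor, which additionally tracks iteration limits and logging:
// Hypothetical, simplified model of the RuleExecutor idea
abstract class SimpleRuleExecutor[Plan] {
  // once = true corresponds to Spark's Once strategy, false to FixedPoint
  case class Batch(name: String, once: Boolean, rules: Seq[Plan => Plan])
  def batches: Seq[Batch]

  def execute(plan: Plan): Plan = batches.foldLeft(plan) { (current, batch) =>
    // One pass applies the batch's rules in order
    def onePass(p: Plan): Plan = batch.rules.foldLeft(p)((q, rule) => rule(q))
    if (batch.once) {
      onePass(current)
    } else {
      // Keep applying the batch until the plan stops changing
      var prev = current
      var next = onePass(prev)
      while (next != prev) { prev = next; next = onePass(prev) }
      next
    }
  }
}
Note that prepareForExecution uses the Once strategy for both batches, so each rule gets exactly one pass over the plan.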
EnsureRequirements inserts an Exchange physical plan between a parent physical plan and its child; the inserted Exchange is what introduces the shuffle operation. Why insert an Exchange physical plan, i.e., what purpose does the shuffle serve?
2. Example
Given the following SQL statement, the physical plan will use SortMergeJoin to perform the join:
val df = sqlContext.sql("select * from TBL_STUDENT a join TBL_CLASS b where a.classId = b.classId")
df.show
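For completeness, the two temp tables can be registered as follows. This is a hypothetical setup (sc is the SparkContext), with the schemas inferred from the attributes (id, name, classId, age and classId, className) that appear in the plan output below:
// Hypothetical setup; column types are assumptions based on the analyzed plan
case class Student(id: String, name: String, classId: String, age: Int)
case class Clazz(classId: String, className: String)

import sqlContext.implicits._
sc.parallelize(Seq(Student("1", "Tom", "c1", 18))).toDF().registerTempTable("TBL_STUDENT")
sc.parallelize(Seq(Clazz("c1", "Math"))).toDF().registerTempTable("TBL_CLASS")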
The execution plans produced for this statement are shown below. The final Prepared Physical Plan shows that a TungstenSort and a TungstenExchange were inserted between SortMergeJoin and each PhysicalRDD, with the TungstenExchange as the child of the TungstenSort (the ConvertToUnsafe nodes come from the "Add row converters" batch):
== Parsed Logical Plan ==
'nodeName: <Project>, argString:< [unresolvedalias(*)]>
  'nodeName: <Filter>, argString:< (UnresolvedAttribute: 'a.classId = UnresolvedAttribute: 'b.classId)>
    'nodeName: <Join>, argString:< Inner, None>
      'nodeName: <UnresolvedRelation>, argString:< [TBL_STUDENT], Some(a)>
      'nodeName: <UnresolvedRelation>, argString:< [TBL_CLASS], Some(b)>
== Analyzed Logical Plan ==
id: string, name: string, classId: string, age: int, classId: string, className: string
nodeName: <Project>, argString:< [AttributeReference:id#0,AttributeReference:name#1,AttributeReference:classId#2,AttributeReference:age#3,AttributeReference:classId#4,AttributeReference:className#5]>
  nodeName: <Filter>, argString:< (AttributeReference:classId#2 = AttributeReference:classId#4)>
    nodeName: <Join>, argString:< Inner, None>
      nodeName: <Subquery>, argString:< a>
        nodeName: <Subquery>, argString:< TBL_STUDENT>
          nodeName: <LogicalRDD>, argString:< [AttributeReference:id#0,AttributeReference:name#1,AttributeReference:classId#2,AttributeReference:age#3], MapPartitionsRDD[4] at main at NativeMethodAccessorImpl.java:-2>
      nodeName: <Subquery>, argString:< b>
        nodeName: <Subquery>, argString:< TBL_CLASS>
          nodeName: <LogicalRDD>, argString:< [AttributeReference:classId#4,AttributeReference:className#5], MapPartitionsRDD[9] at main at NativeMethodAccessorImpl.java:-2>
== Optimized Logical Plan ==
nodeName: <Project>, argString:< [AttributeReference:id#0,AttributeReference:name#1,AttributeReference:classId#2,AttributeReference:age#3,AttributeReference:classId#4,AttributeReference:className#5]>
  nodeName: <Join>, argString:< Inner, Some((AttributeReference:classId#2 = AttributeReference:classId#4))>
    nodeName: <LogicalRDD>, argString:< [AttributeReference:id#0,AttributeReference:name#1,AttributeReference:classId#2,AttributeReference:age#3], MapPartitionsRDD[4] at main at NativeMethodAccessorImpl.java:-2>
    nodeName: <LogicalRDD>, argString:< [AttributeReference:classId#4,AttributeReference:className#5], MapPartitionsRDD[9] at main at NativeMethodAccessorImpl.java:-2>
== Not Prepared Physical Plan ==
nodeName: <TungstenProject>, argString:< [AttributeReference:id#0,AttributeReference:name#1,AttributeReference:classId#2,AttributeReference:age#3,AttributeReference:classId#4,AttributeReference:className#5]>
  nodeName: <SortMergeJoin>, argString:< [AttributeReference:classId#2], [AttributeReference:classId#4]>
    Scan PhysicalRDD[AttributeReference:id#0,AttributeReference:name#1,AttributeReference:classId#2,AttributeReference:age#3]
    Scan PhysicalRDD[AttributeReference:classId#4,AttributeReference:className#5]
== Prepared Physical Plan ==
nodeName: <TungstenProject>, argString:< [AttributeReference:id#0,AttributeReference:name#1,AttributeReference:classId#2,AttributeReference:age#3,AttributeReference:classId#4,AttributeReference:className#5]>
  nodeName: <SortMergeJoin>, argString:< [AttributeReference:classId#2], [AttributeReference:classId#4]>
    nodeName: <TungstenSort>, argString:< [AttributeReference:classId#2 ASC], false, 0>
      nodeName: <TungstenExchange>, argString:< hashpartitioning(AttributeReference:classId#2)>
        nodeName: <ConvertToUnsafe>, argString:< >
          Scan PhysicalRDD[AttributeReference:id#0,AttributeReference:name#1,AttributeReference:classId#2,AttributeReference:age#3]
    nodeName: <TungstenSort>, argString:< [AttributeReference:classId#4 ASC], false, 0>
      nodeName: <TungstenExchange>, argString:< hashpartitioning(AttributeReference:classId#4)>
        nodeName: <ConvertToUnsafe>, argString:< >
          Scan PhysicalRDD[AttributeReference:classId#4,AttributeReference:className#5]
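The "Not Prepared" and "Prepared" labels above are custom annotations. With the stock Spark 1.5 API, the two stages can be inspected roughly like this:
// sparkPlan is the physical plan before prepareForExecution runs;
// executedPlan is the result after the "Add exchange" / "Add row converters" batches
println(df.queryExecution.sparkPlan)
println(df.queryExecution.executedPlan)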
3. Motivation for inserting the Exchange physical plan
First look at the class comment of the EnsureRequirements case class, which explains why Exchange is inserted:
/**
* Ensures that the [[org.apache.spark.sql.catalyst.plans.physical.Partitioning Partitioning]]
* of input data meets the
* [[org.apache.spark.sql.catalyst.plans.physical.Distribution Distribution]] requirements for
* each operator by inserting [[Exchange]] Operators where required. Also ensure that the
* input partition ordering requirements are met.
*/
private[sql] case class EnsureRequirements(sqlContext: SQLContext) extends Rule[SparkPlan] {
The class comment of EnsureRequirements explains two points:
a. why Exchange is inserted; in Tungsten mode this is a TungstenExchange
b. why a Sort operator is inserted; depending on the configuration a different sort operator is chosen, e.g. TungstenSort, ExternalSort, or Sort (see the sketch below)
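The choice of sort operator is made by the planner's getSortOperator. The following is paraphrased from the Spark 1.5 source, so treat the exact conditions as a sketch:
def getSortOperator(sortExprs: Seq[SortOrder], global: Boolean, child: SparkPlan): SparkPlan = {
  if (sqlContext.conf.unsafeEnabled && sqlContext.conf.codegenEnabled &&
      TungstenSort.supportsSchema(child.schema)) {
    execution.TungstenSort(sortExprs, global, child)  // Tungsten mode
  } else if (sqlContext.conf.externalSortEnabled) {
    execution.ExternalSort(sortExprs, global, child)
  } else {
    execution.Sort(sortExprs, global, child)
  }
}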
Why insert the Exchange physical plan?
Suppose physical plan A has children B and C (think of A as a SortMergeJoin and of B and C as two PhysicalRDDs). A requires the data distribution (Distribution) of its children to meet certain requirements, declared through SparkPlan's requiredChildDistribution method. SortMergeJoin requires both children to have a ClusteredDistribution on the join keys:
override def requiredChildDistribution: Seq[Distribution] =
  ClusteredDistribution(leftKeys) :: ClusteredDistribution(rightKeys) :: Nil
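SortMergeJoin likewise declares a required ordering on each child, ascending on the join keys. The following is quoted from the Spark 1.5 source from memory, so treat it as a sketch:
override def requiredChildOrdering: Seq[Seq[SortOrder]] =
  requiredOrders(leftKeys) :: requiredOrders(rightKeys) :: Nil

private def requiredOrders(keys: Seq[Expression]): Seq[SortOrder] =
  keys.map(SortOrder(_, Ascending))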
Each child in turn declares how its output data is partitioned, via SparkPlan's outputPartitioning method. For the two PhysicalRDDs B and C, the definition is:
// TODO: Move to `DistributedPlan`
/** Specifies how data is partitioned across different nodes in the cluster. */
def outputPartitioning: Partitioning = UnknownPartitioning(0) // TODO: WRONG WIDTH!
This definition of outputPartitioning actually lives in SparkPlan; PhysicalRDD simply inherits it, so a PhysicalRDD's outputPartitioning is UnknownPartitioning, i.e. the partitioning is unknown.
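The mismatch is easy to verify against the catalyst classes directly; a small illustration, assuming the Spark 1.5 catalyst internals are on the classpath:
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.physical._
import org.apache.spark.sql.types.StringType

val classId = AttributeReference("classId", StringType)()
// UnknownPartitioning satisfies only UnspecifiedDistribution, so a PhysicalRDD
// child fails SortMergeJoin's ClusteredDistribution requirement:
UnknownPartitioning(0).satisfies(ClusteredDistribution(classId :: Nil))                 // false
// After the Exchange, HashPartitioning on the same key does satisfy it:
HashPartitioning(classId :: Nil, 200).satisfies(ClusteredDistribution(classId :: Nil))  // true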
The outputPartitioning of B and C (the children) must satisfy the requiredChildDistribution of A (their parent). The check is implemented in ensureDistributionAndOrdering:
// Get the operator's (e.g. SortMergeJoin's) required child distributions
val requiredChildDistributions: Seq[Distribution] = operator.requiredChildDistribution
// Get the operator's (e.g. SortMergeJoin's) required child orderings
val requiredChildOrderings: Seq[Seq[SortOrder]] = operator.requiredChildOrdering
var children: Seq[SparkPlan] = operator.children
// Ensure that the operator's children satisfy their output distribution requirements:
// for each child of SortMergeJoin, check its outputPartitioning against the required distribution
children = children.zip(requiredChildDistributions).map { case (child, distribution) =>
  val o = child.outputPartitioning
  val s = o.satisfies(distribution)
  if (s) {
    child
  } else {
    val p = canonicalPartitioning(distribution)
    Exchange(p, child) // not satisfied: insert an Exchange
  }
}
4. Source code walkthrough of EnsureRequirements
/**
* Ensures that the [[org.apache.spark.sql.catalyst.plans.physical.Partitioning Partitioning]]
* of input data meets the
* [[org.apache.spark.sql.catalyst.plans.physical.Distribution Distribution]] requirements for
* each operator by inserting [[Exchange]] Operators where required. Also ensure that the
* input partition ordering requirements are met.
*/
private[sql] case class EnsureRequirements(sqlContext: SQLContext) extends Rule[SparkPlan] {
  // TODO: Determine the number of partitions.
  private def numPartitions: Int = sqlContext.conf.numShufflePartitions

  /**
   * Given a required distribution, returns a partitioning that satisfies that distribution.
   */
  private def canonicalPartitioning(requiredDistribution: Distribution): Partitioning = {
    requiredDistribution match {
      case AllTuples => SinglePartition // a single partition
      case ClusteredDistribution(clustering) => HashPartitioning(clustering, numPartitions)
      case OrderedDistribution(ordering) => RangePartitioning(ordering, numPartitions)
      case dist => sys.error(s"Do not know how to satisfy distribution $dist")
    }
  }
  /**
   * Transforms a single physical plan node, inserting Exchange and Sort
   * operators below it where its requirements are not met.
   */
  private def ensureDistributionAndOrdering(operator: SparkPlan): SparkPlan = {
    // Author's debugging hook: a convenient place for a breakpoint
    // when the operator is a SortMergeJoin
    if (operator.isInstanceOf[SortMergeJoin]) {
      println()
    }
    // The distributions this operator (e.g. SortMergeJoin) requires of its children
    val requiredChildDistributions: Seq[Distribution] = operator.requiredChildDistribution
    // The orderings this operator requires of its children; for SortMergeJoin,
    // ascending order on the left and right join keys
    val requiredChildOrderings: Seq[Seq[SortOrder]] = operator.requiredChildOrdering
    // The operator's children; for SortMergeJoin, the two PhysicalRDDs
    var children: Seq[SparkPlan] = operator.children

    // Ensure that the operator's children satisfy their output distribution requirements:
    // zip the children with the required distributions and check each pair
    children = children.zip(requiredChildDistributions).map {
      case (child, distribution) =>
        // The child's actual output partitioning
        val o = child.outputPartitioning
        // Does the child's partitioning satisfy the required distribution? If not,
        // an Exchange is inserted. If child is a PhysicalRDD, its outputPartitioning
        // is UnknownPartitioning while distribution is ClusteredDistribution, so s is false
        val s = o.satisfies(distribution)
        if (s) {
          child
        } else {
          // Derive a partitioning from the distribution;
          // for ClusteredDistribution this is a HashPartitioning
          val p = canonicalPartitioning(distribution)
          Exchange(p, child)
        }
    }
    // If the operator has multiple children and specifies child output distributions (e.g. join),
    // then the children's output partitionings must be compatible:
    val a = children.length
    val b = requiredChildDistributions.toSet != Set(UnspecifiedDistribution)
    val ops = children.map(_.outputPartitioning)
    val c = !Partitioning.allCompatible(ops)
    if (a > 1 && b && c) {
      children = children.zip(requiredChildDistributions).map {
        case (child, distribution) =>
          val targetPartitioning = canonicalPartitioning(distribution)
          val op = child.outputPartitioning
          val d = op.guarantees(targetPartitioning)
          if (d) {
            child
          } else {
            Exchange(targetPartitioning, child)
          }
      }
    }
    // Now that we've performed any necessary shuffles, add sorts to guarantee output orderings:
    children = children.zip(requiredChildOrderings).map {
      case (child, requiredOrdering) =>
        if (requiredOrdering.nonEmpty) { // non-empty for SortMergeJoin
          // If child.outputOrdering is [a, b] and requiredOrdering is [a], we do not need to sort.
          // For a PhysicalRDD, outputOrdering is empty, so minSize is 0
          val minSize = Seq(requiredOrdering.size, child.outputOrdering.size).min
          // One of the orderings is empty, or neither is a prefix of the other
          if (minSize == 0 || requiredOrdering.take(minSize) != child.outputOrdering.take(minSize)) {
            // Returns a sort physical plan (a TungstenSort here); if child is a
            // TungstenExchange, that TungstenExchange becomes the TungstenSort's child
            val sortPlan = sqlContext.planner.BasicOperators.getSortOperator(requiredOrdering, global = false, child)
            sortPlan
          } else {
            child
          }
        } else {
          child
        }
    }
    // Returns a copy of this node with the children replaced.
    val v = operator.withNewChildren(children)
    v
  }
  /**
   * EnsureRequirements is a Rule[SparkPlan], so the RuleExecutor drives the
   * transformation through this apply method, which in turn calls
   * ensureDistributionAndOrdering on every node of the plan, bottom-up.
   */
  def apply(plan: SparkPlan): SparkPlan = plan.transformUp {
    case operator: SparkPlan => ensureDistributionAndOrdering(operator)
  }
}
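Putting the pieces together, the rule can also be applied by hand to the unprepared physical plan; a usage sketch, assuming the Spark 1.5 API:
// Applying EnsureRequirements directly; the result matches
// queryExecution.executedPlan up to the row-format conversion batch
val prepared = EnsureRequirements(sqlContext).apply(df.queryExecution.sparkPlan)
println(prepared)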