This post picks up from the source reading of CoarseGrainedSchedulerBackend.start, where a DriverEndpoint is created; that involves RpcEndpoint, so let's take a closer look at RpcEndpoint.
// CoarseGrainedSchedulerBackend.scala line 303
protected def createDriverEndpoint(properties: Seq[(String, String)]): DriverEndpoint = {
  new DriverEndpoint(rpcEnv, properties)
}
// CoarseGrainedSchedulerBackend.scala line 81
class DriverEndpoint(override val rpcEnv: RpcEnv, sparkProperties: Seq[(String, String)])
extends ThreadSafeRpcEndpoint with Logging
// org.apache.spark.rpc.RpcEndpoint.scala line 148
private[spark] trait ThreadSafeRpcEndpoint extends RpcEndpoint
// org.apache.spark.rpc.RpcEndpoint.scala line 46
/**
* An end point for the RPC that defines what functions to trigger given a message.
*
* It is guaranteed that `onStart`, `receive` and `onStop` will be called in sequence.
*
* The life-cycle of an endpoint is:
*
* constructor -> onStart -> receive* -> onStop
*
* Note: `receive` can be called concurrently. If you want `receive` to be thread-safe, please use
* [[ThreadSafeRpcEndpoint]]
*
* If any error is thrown from one of [[RpcEndpoint]] methods except `onError`, `onError` will be
* invoked with the cause. If `onError` throws an error, [[RpcEnv]] will ignore it.
*/
private[spark] trait RpcEndpoint
The comments above make the role of RpcEndpoint quite clear.
An RpcEndpoint defines, for RPC (Remote Procedure Call), which method gets triggered when a message arrives.
They also spell out the life cycle: constructor -> onStart -> receive* -> onStop.
There is one more key member: self.
A note on receive*: it covers both receive and receiveAndReply.
The difference between them: receive handles one-way messages that need no reply, while receiveAndReply handles messages sent via ask, where the sender expects an answer (the caller gets a Future that completes once the endpoint replies); a small sketch follows the code excerpts below.
An analogy: when a takeout restaurant takes an order over the phone, the customer does not eat right away; only after the food is cooked and delivered can it be eaten. [The message only has to be received; no immediate feedback is required.]
With dine-in, on the other hand, once the customer orders, the dish has to be prepared and served right away.
As for self: when an endpoint is created, both the receiving and the sending roles are already set up. The receiver is the RpcEndpoint, and the sender is its RpcEndpointRef, which can be obtained through the self method shown below.
// org.apache.spark.rpc.RpcEndpoint.scala line 60
final def self: RpcEndpointRef = {
  require(rpcEnv != null, "rpcEnv has not been initialized")
  rpcEnv.endpointRef(this)
}
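To make self concrete, here is a minimal sketch (not from the Spark source; SelfPing and SelfPingEndpoint are made-up names, and it assumes the code sits somewhere the private[spark] RPC API is visible): inside an endpoint, self hands back the RpcEndpointRef that the RpcEnv registered for this endpoint, so the endpoint can send messages to itself.
// Hypothetical example: an endpoint that uses self to address itself.
case object SelfPing

class SelfPingEndpoint(override val rpcEnv: RpcEnv) extends ThreadSafeRpcEndpoint with Logging {
  override def onStart(): Unit = self.send(SelfPing)   // self is this endpoint's own RpcEndpointRef

  override def receive: PartialFunction[Any, Unit] = {
    case SelfPing => logInfo("got the message that was sent to self")
  }
}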
// org.apache.spark.rpc.RpcEndpoint.scala line 65
/**
* Process messages from [[RpcEndpointRef.send]] or [[RpcCallContext.reply)]]. If receiving a
* unmatched message, [[SparkException]] will be thrown and sent to `onError`.
*/
def receive: PartialFunction[Any, Unit] = {
  case _ => throw new SparkException(self + " does not implement 'receive'")
}
/**
* Process messages from [[RpcEndpointRef.ask]]. If receiving a unmatched message,
* [[SparkException]] will be thrown and sent to `onError`.
*/
def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
  case _ => context.sendFailure(new SparkException(self + " won't reply anything"))
}
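Putting the life cycle and the two callbacks together, here is a hedged sketch of a custom endpoint (again not from the Spark source; EchoEndpoint, Ping and Echo are hypothetical names): messages delivered via send land in receive, messages delivered via ask land in receiveAndReply, and onStart/onStop bracket them.
case object Ping
case class Echo(text: String)

class EchoEndpoint(override val rpcEnv: RpcEnv) extends ThreadSafeRpcEndpoint with Logging {
  override def onStart(): Unit = logInfo("EchoEndpoint started")   // called once, before any message

  override def receive: PartialFunction[Any, Unit] = {
    case Ping => logInfo("got Ping, nothing to reply")             // one-way messages from send
  }

  override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
    case Echo(text) => context.reply("echo: " + text)              // messages from ask; the reply goes back to the caller
  }

  override def onStop(): Unit = logInfo("EchoEndpoint stopped")    // called once, after the last message
}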
Since RpcEndpoint is the receiving side, let's now look at the sending side.
// RpcEndpointRef.scala line 26
/**
* A reference for a remote [[RpcEndpoint]]. [[RpcEndpointRef]] is thread-safe.
*/
private[spark] abstract class RpcEndpointRef(conf: SparkConf)
  extends Serializable with Logging {
// line 46
/**
* Sends a one-way asynchronous message. Fire-and-forget semantics.
*/
def send(message: Any): Unit
// line 54
/**
* Send a message to the corresponding [[RpcEndpoint.receiveAndReply)]] and return a [[Future]] to
* receive the reply within the specified timeout.
*
* This method only sends the message once and never retries.
*/
def ask[T: ClassTag](message: Any, timeout: RpcTimeout): Future[T]
// line 62
/**
* Send a message to the corresponding [[RpcEndpoint.receiveAndReply)]] and return a [[Future]] to
* receive the reply within a default timeout.
*
* This method only sends the message once and never retries.
*/
def ask[T: ClassTag](message: Any): Future[T] = ask(message, defaultAskTimeout)
// line 93
def askWithRetry[T: ClassTag](message: Any, timeout: RpcTimeout): T = {
  // ...
}
}
Sending mirrors receiving: there are send and ask. As the names suggest, send just fires a message off, while ask expects a result. ask also comes with a retry mechanism; see askWithRetry.
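To connect the two sides, here is a hedged sketch of the caller (the EchoEndpoint, Ping and Echo names come from the sketch above; the construction of rpcEnv and the scala.concurrent.Future import are assumed):
// Registering the endpoint with rpcEnv.setupEndpoint returns its RpcEndpointRef, i.e. the sending side.
val ref: RpcEndpointRef = rpcEnv.setupEndpoint("echo", new EchoEndpoint(rpcEnv))

ref.send(Ping)                                             // fire-and-forget, handled by receive

val reply: Future[String] = ref.ask[String](Echo("hi"))    // handled by receiveAndReply, returns a Future
// val answer: String = ref.askWithRetry[String](Echo("hi"))  // blocking variant that retries on failure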
To be continued.