Spark ListenerBus
A system often needs to handle listener events asynchronously, and listeners help decouple its components. ListenerBus is the listener bus.
ListenerBus trait
/**
 * An event bus which posts events to its listeners.
 */
// L is the listener type; E is the type of event delivered to the listeners.
private[spark] trait ListenerBus[L <: AnyRef, E] extends Logging {
  // A CopyOnWriteArrayList stores the listeners. It suits read-heavy,
  // write-light workloads, which is the access pattern here.
  private[this] val listenersPlusTimers = new CopyOnWriteArrayList[(L, Option[Timer])]
  // ...
}
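To make the role of the two type parameters concrete, here is a minimal sketch of a generic listener bus built on CopyOnWriteArrayList. It is not Spark's code: SimpleListenerBus, CollectingListener, StringBus and the String event type are hypothetical names chosen for illustration, and the timers stored alongside each listener in the real trait are left out.

```scala
import java.util.concurrent.CopyOnWriteArrayList
import scala.collection.mutable.ArrayBuffer

// Hypothetical, simplified bus: L is the listener type, E the event type.
trait SimpleListenerBus[L <: AnyRef, E] {
  // Iteration works on a snapshot of the list, so listeners can be added or
  // removed while another thread is posting events.
  private val listeners = new CopyOnWriteArrayList[L]()

  def addListener(listener: L): Unit = listeners.add(listener)

  def removeListener(listener: L): Unit = listeners.remove(listener)

  // Deliver the event to every registered listener, in registration order.
  def postToAll(event: E): Unit = {
    val it = listeners.iterator()
    while (it.hasNext) {
      doPostEvent(it.next(), event)
    }
  }

  // Subclasses decide how a listener consumes an event.
  protected def doPostEvent(listener: L, event: E): Unit
}

// Example listener and bus, for illustration only.
class CollectingListener {
  val seen = ArrayBuffer.empty[String]
  def onMessage(msg: String): Unit = seen += msg
}

class StringBus extends SimpleListenerBus[CollectingListener, String] {
  override protected def doPostEvent(listener: CollectingListener, event: String): Unit =
    listener.onMessage(event)
}

object SimpleListenerBusDemo {
  def main(args: Array[String]): Unit = {
    val bus = new StringBus
    val a = new CollectingListener
    val b = new CollectingListener
    bus.addListener(a)
    bus.addListener(b)
    bus.postToAll("job started")
    bus.removeListener(b)
    bus.postToAll("job finished")
    println(a.seen.toList) // a received both events
    println(b.seen.toList) // b received only the first event
  }
}
```

Every write to a CopyOnWriteArrayList copies the backing array, which is acceptable here because listeners are registered rarely while events are posted constantly.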
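SparkListenerInterface trait
This interface is the listener type L (the generic parameter) of ListenerBus, and it declares a callback for each event to be handled.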
/**
 * Interface for listening to events from the Spark scheduler. Most applications should probably
 * extend SparkListener or SparkFirehoseListener directly, rather than implementing this class.
 *
 * Note that this is an internal interface which might change in different Spark releases.
 */
private[spark] trait SparkListenerInterface {

  /**
   * Called when a stage completes successfully or fails, with information on the completed stage.
   */
  def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit
  // ...
}
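SparkListenerBus trait: it extends ListenerBus and serves only SparkListenerInterface listeners and their SparkListenerEvent events.
It implements doPostEvent, which routes each event to the matching listener callback.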
/**
 * A [[SparkListenerEvent]] bus that relays [[SparkListenerEvent]]s to its listeners
 */
private[spark] trait SparkListenerBus
  extends ListenerBus[SparkListenerInterface, SparkListenerEvent] {

  protected override def doPostEvent(
      listener: SparkListenerInterface,
      event: SparkListenerEvent): Unit = {
    event match {
      // Call a different listener method depending on the event type.
      case stageSubmitted: SparkListenerStageSubmitted =>
        listener.onStageSubmitted(stageSubmitted)
      // ...
    }
  }
}
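The dispatch in doPostEvent is an ordinary pattern match on the event's runtime type. The sketch below reproduces that idea in isolation. DemoEvent, StageSubmitted, StageCompleted, DemoListener and DemoDispatcher are hypothetical stand-ins rather than Spark classes, and the no-op default callbacks are an assumption that lets a listener override only what it needs.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical event hierarchy standing in for SparkListenerEvent.
sealed trait DemoEvent
final case class StageSubmitted(stageId: Int) extends DemoEvent
final case class StageCompleted(stageId: Int, succeeded: Boolean) extends DemoEvent

// Hypothetical listener interface with one callback per event type.
// The default bodies do nothing, so implementations override selectively.
trait DemoListener {
  def onStageSubmitted(event: StageSubmitted): Unit = ()
  def onStageCompleted(event: StageCompleted): Unit = ()
}

object DemoDispatcher {
  // Route the event to the callback that matches its runtime type.
  def doPostEvent(listener: DemoListener, event: DemoEvent): Unit = event match {
    case e: StageSubmitted => listener.onStageSubmitted(e)
    case e: StageCompleted => listener.onStageCompleted(e)
  }
}

// A listener that cares only about completed stages.
class CompletionLogger extends DemoListener {
  val log = ArrayBuffer.empty[String]
  override def onStageCompleted(event: StageCompleted): Unit =
    log += s"stage ${event.stageId} succeeded=${event.succeeded}"
}

object DispatchDemo {
  def main(args: Array[String]): Unit = {
    val listener = new CompletionLogger
    DemoDispatcher.doPostEvent(listener, StageSubmitted(1))                  // ignored by this listener
    DemoDispatcher.doPostEvent(listener, StageCompleted(1, succeeded = true)) // recorded
    listener.log.foreach(println)
  }
}
```

Because DemoEvent is sealed, the compiler can warn when a new event type is added without a matching case.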
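AsyncEventQueue class, which implements SparkListenerBus
It manages events with a bounded LinkedBlockingQueue: whenever an event needs to be delivered, it is added to the queue. AsyncEventQueue also starts a thread that keeps consuming the queue and hands each event to the actual listeners.
Each AsyncEventQueue corresponds to exactly one thread.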
/**
 * An asynchronous queue for events. All events posted to this queue will be delivered to the child
 * listeners in a separate thread.
 *
 * Delivery will only begin when the `start()` method is called. The `stop()` method should be
 * called when no more events need to be delivered.
 */
private class AsyncEventQueue(
    val name: String,
    conf: SparkConf,
    metrics: LiveListenerBusMetrics,
    bus: LiveListenerBus)
  extends SparkListenerBus
  with Logging {

  import AsyncEventQueue._

  // Cap the capacity of the queue so we get an explicit error (rather than an OOM exception) if
  // it's perpetually being added to more quickly than it's being drained.
  private val eventQueue = new LinkedBlockingQueue[SparkListenerEvent](
    conf.get(LISTENER_BUS_EVENT_QUEUE_CAPACITY))
  // ...
}
What happens when there are too many events and the bounded queue fills up?
Spark drops the event and writes a log message.
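The queue-plus-consumer-thread design and the drop-on-overflow policy can be sketched in a few lines. This is a simplified illustration, not Spark's implementation: BoundedEventQueue, its String event type, the handler parameter and the stop marker are all hypothetical. It relies on the standard behavior of LinkedBlockingQueue.offer, which returns false instead of blocking when the queue is full.

```scala
import java.util.concurrent.LinkedBlockingQueue
import java.util.concurrent.atomic.AtomicLong
import scala.collection.mutable.ArrayBuffer

// Hypothetical, simplified queue; names and the String event type are illustrative.
class BoundedEventQueue(capacity: Int, handler: String => Unit) {
  // A unique String instance, recognized by reference, that tells the consumer to exit.
  private val PoisonPill = new String("__stop__")
  private val queue = new LinkedBlockingQueue[String](capacity)
  private val dropped = new AtomicLong(0L)

  // One dedicated thread drains the queue and hands events to the handler.
  private val dispatchThread = new Thread("demo-dispatcher") {
    setDaemon(true)
    override def run(): Unit = {
      var next = queue.take()
      while (next ne PoisonPill) {
        handler(next)
        next = queue.take()
      }
    }
  }

  def start(): Unit = dispatchThread.start()

  // offer() never blocks: it returns false when the queue is full,
  // in which case the event is dropped, counted, and logged.
  def post(event: String): Boolean = {
    val accepted = queue.offer(event)
    if (!accepted) {
      val total = dropped.incrementAndGet()
      System.err.println(s"Dropping event '$event': queue is full (dropped so far: $total)")
    }
    accepted
  }

  // put() waits for free space, so the stop marker itself is never dropped.
  def stop(): Unit = {
    queue.put(PoisonPill)
    dispatchThread.join()
  }

  def droppedCount: Long = dropped.get()
}

object BoundedEventQueueDemo {
  def main(args: Array[String]): Unit = {
    val delivered = ArrayBuffer.empty[String]
    val q = new BoundedEventQueue(2, e => delivered.synchronized { delivered += e })
    // The consumer is not running yet, so the third event finds the queue full.
    q.post("e1")
    q.post("e2")
    q.post("e3")
    q.start()
    q.stop()
    println(s"delivered=${delivered.toList}, dropped=${q.droppedCount}")
  }
}
```

Dropping keeps the thread that posts events from ever blocking on a slow listener, at the cost of listeners possibly seeing an incomplete event stream.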
LiveListenerBus
Spark has so many listeners that a single thread is not enough, so they are grouped into the following categories:
private[scheduler] val SHARED_QUEUE = "shared"
private[scheduler] val APP_STATUS_QUEUE = "appStatus"
private[scheduler] val EXECUTOR_MANAGEMENT_QUEUE = "executorManagement"
private[scheduler] val EVENT_LOG_QUEUE = "eventLog"
Each category can hold multiple listeners.
/**
 * Asynchronously passes SparkListenerEvents to registered SparkListeners.
 *
 * Until `start()` is called, all posted events are only buffered. Only after this listener bus
 * has started will events be actually propagated to all attached listeners. This listener bus
 * is stopped when `stop()` is called, and it will drop further events after stopping.
 */
private[spark] class LiveListenerBus(conf: SparkConf) {

  import LiveListenerBus._

  private var sparkContext: SparkContext = _

  private val queues = new CopyOnWriteArrayList[AsyncEventQueue]()
  // ...
}
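The grouping idea, several named queues with one thread each and every event fanned out to all of them, might look roughly like the sketch below. NamedQueue and MultiQueueBus are hypothetical names, listeners are plain functions, and a single-thread executor stands in for each queue's dispatch thread. Unlike the bounded queue in Spark, the executor's internal queue here is unbounded, which keeps the example short.

```scala
import java.util.concurrent.{CopyOnWriteArrayList, ExecutorService, Executors, TimeUnit}

// Hypothetical sketch: each named queue owns one thread and its own listener list.
class NamedQueue(val name: String) {
  private val listeners = new CopyOnWriteArrayList[String => Unit]()
  // A single-thread executor stands in for the queue's dispatch thread.
  private val executor: ExecutorService = Executors.newSingleThreadExecutor()

  def addListener(listener: String => Unit): Unit = listeners.add(listener)

  // Delivery to this queue's listeners happens on the queue's own thread.
  def post(event: String): Unit = executor.execute(new Runnable {
    override def run(): Unit = {
      val it = listeners.iterator()
      while (it.hasNext) it.next()(event)
    }
  })

  // Let already-posted events drain, then wait for the thread to finish.
  def stop(): Unit = {
    executor.shutdown()
    executor.awaitTermination(10, TimeUnit.SECONDS)
  }
}

class MultiQueueBus {
  @volatile private var queues = Map.empty[String, NamedQueue]

  // Reuse the queue with this name, or create it on first use.
  def addToQueue(listener: String => Unit, queueName: String): Unit = synchronized {
    val queue = queues.getOrElse(queueName, {
      val created = new NamedQueue(queueName)
      queues = queues + (queueName -> created)
      created
    })
    queue.addListener(listener)
  }

  // Every event is handed to every queue; each queue delivers independently.
  def post(event: String): Unit = queues.values.foreach(_.post(event))

  def stop(): Unit = queues.values.foreach(_.stop())
}

object MultiQueueBusDemo {
  def main(args: Array[String]): Unit = {
    val appStatusSeen = new CopyOnWriteArrayList[String]()
    val eventLogSeen = new CopyOnWriteArrayList[String]()
    val bus = new MultiQueueBus
    bus.addToQueue(e => appStatusSeen.add(e), "appStatus")
    // A deliberately slow listener only delays its own queue.
    bus.addToQueue(e => { Thread.sleep(50); eventLogSeen.add(e) }, "eventLog")
    bus.post("jobStart")
    bus.post("jobEnd")
    bus.stop()
    println(s"appStatus=$appStatusSeen eventLog=$eventLogSeen")
  }
}
```

The point of the split is isolation: a slow listener in one category backs up only its own queue, while listeners in the other categories keep receiving events on their own threads.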
How to use it
- Method 1
- Method 2
/**
 * Registers listeners specified in spark.extraListeners, then starts the listener bus.
 * This should be called after all internal listeners have been registered with the listener bus
 * (e.g. after the web UI and event logging listeners have been registered).
 */
private def setupAndStartListenerBus(): Unit = {
  try {
    conf.get(EXTRA_LISTENERS).foreach { classNames =>
      val listeners = Utils.loadExtensions(classOf[SparkListenerInterface], classNames, conf)
      listeners.foreach { listener =>
        listenerBus.addToSharedQueue(listener)
        logInfo(s"Registered listener ${listener.getClass().getName()}")
      }
    }
  } catch {
    case e: Exception =>
      try {
        stop()
      } finally {
        throw new SparkException(s"Exception when registering SparkListener", e)
      }
  }
  // ... (the listener bus is started after registration; rest of the method omitted)
}
Method 2 should be preferable to Method 1: inside SparkContext the listenerBus has already been started, so a listener added after that point
may miss some events.
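The ordering effect can be shown with a toy bus. The sketch assumes that Method 1 means registering a listener on an already constructed SparkContext (for example through SparkContext.addSparkListener) and that Method 2 means the spark.extraListeners setting handled by setupAndStartListenerBus above; that reading is an assumption, since the two methods are not spelled out here. TinyBus and the event names are hypothetical, and delivery is synchronous to keep the result deterministic.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical synchronous bus, just enough to show the ordering effect.
class TinyBus {
  private var listeners = List.empty[String => Unit]

  def addListener(listener: String => Unit): Unit = {
    listeners = listeners :+ listener
  }

  // Only listeners registered at this moment receive the event.
  def post(event: String): Unit = listeners.foreach(_(event))
}

object LateRegistrationDemo {
  // Returns the events seen by the early listener and by the late listener.
  def run(): (List[String], List[String]) = {
    val bus = new TinyBus
    val early = ArrayBuffer.empty[String]
    val late = ArrayBuffer.empty[String]

    bus.addListener(e => early += e) // registered before any event is delivered
    bus.post("applicationStart")
    bus.post("executorAdded")
    bus.addListener(e => late += e)  // registered after delivery has begun
    bus.post("jobStart")

    (early.toList, late.toList)
  }

  def main(args: Array[String]): Unit = {
    val (early, late) = run()
    println(s"early listener saw: $early")
    println(s"late listener saw:  $late")
  }
}
```

The early listener records all three events, while the late one records only jobStart. A listener that must observe application startup events therefore needs to be registered before the bus begins delivering.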