olap/spark-tungsten：codegen-优快云博客

15721这一章没什么好说的，不再贴课程内容了。codegen和simd在工业界一般只会选一种实现。比如phothon之前用codegen，然后改成了向量化引擎。一般gen的都是weld IR/LLVM IR/当前语言，gen成C++的也要检查是不是有本地预编译版本，要不没法用。因为clickhouse没有codegen，这节课就拿我比较熟悉的spark的tungsten来当例子，tungsten会gen成scala，然后拿janino动态编译。
tungsten主要有两个特色：一个是codegen，另一个是in-heap memory的管理。本文顺便把它的内存管理也分析一下。在jvm堆内自由分配内存，不被free，不受gc影响，还是挺有意思的。

WASG

手写代码的生成过程分为两个步骤：

从父节点到子节点，递归调用 doProduce，生成框架
从子节点到父节点，递归调用 doConsume，向框架填充每一个操作符的运算逻辑

首先，在 Stage 顶端节点也就是 Project 之上，添加 WholeStageCodeGen 节点。WholeStageCodeGen 节点通过调用 doExecute 来触发整个代码生成过程的计算。doExecute 会递归调用子节点的 doProduce 函数，直到遇到 Shuffle Boundary 为止。这里，Shuffle Boundary 指的是 Shuffle 边界，要么是数据源，要么是上一个 Stage 的输出。在叶子节点（也就是 Scan）调用的 doProduce 函数会先把手写代码的框架生成出来。

  override def doExecute(): RDD[InternalRow] = {
   
   
	// 下面这一行将会调用子类的produce完成上述过程。
    val (ctx, cleanedSource) = doCodeGen()
    // try to compile and fallback if it failed
	// 调用janino完成动态编译过程
    val (_, compiledCodeStats) = try {
   
   
      CodeGenerator.compile(cleanedSource)
    } catch {
   
   
      case NonFatal(_) if !Utils.isTesting && conf.codegenFallback =>
        // We should already saw the error message
        logWarning(s"Whole-stage codegen disabled for plan (id=$codegenStageId):\n $treeString")
        return child.execute()
    }

    // Check if compiled code has a too large function
    if (compiledCodeStats.maxMethodCodeSize > conf.hugeMethodLimit) {
   
   
      logInfo(s"Found too long generated codes and JIT optimization might not work: " +
        s"the bytecode size (${compiledCodeStats.maxMethodCodeSize}) is above the limit " +
        s"${conf.hugeMethodLimit}, and the whole-stage codegen was disabled " +
        s"for this plan (id=$codegenStageId). To avoid this, you can raise the limit " +
        s"`${SQLConf.WHOLESTAGE_HUGE_METHOD_LIMIT.key}`:\n$treeString")
      return child.execute()
    }

    val references = ctx.references.toArray

    val durationMs = longMetric("pipelineTime")

    // Even though rdds is an RDD[InternalRow] it may actually be an RDD[ColumnarBatch] with
    // type erasure hiding that. This allows for the input to a code gen stage to be columnar,
    // but the output must be rows.
    val rdds = child.asInstanceOf[CodegenSupport].inputRDDs()
    assert(rdds.size <= 2, "Up to two input RDDs can be supported")
    if (rdds.length == 1) {
   
   
      rdds.head.mapPartitionsWithIndex {
   
    (index, iter) =>
        val (clazz, _) = CodeGenerator.compile(cleanedSource)
        val buffer = clazz.generate(references).asInstanceOf[BufferedRowIterator]
        buffer.init(index, Array(iter))
        new Iterator[InternalRow] {
   
   
          override def hasNext: Boolean = {
   
   
            val v = buffer.hasNext
            if (!v) durationMs += buffer.durationMs()
            v
          }
          override def next: InternalRow = buffer.next()
        }
      }
    } else {
   
   
      // Right now, we support up to two input RDDs.
      rdds.head.zipPartitions(rdds(1)) {
   
    (leftIter, rightIter) =>
        Iterator((leftIter, rightIter))
        // a small hack to obtain the correct partition index
      }.mapPartitionsWithIndex {
   
    (index, zippedIter) =>
        val (leftIter, rightIter) = zippedIter.next()
        val (clazz, _) = CodeGenerator.compile(cleanedSource)
        val buffer = clazz.generate(references).asInstanceOf[BufferedRowIterator]
        buffer.init(index, Array(leftIter, rightIter))
        new Iterator[InternalRow] {
   
   
          override def hasNext: Boolean = {
   
   
            val v = buffer.hasNext
            if (!v) durationMs += buffer.durationMs()
            v
          }
          override def next: InternalRow = buffer.next()
        }
      }
    }
  }

  def doCodeGen(): (CodegenContext, CodeAndComment) = {
   
   
    val startTime = System.nanoTime()
    val ctx = new CodegenContext
    val code = child.asInstanceOf[CodegenSupport].produce(ctx, this)

    // main next function.
    ctx.addNewFunction("processNext",
      s"""
        protected void processNext() throws java.io.IOException {
   
   
          ${
   
   code.trim}
        }
       """, inlineToOuterClass = true)

    val className = generatedClassName