Registering, Using, and Dropping Spark Temporary Views (tempView)

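This article describes how to create, use, and drop temporary views (tempView) in Spark. Temporary views let a Spark script query DataFrames with SQL. They take no extra memory and act as an alias for data already defined in the session. The sections below cover creation, dropping, and a usage example.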

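【Background】

In a Spark program, SQL can be run directly with spark.sql("xxxx"), where spark is the SparkSession (SparkContext itself has no sql method). To query a DataFrame this way, it must first be registered in that session as a temporary view. A temporary view does not copy the data or use additional memory; it can be thought of as a name bound to the DataFrame's existing query plan.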
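
The registration requirement can be sketched as follows, assuming a local SparkSession with the default in-memory catalog. The object name ViewLookupSketch, the helper canQuery, and the view name people_view are hypothetical and chosen only for illustration. The sketch relies on spark.sql raising an AnalysisException when a name cannot be resolved.

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}
import scala.util.{Failure, Success, Try}

object ViewLookupSketch {

  // Hypothetical helper: true if `name` resolves in this session,
  // false if Spark rejects the query during analysis.
  def canQuery(spark: SparkSession, name: String): Boolean =
    Try(spark.sql(s"SELECT * FROM $name")) match {
      case Success(_)                    => true
      case Failure(_: AnalysisException) => false
      case Failure(other)                => throw other
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("view-lookup-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq("1").toDF("value")

    // The DataFrame exists, but no view has been registered under this name yet
    val before = canQuery(spark, "people_view")
    df.createOrReplaceTempView("people_view")
    // The same name is now resolvable by SQL
    val after = canQuery(spark, "people_view")

    println(s"before registration: $before, after registration: $after")
    spark.stop()
  }
}
```

Registering the view only records the name; no job runs until an action such as show() or collect() is called on the query result.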

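【Creating a temporary view】

// Valid until the SparkSession that created it terminates
df.createOrReplaceTempView("tempViewName")

// Valid until the Spark application terminates; query it as global_temp.tempViewName
df.createOrReplaceGlobalTempView("tempViewName")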
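
The difference in scope between the two calls can be sketched as below, assuming a local SparkSession and the default global temp database name global_temp (configurable through spark.sql.globalTempDatabase). The object name TempViewScopeSketch, the helper resolves, and the view names are hypothetical. A second session created with newSession() shares the application but has its own temporary views.

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}
import scala.util.{Failure, Success, Try}

object TempViewScopeSketch {

  // Hypothetical helper: true if `name` resolves in the given session
  def resolves(spark: SparkSession, name: String): Boolean =
    Try(spark.sql(s"SELECT * FROM $name")) match {
      case Success(_)                    => true
      case Failure(_: AnalysisException) => false
      case Failure(other)                => throw other
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("temp-view-scope-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq("1").toDF("value")
    df.createOrReplaceTempView("session_view")      // visible only in `spark`
    df.createOrReplaceGlobalTempView("global_view") // visible in every session of this application

    // A new session in the same application, with separate temporary views
    val other = spark.newSession()

    println(s"session_view in other session: ${resolves(other, "session_view")}")
    println(s"global_temp.global_view in other session: ${resolves(other, "global_temp.global_view")}")

    spark.stop()
  }
}
```

Under these assumptions the session-scoped view should not resolve from the second session, while the global view should, provided it is qualified with global_temp.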

【Dropping a temporary view】

spark.catalog.dropTempView("tempViewName")
spark.catalog.dropGlobalTempView("tempViewName")
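
A short sketch of the drop calls follows, assuming Spark 2.1 or later, where both methods return a Boolean that reports whether a view was actually dropped. The object name DropViewSketch, the helper dropTwice, and the view names are hypothetical. Note that dropGlobalTempView takes the bare view name, without the global_temp prefix.

```scala
import org.apache.spark.sql.SparkSession

object DropViewSketch {

  // Hypothetical helper: drops the same session-scoped view twice
  // and returns the result of each call.
  def dropTwice(spark: SparkSession, name: String): (Boolean, Boolean) = {
    val first  = spark.catalog.dropTempView(name)
    val second = spark.catalog.dropTempView(name)
    (first, second)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("drop-view-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq("1").toDF("value")
    df.createOrReplaceTempView("tmp_view")
    df.createOrReplaceGlobalTempView("tmp_global_view")

    // The second drop finds nothing to remove and reports that instead of throwing
    val (first, second) = dropTwice(spark, "tmp_view")
    println(s"first drop: $first, second drop: $second")

    // Global views are dropped by their bare name
    val globalDropped = spark.catalog.dropGlobalTempView("tmp_global_view")
    println(s"global drop: $globalDropped")

    spark.stop()
  }
}
```

Dropping a view removes only the name. The DataFrame it was created from remains usable through its own variable.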

【Example of using a temporary view】

package high_quality._history

import org.apache.log4j.{Level, Logger}
import org.apache.spark.sql.SparkSession

object test {

  def main(args: Array[String]): Unit = {

    Logger.getRootLogger.setLevel(Level.ERROR)
    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    // Build a DataFrame
    val df = Seq("1").toDF("value")
    // Register it as a temporary view
    df.createOrReplaceTempView("tmp_table")
    // Query the temporary view through Spark SQL
    val ret = spark.sql("SELECT * FROM tmp_table")
    ret.show()

    // Release the session's resources
    spark.stop()
  }
}