How-to: Enable Spark SQL in the CDH Version of Spark

This post describes how to enable Spark SQL for the Spark shipped with CDH 5.4.1. The main steps are updating the Hive version to 0.13.1a, resolving compile issues, and repackaging Spark. These steps ensure Spark SQL runs correctly on CDH 5.4.1.
Cloudera's Spark does not support Spark SQL. Here I will take the CDH 5.4.1 Spark as an example of how to enable Spark SQL.
The overall steps are: update the Hive version, resolve compile issues, and update the Spark package.
  1. Update the Hive version: CDH 5.4.1 uses Hive 1.1, while Apache Spark at this point only supports Hive 0.12 and Hive 0.13. Update the Hive version to 0.13.1a:
    pom.xml:
    -    <hive.group>org.apache.hive</hive.group>
    +    <hive.group>org.spark-project.hive</hive.group>
    -    <hive.version>${cdh.hive.version}</hive.version>
    +    <hive.version>0.13.1a</hive.version>
    -    <hive.version.short>1.1.0</hive.version.short>
    +    <hive.version.short>0.13.1</hive.version.short>
         <profile>
           <id>hive-0.13.1</id>
           <properties>
    +        <hive.group>org.spark-project.hive</hive.group>
             <hive.version>0.13.1a</hive.version>
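After these edits, the Hive-related properties in the root pom.xml should look roughly like the following (a sketch; unrelated properties are omitted):

```xml
<!-- Root pom.xml after the version switch (sketch; other properties omitted). -->
<properties>
  <hive.group>org.spark-project.hive</hive.group>
  <hive.version>0.13.1a</hive.version>
  <hive.version.short>0.13.1</hive.version.short>
</properties>
```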
  2. Resolve compile issues: 
    pom.xml:
    +      <dependency>
    +        <groupId>jline</groupId>
    +        <artifactId>jline</artifactId>
    +        <version>0.9.94</version>
    +      </dependency>
    sql/hive-thriftserver/pom.xml
    +    <dependency>
    +      <groupId>jline</groupId>
    +      <artifactId>jline</artifactId>
    +      <version>0.9.94</version>
    +    </dependency>
    sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLSessionManager.scala
    -    // SessionManager.init() initializes the log manager but this method never actually calls
    -    // super.init(), so fix that here.
    -    val logManager = new LogManager()
    -    setSuperField(this, "logManager", logManager)
    -    addService(logManager)
    -
    sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim13.scala 
    -  import org.apache.hive.com.esotericsoftware.kryo.Kryo
    +  import com.esotericsoftware.kryo.Kryo
  3. Update the Spark package:
    - Repackage Spark: ./make-distribution.sh --tgz -Pyarn -Phive -Phive-thriftserver
    - Copy hive-site.xml to ${SPARK_HOME}/conf
    - Update ${SPARK_HOME}/lib with the newly built jars
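The last two steps can be sketched as a small script. `deploy_spark_sql` is a hypothetical helper, not part of Spark; the unpacked distribution path, the hive-site.xml location, and the assembly jar name are assumptions to adjust for your cluster layout:

```shell
# Sketch of the deploy step. Assumes the tarball from make-distribution.sh
# has already been unpacked. deploy_spark_sql is a hypothetical helper.
deploy_spark_sql() {
  dist="$1"        # unpacked make-distribution.sh output
  hive_site="$2"   # e.g. /etc/hive/conf/hive-site.xml (assumption)
  spark_home="$3"  # the CDH Spark installation to update
  # Spark SQL reads the Hive metastore settings from hive-site.xml.
  cp "$hive_site" "$spark_home/conf/hive-site.xml"
  # Replace the stock CDH assembly jar with the Hive-enabled build.
  cp "$dist"/lib/spark-assembly-*.jar "$spark_home/lib/"
}
```

Restart any long-running Spark services afterwards so they pick up the new assembly jar.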