Ensure the Scala version matches
Spark 1.4 goes with Scala 2.10
Spark 1.6 goes with Scala 2.10
Spark 2.0 goes with Scala 2.11
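To confirm which Scala version your Spark build actually uses, you can check it from inside spark-shell; the value shown in the comment is just what a Scala 2.10 build would report.
scala.util.Properties.versionString   // e.g. "version 2.10.5" on a Scala 2.10 build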
Check the lib directory
Hive support needs three JARs: datanucleus-api-jdo-3.2.6.jar, datanucleus-core-3.2.10.jar, and datanucleus-rdbms-3.2.9.jar. If they are already under Spark's lib directory there is no need to rebuild; if you do need to rebuild, the source can be downloaded from: https://github.com/apache/spark/releases/tag/v1.6.2
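A quick way to verify is to list the jars, and, if they are missing, rebuild Spark with the Hive profiles enabled. The paths and flags below are only a sketch based on this setup and the Spark 1.6 build documentation, not the only valid combination.
# check whether the datanucleus jars shipped with this Spark build
ls /appl/spark-1.6.2/lib/datanucleus-*.jar
# rebuild a Hive-enabled distribution from the downloaded Spark 1.6.2 source
./make-distribution.sh --name hadoop2.7 --tgz -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.0 -Phive -Phive-thriftserver -DskipTests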
Copy the Hive/HDFS configuration files
cd /appl/hive-1.2.1/conf
cp hive-site.xml /appl/spark-1.6.2/conf/
cd /appl/hadoop-2.7.0/etc/hadoop
cp core-site.xml /appl/spark-1.6.2/conf/
cp hdfs-site.xml /appl/spark-1.6.2/conf/
(The datanucleus JARs under the lib directory and hive-site.xml under the conf/ directory need to be available on the driver and on all executors launched by the YARN cluster.)
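When running on YARN, one common way to satisfy that requirement (a sketch assuming the paths used above) is to ship the datanucleus jars with --jars and hive-site.xml with --files:
./bin/spark-shell --master yarn-client \
  --jars /appl/spark-1.6.2/lib/datanucleus-api-jdo-3.2.6.jar,/appl/spark-1.6.2/lib/datanucleus-core-3.2.10.jar,/appl/spark-1.6.2/lib/datanucleus-rdbms-3.2.9.jar \
  --files /appl/spark-1.6.2/conf/hive-site.xml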
Launch spark-shell
./bin/spark-shell --jars /appl/hive-1.2.1/lib/mysql-connector-java-5.1.30-bin.jar
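The MySQL connector is needed so the HiveContext can reach a MySQL-backed metastore. If you prefer not to pass it on every launch, one alternative (assuming the same jar path) is to put it on the driver and executor classpath in conf/spark-defaults.conf:
spark.driver.extraClassPath   /appl/hive-1.2.1/lib/mysql-connector-java-5.1.30-bin.jar
spark.executor.extraClassPath /appl/hive-1.2.1/lib/mysql-connector-java-5.1.30-bin.jar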
Test
import org.apache.spark.sql.hive.HiveContext
val sqlContext = new HiveContext(sc)
sqlContext.sql("create table if not exists test1 (id int)")

This article showed how to configure Spark to work with Hive: matching the Scala version, confirming the required Hive JARs, copying the hive/hdfs configuration files into the Spark conf directory, and launching spark-shell to test creating and loading a Hive table.