First, be clear that when Spark SQL is linked to Hive there is more than one way to interact:
Spark SQL and Thrift.
【Thrift first】
In hive-site.xml under $SPARK_HOME/conf, set the following parameters:
<configuration>
<property>
<name>hive.metastore.uris</name>
<value>thrift://mach40:9083</value>
<description>Thrift uri for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
<name>hive.server2.thrift.min.worker.threads</name>
<value>5</value>
<description>Minimum number of Thrift worker threads</description>
</property>
<property>
<name>hive.server2.thrift.max.worker.threads</name>
<value>500</value>
<description>Maximum number of Thrift worker threads</description>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
<description>Port number of HiveServer2 Thrift interface. Can be overridden by setting $HIVE_SERVER2_THRIFT_PORT</description>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>mach40</value>
<description>Bind host on which to run the HiveServer2 Thrift interface. Can be overridden by setting $HIVE_SERVER2_THRIFT_BIND_HOST</description>
</property>
</configuration>
Start it on the Hive side:
[root@slaver3 conf]# hive --service hiveserver2 &
[root@slaver3 conf]# jobs
[1]+ Running    hive --service hiveserver2 &
[root@slaver3 conf]# jps
6064 RunJar
Start beeline on the Spark client side:
[root@slaver3 conf]# beeline -u jdbc:hive2://slaver3:10000/default -n root -p root
The following output appears:
Connecting to jdbc:hive2://slaver3:10000/default
18/04/24 09:35:50 INFO jdbc.Utils: Supplied authorities: slaver3:10000
18/04/24 09:35:50 INFO jdbc.Utils: Resolved authority: slaver3:10000
18/04/24 09:35:50 INFO jdbc.HiveConnection: Will try to open client transport with JDBC Uri: jdbc:hive2://slaver3:10000/default
Connected to: Apache Hive (version 2.0.1)
Driver: Spark Project Core (version 1.6.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.6.0 by Apache Hive
0: jdbc:hive2://slaver3:10000/default>
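beeline is just one JDBC client; any program can connect to HiveServer2 over JDBC the same way. A minimal Scala sketch, assuming the hive-jdbc driver jar is on the classpath (the URL, user, and password match the beeline command above):

import java.sql.DriverManager

// Register the HiveServer2 JDBC driver (shipped in the hive-jdbc jar).
Class.forName("org.apache.hive.jdbc.HiveDriver")

// Same connection string as the beeline session above.
val conn = DriverManager.getConnection(
  "jdbc:hive2://slaver3:10000/default", "root", "root")
try {
  val stmt = conn.createStatement()
  // Any HiveQL works here; SHOW TABLES is a cheap connectivity check.
  val rs = stmt.executeQuery("SHOW TABLES")
  while (rs.next()) println(rs.getString(1))
} finally {
  conn.close()
}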
【Second: Spark SQL】
First, some background:
Spark accesses the Hive metastore, i.e. Hive's metadata, to carry out its operations.
However, Spark currently only supports Hive up to version 1.2 out of the box; once the Hive version is too new, the two cannot be linked.
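For reference, Spark does expose settings (spark.sql.hive.metastore.version and spark.sql.hive.metastore.jars) that control which metastore version it speaks. A minimal Scala sketch for Spark 1.6 with HiveContext; the version string and jars value are placeholders for your own installation, not a tested recipe:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// These properties must be set before the HiveContext is created.
val conf = new SparkConf()
  .setAppName("MetastoreVersionDemo")
  .set("spark.sql.hive.metastore.version", "1.2.1") // metastore version to talk to
  .set("spark.sql.hive.metastore.jars", "builtin")  // or a classpath with matching Hive jars
val sc = new SparkContext(conf)
val hiveContext = new HiveContext(sc)

// Quick connectivity check against the metastore.
hiveContext.sql("SHOW DATABASES").show()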
1. Installation steps for both are omitted
(see other posts on this blog).
2. hive-site.xml
scp hive-site.xml $SPARK_HOME/conf
Contents of hive-site.xml
(if the existing content is too cluttered, just clear it all out):
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://mach40:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>hive.querylog.location</name>
<value>/home/hive1.22/iotmp</value>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive</value>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/home/hive1.22/iotmp</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/home/hive1.22/iotmp</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<!-- The property below must be added to hive-site.xml under spark/conf on every client that needs access; do not configure it on the server side. -->
<property>
<name>hive.metastore.uris</name>
<value>thrift://mach40:9083</value>
<description>Thrift uri for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
<name>hive.server2.thrift.min.worker.threads</name>
<value>5</value>
<description>Minimum number of Thrift worker threads</description>
</property>
<property>
<name>hive.server2.thrift.max.worker.threads</name>
<value>500</value>
<description>Maximum number of Thrift worker threads</description>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
<description>Port number of HiveServer2 Thrift interface. Can be overridden by setting $HIVE_SERVER2_THRIFT_PORT</description>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>mach40</value>
<description>Bind host on which to run the HiveServer2 Thrift interface. Can be overridden by setting $HIVE_SERVER2_THRIFT_BIND_HOST</description>
</property>
</configuration>
Modify Hadoop's etc/hadoop/core-site.xml:
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
<description>Allow the superuser root to impersonate members of any group</description>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
<description>Allow the superuser root to connect from any host to impersonate a user</description>
</property>
After changing core-site.xml, restart Hadoop (or refresh the proxy-user configuration) so the change takes effect.
3. Driver jar:
scp mysql-connector-java-5.1.41-bin.jar /home/spark2.10/jars/
If you are on another Spark version, put the driver jar above in lib, or add its path to the classpath in spark-env.sh; you may also need the following jars from Hive:
/home/hive2.01/lib/datanucleus-api-jdo-4.2.1.jar:
/home/hive2.01/lib/datanucleus-core-4.1.6.jar:
/home/hive2.01/lib/datanucleus-rdbms-4.1.7.jar
About an exception:
In Spark 2.0+ the jars live under $SPARK_HOME/jars, so starting Hive may report an error:
no such file: spark2.0/lib/spark-assembly......jar
Fix:
vim bin/hive
change spark2.0/lib/spark-assembly......jar to spark2.0/jars/*.jar
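Once the configuration is in place, you can verify the Spark-to-Hive link from a Spark program. A minimal Scala sketch for Spark 2.x (in Spark 1.6 use HiveContext instead; the table name below is hypothetical):

import org.apache.spark.sql.SparkSession

// enableHiveSupport() makes Spark SQL read table metadata from the
// Hive metastore configured in $SPARK_HOME/conf/hive-site.xml.
val spark = SparkSession.builder()
  .appName("SparkHiveCheck")
  .enableHiveSupport()
  .getOrCreate()

// If the metastore link works, Hive databases and tables are visible:
spark.sql("SHOW DATABASES").show()
spark.sql("SELECT * FROM default.my_table LIMIT 10").show() // my_table is hypothetical
spark.stop()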