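Hive on Tez

Hive execution modes

  1. hive on mapreduce: offline (batch) computation (default)
  2. hive on tez: a computation framework on top of YARN that supports DAG jobs
  3. hive on spark: in-memory computation

Hive on Tez

Tez is a data processing framework built on top of YARN that supports complex DAG jobs. It was open-sourced by Hortonworks. It breaks the MapReduce process into several sub-steps and can combine multiple MapReduce jobs into one larger DAG job. This avoids writing intermediate files between MapReduce jobs, and arranging the sub-steps sensibly gives a substantial performance gain over the equivalent MapReduce jobs.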

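Installing Tez

Tez can be installed from source or from the binary package; this guide uses the binary package.

hadoop version: 2.9.1

hive version: 2.1.1

tez version: 0.9.0

Prerequisite: the Hadoop environment is already set up, including YARN (Tez runs on YARN) and Hive.

Download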

wget http://mirror.bit.edu.cn/apache/tez/0.9.0/apache-tez-0.9.0-bin.tar.gz

Install

# tar zxvf apache-tez-0.9.0-bin.tar.gz
# mv apache-tez-0.9.0-bin/ tez-0.9.0
# hdfs dfs -mkdir -p /tez-0.9.0
# cd /tez-0.9.0/
# hdfs dfs -put share/tez.tar.gz /tez-0.9.0

Configure Tez

# cd /data1/hadoop/hadoop/etc/hadoop/
# cat tez-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/tez-0.9.0/tez.tar.gz</value>
  </property>
  <property>
    <name>tez.container.max.java.heap.fraction</name>
    <value>0.2</value>
  </property>
</configuration>

Reference: /tez-0.9.0/conf/tez-default-template.xml
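
The value of tez.lib.uris has to point at the HDFS location where tez.tar.gz was uploaded, so it is worth printing it and comparing it with the upload path. A minimal sketch, assuming a POSIX shell with awk; get_prop is a hypothetical helper written for this article, not part of Hadoop or Tez:

```shell
# get_prop FILE NAME: print the <value> of property NAME
# from a Hadoop-style XML file (hypothetical helper).
get_prop() {
  awk -v name="$2" '
    # Remember that the wanted <name> element has been seen.
    index($0, "<name>" name "</name>") { found = 1 }
    # Print the first <value> on or after that line, then stop.
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}

# Example (paths as used in this article):
#   get_prop /data1/hadoop/hadoop/etc/hadoop/tez-site.xml tez.lib.uris
#   hdfs dfs -ls /tez-0.9.0
```

The printed path, with ${fs.defaultFS} removed, should be the file that the hdfs dfs -ls command lists.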

Environment variables (~/.bashrc)

Add the following:
export TEZ_CONF_DIR=$HADOOP_CONF_DIR

export TEZ_JARS=/tez-0.9.0/*:/tez-0.9.0/lib/*

export HADOOP_CLASSPATH=$TEZ_CONF_DIR:$TEZ_JARS:$HADOOP_CLASSPATH

Run "source ~/.bashrc" to apply the environment variables.
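
After sourcing the file, it can help to confirm that the Tez entries really ended up in HADOOP_CLASSPATH. A minimal sketch, assuming a POSIX shell; classpath_has is a hypothetical helper, not a Hadoop command:

```shell
# classpath_has CLASSPATH ENTRY: succeed if ENTRY is one of the
# colon-separated elements of CLASSPATH (hypothetical helper).
classpath_has() {
  case ":$1:" in
    *":$2:"*) return 0 ;;   # ENTRY matched literally, wildcards are not expanded
    *) return 1 ;;
  esac
}

# Example:
#   classpath_has "$HADOOP_CLASSPATH" '/tez-0.9.0/lib/*' && echo "Tez jars are on the classpath"
```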

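Hadoop version compatibility

The lib directory of the Tez 0.9.0 binary package contains the Hadoop 2.7.0 MapReduce client jars. Replace them with the 2.9.1 jars from the cluster: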

[root@hadoop01 ~]# cd /tez-0.9.0/lib
[root@hadoop01 lib]# rm -rf hadoop-mapreduce-client-core-2.7.0.jar hadoop-mapreduce-client-common-2.7.0.jar
[root@hadoop01 lib]# cp /data1/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.9.1.jar /tez-0.9.0/lib/
[root@hadoop01 lib]# cp /data1/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.9.1.jar /tez-0.9.0/lib/
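
To confirm that no MapReduce client jar of another version is left behind, the lib directory can be scanned by file name. A minimal sketch, assuming a POSIX shell; mismatched_jars is a hypothetical helper, and it only looks at file names, not at jar contents:

```shell
# mismatched_jars DIR VERSION: list hadoop-mapreduce-client-*.jar files
# in DIR whose name does not end in -VERSION.jar (hypothetical helper).
mismatched_jars() {
  for jar in "$1"/hadoop-mapreduce-client-*.jar; do
    [ -e "$jar" ] || continue        # the glob matched nothing
    case "$jar" in
      *-"$2".jar) ;;                 # version matches, nothing to report
      *) basename "$jar" ;;
    esac
  done
}

# Example: mismatched_jars /tez-0.9.0/lib 2.9.1
```

An empty result means every hadoop-mapreduce-client jar in the directory carries the expected version suffix.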

Start Hive

# hive
Set the execution engine to tez (the default is MapReduce):
hive> SET hive.execution.engine=tez;
Alternatively, edit Hive's configuration file hive-site.xml and add the following:

<property>
<name>hive.user.install.directory</name>
<value>/user/</value>
</property>
<property>
<name>hive.execution.engine</name>    <!-- use tez as the default engine -->
<value>tez</value>
</property>

Test data

Create a table
hive> create table user_info(user_id bigint, firstname string, lastname string, count string);
Insert data
hive> insert into user_info values(1,'dennis','hu','CN'),(2,'Json','Lv','Jpn'),(3,'Mike','Lu','USA');

Query ID = root_20190618043047_bfc41253-60f9-469d-b6a9-c26c93a92e82
Total jobs = 1
Launching Job 1 out of 1
Tez session was closed. Reopening...
Session re-established.


Status: Running (Executing on YARN cluster with App id application_1560826244680_0015)

----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container SUCCEEDED 1 1 0 0 0 0
----------------------------------------------------------------------------------------------
VERTICES: 01/01 [==========================>>] 100% ELAPSED TIME: 4.55 s
----------------------------------------------------------------------------------------------
Loading data to table default.user_info
OK
Time taken: 9.488 seconds

Query

hive> select count(1) from user_info;
Query ID = root_20190618043342_5f83efb4-39bf-4d67-bac4-d67205086ae7
Total jobs = 1
Launching Job 1 out of 1


Status: Running (Executing on YARN cluster with App id application_1560826244680_0015)

----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container SUCCEEDED 1 1 0 0 0 0
Reducer 2 ...... container SUCCEEDED 1 1 0 0 0 0
----------------------------------------------------------------------------------------------
VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 4.46 s
----------------------------------------------------------------------------------------------
OK
9
Time taken: 4.979 seconds, Fetched: 1 row(s)
hive> select count(1) from user_info;
Query ID = root_20190618043349_ecee5657-7c95-43ab-80e9-101dd36d6fc7
Total jobs = 1
Launching Job 1 out of 1


Status: Running (Executing on YARN cluster with App id application_1560826244680_0015)

----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container SUCCEEDED 1 1 0 0 0 0
Reducer 2 ...... container SUCCEEDED 1 1 0 0 0 0
----------------------------------------------------------------------------------------------
VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 0.72 s
----------------------------------------------------------------------------------------------
OK
9
Time taken: 1.156 seconds, Fetched: 1 row(s)

Check the YARN web UI

The YARN web UI shows that the application type has changed to TEZ.

Configure tez-ui

Edit tez-site.xml

Add the following:

<property>
   <name>tez.history.logging.service.class</name>
   <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value> 
 </property> 

 <property> 
  <description>URL for where the Tez UI is hosted</description> 
  <name>tez.tez-ui.history-url.base</name> 
  <value>http://master:9999/tez-ui/</value>     <!-- address where tez-ui is served -->
 </property> 

<property> 
    <name>tez.runtime.convert.user-payload.to.history-text</name> 
    <value>true</value> 
</property> 
<property> 
    <name>tez.task.generate.counters.per.io</name> 
    <value>true</value> 
</property> 

Edit yarn-site.xml

Add the following:

<property>
    <name>yarn.timeline-service.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.timeline-service.hostname</name>
    <value>master</value>
</property>
<property>
    <name>yarn.timeline-service.http-cross-origin.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.timeline-service.generic-application-history.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
    <value>true</value>
</property>
<property>
  <name>yarn.timeline-service.address</name>
  <value>${yarn.timeline-service.hostname}:10200</value>
</property>

<property>
  <name>yarn.timeline-service.webapp.address</name>
  <value>${yarn.timeline-service.hostname}:8188</value>
</property>

<property>
  <name>yarn.timeline-service.webapp.https.address</name>
  <value>${yarn.timeline-service.hostname}:8190</value>
</property>

<property>
  <description>Handler thread count to serve the client RPC requests.</description>
  <name>yarn.timeline-service.handler-thread-count</name>
  <value>10</value>
</property>
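<property>
  <!-- This property is already set to true earlier in this file; when a name is repeated, Hadoop uses the last value, so false takes effect. -->
  <name>yarn.timeline-service.generic-application-history.enabled</name>
  <value>false</value>
</property>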

<property>
  <name>yarn.timeline-service.generic-application-history.store-class</name>
  <value>org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore</value>
</property>

Copy files

Copy tez-site.xml and yarn-site.xml to the other machines.

Install Tomcat

Download: https://tomcat.apache.org/download-80.cgi

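1. Delete the files under Tomcat's webapps directory, then copy tez-ui-0.9.0.war from the tez-0.9.0 directory into webapps:

# cp /tez-0.9.0/tez-ui-0.9.0.war /data1/apache-tomcat-8.5.42/webapps/tez-ui.war

The name of the copied file must match the path used in tez-site.xml, that is, the URL configured in tez.tez-ui.history-url.base.

2. Edit Tomcat's configuration file server.xml and change port 8080 to 9999, again matching the configuration above.

3. Because the configuration has changed, restart the Hadoop cluster and Hive, and also start the timelineserver service: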

./stop-all.sh     # stop the cluster (HDFS and YARN)
./start-dfs.sh
./start-yarn.sh
./mr-jobhistory-daemon.sh start historyserver
./yarn-daemon.sh start timelineserver       # can only be started after the HDFS cluster is up
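
The port change in step 2 can also be scripted instead of edited by hand. A minimal sketch, assuming a POSIX shell with sed and that server.xml sits in Tomcat's conf directory; set_http_port is a hypothetical helper, not a Tomcat tool:

```shell
# set_http_port FILE OLD NEW: rewrite port="OLD" to port="NEW" in FILE,
# keeping the original as FILE.bak (hypothetical helper).
set_http_port() {
  cp "$1" "$1.bak" &&
  sed "s/port=\"$2\"/port=\"$3\"/" "$1.bak" > "$1"
}

# Example (Tomcat path as used in this article):
#   set_http_port /data1/apache-tomcat-8.5.42/conf/server.xml 8080 9999
```

Only attributes written exactly as port="8080" are changed, so other ports in the file, such as the shutdown port, are left alone. Tomcat has to be restarted for the new port to take effect.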


root@master:/data1/apache-tomcat-8.5.42/webapps# jps
101719 Bootstrap     # Tomcat process
2551 HMaster
99878 DFSZKFailoverController
102662 JobHistoryServer
94729 RunJar
103561 Jps
94392 RunJar
99275 NameNode
99610 JournalNode
588 QuorumPeerMain
102271 ResourceManager
102798 ApplicationHistoryServer    # this is the timelineserver service
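
Besides looking for the process in jps, the Timeline Server can be queried over HTTP on the webapp port configured in yarn-site.xml (8188 here). A minimal sketch, assuming a POSIX shell and that curl is installed; timeline_url is a hypothetical helper that only builds the URL:

```shell
# timeline_url HOST [PORT]: print the base URL of the Timeline Server
# REST API; PORT defaults to 8188 (hypothetical helper).
timeline_url() {
  printf 'http://%s:%s/ws/v1/timeline/\n' "$1" "${2:-8188}"
}

# Example (host name as used in this article):
#   curl -s "$(timeline_url master)"
```

If the request does not return a response, the tez-ui page will not be able to load DAG data either, since it reads from the same service.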

Start Hive

nohup hive --service metastore &
nohup hive --service hiveserver2 &

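Now run a job in Hive, switching the execution engine to tez first. Then open http://tez-host:9999/tez-ui, where tez-host is the host on which timelineserver is installed.

If the page cannot be reached, the likely cause is a firewall or a configuration error.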

Reference: https://blog.youkuaiyun.com/duguyiren3476/article/details/46349177

Reference: https://blog.youkuaiyun.com/gobitan/article/details/85109644

Reference: http://tez.apache.org/install.html (official site)

Reference: https://www.58jb.com/html/114.html

Reposted from: https://www.cnblogs.com/yjt1993/p/11044578.html
