Creating a Spark2 Oozie Workflow with Hue

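This article walks through integrating Spark2 with Oozie: creating a spark2 directory in the Oozie share library, adding the Spark2 JARs, setting permissions, updating the share library, and creating a Spark2 Oozie workflow in Hue.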


1. Adding Spark2 to the Oozie share library

1. Find the HDFS directory of the current Oozie share-lib (reported as sharelibDirNew)

oozie admin -oozie http://lefincluster-rt1:11000/oozie -sharelibupdate
[ShareLib update status]
sharelibDirOld = hdfs://nameservice1/user/oozie/share/lib/lib_20180605143536
host = http://lefincluster-rt1:11000/oozie
sharelibDirNew = hdfs://nameservice1/user/oozie/share/lib/lib_20180605143536
status = Successful
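The later steps all reuse the share-lib path printed above, so it can help to capture it in a variable rather than copy it by hand. The following is a minimal sketch, not part of the original procedure: the function name `sharelib_dir` and the variable `SHARELIB_DIR` are made up for illustration, and the sample text is the output shown above.

```shell
#!/usr/bin/env bash
# Extract the active share-lib directory from `oozie admin -sharelibupdate` output.
# Reads the command output on stdin and prints the HDFS path with the
# hdfs://<nameservice> prefix removed.
sharelib_dir() {
  awk -F' = ' '/sharelibDirNew/ {print $2}' | sed -E 's#^hdfs://[^/]+##'
}

# Sample output copied from the run above. On a real cluster, pipe in
# `oozie admin -oozie http://lefincluster-rt1:11000/oozie -sharelibupdate` instead.
sample_output='[ShareLib update status]
sharelibDirOld = hdfs://nameservice1/user/oozie/share/lib/lib_20180605143536
host = http://lefincluster-rt1:11000/oozie
sharelibDirNew = hdfs://nameservice1/user/oozie/share/lib/lib_20180605143536
status = Successful'

SHARELIB_DIR=$(printf '%s\n' "$sample_output" | sharelib_dir)
echo "$SHARELIB_DIR"
```

With the sample text this prints the /user/oozie/share/lib/lib_20180605143536 path used in the following steps.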

2. Create a spark2 directory under /user/oozie/share/lib/lib_20180605143536

sudo -u oozie hdfs dfs -mkdir /user/oozie/share/lib/lib_20180605143536/spark2

3. Add the Spark2 JARs and oozie-sharelib-spark*.jar to the spark2 directory

[root@lefincluster-rt1 jars]# pwd
/opt/cloudera/parcels/SPARK2/lib/spark2/jars
sudo -u oozie hdfs dfs -put *.jar /user/oozie/share/lib/lib_20180605143536/spark2

[root@lefincluster-rt1 spark]# pwd
/opt/cloudera/parcels/CDH/lib/oozie/oozie-sharelib-yarn/lib/spark
sudo -u oozie hdfs dfs -put oozie-sharelib-spark*.jar /user/oozie/share/lib/lib_20180605143536/spark2

4. Set the directory permissions

sudo -u hdfs hdfs dfs -chmod -R 775 /user/oozie/share/lib/lib_20180605143536/spark2

5. Update the Oozie share-lib

[root@lefincluster-rt1 spark]# oozie admin -oozie http://lefincluster-rt1:11000/oozie -sharelibupdate
[ShareLib update status]
sharelibDirOld = hdfs://nameservice1/user/oozie/share/lib/lib_20180605143536
host = http://lefincluster-rt1:11000/oozie
sharelibDirNew = hdfs://nameservice1/user/oozie/share/lib/lib_20180605143536
status = Successful
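Steps 2 through 5 can be collected into one script. The sketch below uses only the commands and paths shown above. The `DRY_RUN` switch and the `run` and `add_spark2_sharelib` helpers are hypothetical names introduced here, so that the commands can be reviewed before anything touches HDFS.

```shell
#!/usr/bin/env bash
# Values taken from the steps above; adjust them for your cluster.
SHARELIB_DIR=/user/oozie/share/lib/lib_20180605143536
SPARK2_JARS=/opt/cloudera/parcels/SPARK2/lib/spark2/jars
OOZIE_SPARK_LIB=/opt/cloudera/parcels/CDH/lib/oozie/oozie-sharelib-yarn/lib/spark
OOZIE_URL=http://lefincluster-rt1:11000/oozie

# Hypothetical switch for this sketch: 1 only prints each command, 0 executes it.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Chain the steps with && so that a failed step stops the sequence.
add_spark2_sharelib() {
  run sudo -u oozie hdfs dfs -mkdir "$SHARELIB_DIR/spark2" &&
  run sudo -u oozie hdfs dfs -put "$SPARK2_JARS"/*.jar "$SHARELIB_DIR/spark2" &&
  run sudo -u oozie hdfs dfs -put "$OOZIE_SPARK_LIB"/oozie-sharelib-spark*.jar "$SHARELIB_DIR/spark2" &&
  run sudo -u hdfs hdfs dfs -chmod -R 775 "$SHARELIB_DIR/spark2" &&
  run oozie admin -oozie "$OOZIE_URL" -sharelibupdate
}

add_spark2_sharelib
```

By default the script only prints the five commands. Running it with `DRY_RUN=0` on a cluster node would execute them in order.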

6. Confirm that spark2 has been added to the share library

[root@lefincluster-rt1 spark]# oozie admin -oozie http://lefincluster-rt1:11000/oozie -shareliblist
[Available ShareLib]
hive
spark2
distcp
mapreduce-streaming
spark
oozie
hcatalog
hive2
sqoop
pig
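The check above can also be scripted. This is a small sketch, assuming the list is printed one name per line as shown. `has_sharelib` is a hypothetical helper, and it matches whole lines so that `spark` is not mistaken for `spark2`.

```shell
#!/usr/bin/env bash
# has_sharelib NAME: read `oozie admin -shareliblist` output on stdin and
# succeed only if NAME appears as a complete line.
has_sharelib() {
  # Trim surrounding whitespace, then require an exact fixed-string line match.
  sed -E 's/^[[:space:]]+|[[:space:]]+$//g' | grep -qxF -- "$1"
}

# Sample list copied from the output above. On a real cluster, pipe in
# `oozie admin -oozie http://lefincluster-rt1:11000/oozie -shareliblist` instead.
sample_list='[Available ShareLib]
hive
spark2
distcp
mapreduce-streaming
spark
oozie
hcatalog
hive2
sqoop
pig'

if printf '%s\n' "$sample_list" | has_sharelib spark2; then
  echo "spark2 is registered in the share library"
else
  echo "spark2 is missing from the share library"
fi
```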

2. Creating the Spark2 Oozie workflow

1. Log in to Hue and create an Oozie workflow


2. Open the workspace



Click lib.


From the command line, upload the example JAR that ships with Spark2 to the /user/hue/oozie/workspaces/hue-oozie-1528256085.53/lib directory.

[root@lefincluster-rt1 jars]# pwd
/opt/cloudera/parcels/SPARK2/lib/spark2/examples/jars
sudo -u hdfs hdfs dfs -put spark-examples_2.11-2.1.0.cloudera2.jar /user/hue/oozie/workspaces/hue-oozie-1528256085.53/lib

3. Add a Spark2 action





Set the action to use Spark2; otherwise Spark1 is used by default.
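The screenshot for this setting is not preserved. In Oozie, the share-lib directory used by a Spark action is normally selected with the `oozie.action.sharelib.for.spark` property, so the value here would be `spark2`, matching the directory created in part 1. In Hue this would presumably be entered as a property on the Spark action. For a workflow submitted from the command line, the same setting would go in job.properties. The sketch below writes only that fragment: `PROPS_FILE` is a hypothetical location, and a real job.properties would also need entries such as the workflow application path.

```shell
#!/usr/bin/env bash
# Write a job.properties fragment that points Spark actions at the spark2 share-lib.
# PROPS_FILE is a hypothetical location; it defaults to a temporary file here.
PROPS_FILE=${PROPS_FILE:-$(mktemp)}

cat > "$PROPS_FILE" <<'EOF'
# Load action libraries from the Oozie share-lib on HDFS.
oozie.use.system.libpath=true
# Use the spark2 share-lib directory instead of the default spark one.
oozie.action.sharelib.for.spark=spark2
EOF

cat "$PROPS_FILE"
```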



Once the configuration is complete, click Save.

4. After saving, click Run to check that the workflow works

The run succeeds.

