1. Start HDFS
https://blog.youkuaiyun.com/ssllkkyyaa/article/details/86735817
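The linked post covers the details; as a minimal sketch, assuming Hadoop's sbin scripts are on the PATH and the HA nameservice is mycluster as used below:
start-dfs.sh             # start the NameNode/DataNode daemons defined in the cluster config
jps                      # check that the HDFS daemons are running
hdfs dfsadmin -report    # confirm the DataNodes have registered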
2. Start Hive
https://blog.youkuaiyun.com/ssllkkyyaa/article/details/86527365
Start Hive on s200:
$HIVE_HOME/bin/hive
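A quick sanity check that Hive starts and can reach its metastore (a sketch; "show databases" should list at least the default database):
$HIVE_HOME/bin/hive -e "show databases;"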
3. Start Spark
https://blog.youkuaiyun.com/ssllkkyyaa/article/details/89703266
On s200:
start-all.sh
------------------
start-master.sh                  //RPC port 7077
start-slave.sh spark://s200:7077 //worker registers with the master (the linked post uses s201; this cluster's master is s200, matching the spark-shell command below)
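To verify the Spark cluster is up (a sketch, assuming the default master web UI port 8080 and s200 as the master, as in the spark-shell command further down):
jps                      # expect Master on s200 and Worker on each worker node
curl http://s200:8080    # the master web UI should list the registered workers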
------ Create a test file and put it on DFS
touch test.txt
vi test.txt
hello world
hello world1
hello world2
hello world3
hello world4
------
hadoop fs -ls -R /
hadoop fs -mkdir -p /mycluster/user/centos
hadoop fs -put test.txt /mycluster/user/centos
hadoop fs -chmod 777 /mycluster/user/centos/test.txt
hadoop fs -cat /mycluster/user/centos/test.txt
hdfs dfs -mkdir -p /user/centos
hdfs dfs -cp /mycluster/user/centos/test.txt /user/centos
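To confirm the file landed where the spark-shell example below expects it (paths as in the commands above):
hdfs dfs -ls -R /user/centos
hdfs dfs -cat hdfs://mycluster/user/centos/test.txt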
-----
Integrate Spark with the Hadoop HA cluster
-------------------------
1. Copy core-site.xml + hdfs-site.xml into the spark/conf directory
2. Distribute the files to all Spark worker nodes (a command sketch for steps 1-2.2 follows the note below)
2.2 Configure the Hive environment variables for Spark
3. Start the Spark cluster
4. Start spark-shell and connect it to the Spark cluster
$>spark-shell --master spark://s200:7077
$scala>sc.textFile("hdfs://mycluster/user/centos/test.txt").collect();
Exit: Ctrl+D
(Note: the hdfs dfs -cp in the "Create a test file and put it on DFS" section above is required: hdfs://mycluster/user/centos/test.txt resolves to /user/centos/test.txt on the mycluster nameservice, so without the copy the read fails with a file-not-found error for mycluster/user/centos/test.txt.)
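A shell sketch of steps 1-2.2 (the worker host names s201 s202 s203, the centos user, and copying hive-site.xml for the Hive integration are assumptions; adjust to the actual cluster layout):
cp $HADOOP_HOME/etc/hadoop/core-site.xml $HADOOP_HOME/etc/hadoop/hdfs-site.xml $SPARK_HOME/conf/
cp $HIVE_HOME/conf/hive-site.xml $SPARK_HOME/conf/    # assumed way to expose the Hive config to Spark (step 2.2)
for host in s201 s202 s203; do                        # assumed worker hosts
  scp $SPARK_HOME/conf/core-site.xml $SPARK_HOME/conf/hdfs-site.xml $SPARK_HOME/conf/hive-site.xml centos@$host:$SPARK_HOME/conf/
done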
A salted (two-stage) word count over the same file, run in spark-shell:
$scala> sc.textFile("hdfs://mycluster/user/centos/test.txt").
          flatMap(_.split(" ")).
          map((_, 1)).
          map(t => {import scala.util.Random; val par = Random.nextInt(100); (t._1 + "_" + par, 1)}).  // salt each key with a random 0-99 suffix
          reduceByKey(_ + _).                                     // first aggregation on the salted keys
          map(t => {val arr = t._1.split("_"); (arr(0), t._2)}).  // strip the salt
          reduceByKey(_ + _).                                     // merge the partial counts per word
          collect()
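The random "_<n>" suffix spreads each word across up to 100 distinct keys, so the first reduceByKey does partial counting without any single hot key landing on one task; the second map/reduceByKey removes the suffix and merges the partial counts into the final per-word totals. For a small test file a plain map((_,1)).reduceByKey(_+_) gives the same result; the salting pattern only pays off when the key distribution is heavily skewed.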