<5> Hive data export and dynamic partitions

License: CC 4.0 BY-SA

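This article covers several ways to export data from Hive: Hadoop commands, INSERT ... DIRECTORY statements, shell commands combined with pipes, and third-party tools such as Sqoop. It also introduces Hive dynamic partitions and shows how they simplify loading data.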

Hive data export
1. Using hadoop commands
get (copies a file from HDFS to the local filesystem)
hadoop fs -get path localPath
text (works on several file formats, for example compressed files; the file is decoded and written to the output stream as text)
hadoop fs -text path > e2.txt
2. Using insert ... directory
insert overwrite local directory 'path' row format delimited fields terminated by '|'
select id,name
from e1;
3. Shell command plus a pipe: hive -f/-e | sed/grep/awk > file
4. Third-party tools (Sqoop, ...)
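
Method 3 can be sketched roughly as follows. A running Hive installation cannot be assumed here, so the `hive -e` call is replaced by a hypothetical stand-in function, `sample_rows`, that prints tab-separated rows (the format the Hive CLI writes to stdout by default). The output file name e1_pipe.txt is also an assumption.

```shell
# Stand-in for: hive -e "select id,name from e1"
# (hypothetical helper; the Hive CLI prints result rows tab-separated by default)
sample_rows() {
  printf '1\ttom\n2\tjack\n'
}

# Rewrite each row with '|' as the field separator and save it to a local file.
sample_rows | awk -F'\t' 'BEGIN { OFS = "|" } { $1 = $1; print }' > e1_pipe.txt
```

With a real installation, replace `sample_rows` with `hive -e "select id,name from e1"` or with `hive -f` and a script file. The awk stage stays the same.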

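Hive dynamic partitions
1. There is no need to write a separate insert statement for each partition
2. The partition values are not known in advance and have to be taken from the data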
create table if not exists g1(id int,name string,age int) row format delimited fields terminated by ',' stored as textfile;
Sample data (contents of /home/lz/g1.txt):
1,tom,24
2,jack,25
3,lc,27
4,ljc,28
load data local inpath '/home/lz/g1.txt' overwrite into table g1;

create table if not exists g2(name string)
partitioned by(id int,age int)
row format delimited fields terminated by ','
stored as textfile;

insert overwrite table g2 partition(id,age) select name,id,age from g1;
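
An insert like the one above, where every partition column is dynamic, is typically rejected while `hive.exec.dynamic.partition.mode` is `strict`, which is the default in many Hive releases. The following is a minimal sketch, assuming that is the case on your cluster. The script name dynamic_partition.hql is hypothetical, and the `hive -f` call only runs if a `hive` binary is on the PATH.

```shell
# Write the session settings plus the insert into a script file
# (dynamic_partition.hql is a hypothetical name).
cat > dynamic_partition.hql <<'EOF'
-- enable dynamic partitioning for this session
set hive.exec.dynamic.partition=true;
-- allow every partition column to be dynamic
set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table g2 partition(id,age) select name,id,age from g1;
EOF

# Run the script only when the Hive CLI is available.
if command -v hive >/dev/null 2>&1; then
  hive -f dynamic_partition.hql
fi
```

Dynamic partition columns are matched by position, so id and age must come last in the select list, in the same order as in the partition clause. Hive then creates one directory per distinct (id, age) pair under the table location, for example id=1/age=24 for the first sample row. Running `show partitions g2;` lists them.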