Partitioning in Hive determines how a table's data is organized in storage.
【Static partitions】:
- Create a table with static partitions
hive>create table logs(st bigint,line string)
>partitioned by (dt string,country string)
>row format delimited fields terminated by ',' ;
- The data to import is as follows:
$>gedit logs.txt
1231,dflksajflkshfdlksdf
123132,asjflkafjlkajflkaf
12131,jkldjflksfdlksfddsf
$>gedit logs1.txt
12,ccc
32,aaa
31,ffff
- Load the data above, assigning each file to a partition as it is loaded
hive>load data local inpath '/home/hyxy/logs.txt' into table hive.logs partition (dt='2018-08-08',country='changchun');
hive>load data local inpath '/home/hyxy/logs1.txt' into table hive.logs partition (dt='2018-08-08',country='haerbin');
- Inspect how the table data is organized on HDFS:
$>hadoop fs -lsr /user/hive/warehouse
lsr: DEPRECATED: Please use 'ls -R' instead.
drwxrwxrwx - hyxy supergroup 0 2018-08-08 14:50 /user/hive/warehouse
drwxrwxrwx - hyxy supergroup 0 2018-08-08 14:53 /user/hive/warehouse/hive.db
drwxrwxrwx - hyxy supergroup 0 2018-08-08 14:58 /user/hive/warehouse/hive.db/logs
drwxrwxrwx - hyxy supergroup 0 2018-08-08 14:59 /user/hive/warehouse/hive.db/logs/dt=2018-08-08
drwxrwxrwx - hyxy supergroup 0 2018-08-08 14:58 /user/hive/warehouse/hive.db/logs/dt=2018-08-08/country=changchun
-rwxrwxrwx 3 hyxy supergroup 77 2018-08-08 14:58 /user/hive/warehouse/hive.db/logs/dt=2018-08-08/country=changchun/logs.txt
drwxrwxrwx - hyxy supergroup 0 2018-08-08 14:59 /user/hive/warehouse/hive.db/logs/dt=2018-08-08/country=haerbin
-rwxrwxrwx 3 hyxy supergroup 22 2018-08-08 14:59 /user/hive/warehouse/hive.db/logs/dt=2018-08-08/country=haerbin/logs1.txt
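The listing shows that each partition is just a nested directory named column=value, nested in the order the columns appear in the partitioned by clause. A minimal Python sketch of that mapping follows. The function name partition_dir is hypothetical, the warehouse root is the one seen in the listing, and the escaping Hive applies to special characters in partition values is deliberately ignored.

```python
import posixpath


def partition_dir(warehouse, database, table, partition_spec):
    """Build the HDFS directory for one partition (illustrative only).

    partition_spec is a list of (column, value) pairs in the same order
    as the PARTITIONED BY clause. Assumption: the values contain no
    characters that Hive would escape in a path.
    """
    # Each partition column becomes one 'column=value' directory level.
    parts = [f"{column}={value}" for column, value in partition_spec]
    return posixpath.join(warehouse, f"{database}.db", table, *parts)


path = partition_dir(
    "/user/hive/warehouse",
    "hive",
    "logs",
    [("dt", "2018-08-08"), ("country", "changchun")],
)
print(path)
```

With the values used in this walkthrough, the result should match the changchun directory in the listing.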
- View the data in the haerbin partition
hive>select * from hive.logs where country='haerbin';
OK
12 ccc 2018-08-08 haerbin
32 aaa 2018-08-08 haerbin
31 ffff 2018-08-08 haerbin
Time taken: 1.051 seconds, Fetched: 3 row(s)
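Note that dt and country appear in the query result even though the data files contain only two fields per line: the partition values come from the directory names, not from the file contents. The Python sketch below illustrates the idea under simplifying assumptions. read_partition_file is a hypothetical helper, not part of Hive, and it only handles the two-column comma-delimited layout of the logs table.

```python
def read_partition_file(relative_path, lines, delimiter=","):
    """Combine file fields with partition values parsed from the path.

    relative_path is the path below the table directory, for example
    'dt=2018-08-08/country=haerbin/logs1.txt'.
    """
    # Every 'column=value' directory component contributes one partition value.
    partition_values = [
        component.split("=", 1)[1]
        for component in relative_path.split("/")
        if "=" in component
    ]
    rows = []
    for line in lines:
        # The file itself only stores the st and line columns.
        st, text = line.split(delimiter, 1)
        rows.append((int(st), text, *partition_values))
    return rows


rows = read_partition_file(
    "dt=2018-08-08/country=haerbin/logs1.txt",
    ["12,ccc", "32,aaa"],
)
for row in rows:
    print(row)
```

This also suggests why filtering on country='haerbin' can be cheap: only the matching directory needs to be read.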
- List the partitions:
hive>show partitions hive.logs;
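- Note:
Drawback of static partitions: the partition column values must be set by hand for every load, which becomes tedious when there are many partitions.
All of the statements above were executed in the hive database.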
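One way to reduce the manual work described in the note is to generate the load statements from a list of files and partition values. The Python sketch below only builds the statement text and does not talk to Hive. load_statement and the files list are hypothetical, and the sketch assumes paths and values contain no single quotes.

```python
def load_statement(local_path, table, partition_spec):
    """Render one LOAD DATA statement for a static partition."""
    # partition_spec is a list of (column, value) pairs.
    spec = ",".join(f"{column}='{value}'" for column, value in partition_spec)
    return (
        f"load data local inpath '{local_path}' "
        f"into table {table} partition ({spec});"
    )


# Hypothetical mapping of local files to their partition values.
files = [
    ("/home/hyxy/logs.txt", [("dt", "2018-08-08"), ("country", "changchun")]),
    ("/home/hyxy/logs1.txt", [("dt", "2018-08-08"), ("country", "haerbin")]),
]

statements = [load_statement(path, "hive.logs", spec) for path, spec in files]
print("\n".join(statements))
```

The printed statements could be saved to a script file and run with hive -f, although each partition value still has to be supplied explicitly.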