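This example imports data into Hive using tradedate as the partition key.
First, create a table whose columns correspond one-to-one to the columns in the CSV file: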
create table t_reverse_repurchase (tradedate string, tradetime string, securityid string, bidpx1 double, bidsize1 double, offerpx1 double, offersize1 double)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
Load the local CSV file into Hive. If the CSV file is already on HDFS, use load data inpath (without the local keyword) instead:
load data local inpath '/home/stevie/Downloads/total.csv' into table t_reverse_repurchase;
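For the HDFS case, a minimal sketch might look like the following. The path /data/total.csv is hypothetical and stands in for wherever the file actually lives on HDFS.

```sql
-- Assumption: the CSV has already been uploaded to HDFS at this hypothetical path.
-- Without the LOCAL keyword, Hive resolves the path on HDFS and moves the file
-- into the table's directory (a local load copies the file instead).
load data inpath '/data/total.csv' into table t_reverse_repurchase;
```

Because the HDFS variant moves the file rather than copying it, the source path no longer contains the file after the load.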
Next, create a table partitioned by tradedate:
create table t_rp (tradetime string, securityid string, bidpx1 double, bidsize1 double, offerpx1 double, offersize1 double) partitioned by (tradedate string);
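Set Hive's dynamic partition mode to nonstrict. Without this setting, the insert statement that follows throws an exception, because the default strict mode requires at least one static partition column: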
set hive.exec.dynamic.partition.mode=nonstrict;
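Depending on the Hive version and cluster configuration, a few related settings may also be needed. The following is a sketch under assumptions, not part of the original steps: dynamic partitioning itself is switched off by default on older Hive releases, and the two limits are shown with illustrative values that may need raising when the data contains many distinct tradedate values.

```sql
-- Assumption: only required where dynamic partitioning is disabled by default.
set hive.exec.dynamic.partition=true;
-- Illustrative limits on how many partitions one statement may create,
-- in total and per mapper/reducer node; adjust to the number of distinct dates.
set hive.exec.max.dynamic.partitions=1000;
set hive.exec.max.dynamic.partitions.pernode=100;
```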
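Insert the data from the original table into the new partitioned table. The partition column tradedate must come last in the select list: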
insert into table t_rp partition (tradedate)
select tradetime, securityid, bidpx1, bidsize1, offerpx1, offersize1, tradedate from t_reverse_repurchase;
After the statement runs successfully, the table's data on HDFS is split into one directory per tradedate value.
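One way to confirm the result is sketched below. It assumes the table lives under the default warehouse directory /user/hive/warehouse, and the date literal is a hypothetical value, since the format of tradedate in the CSV is not shown here.

```sql
-- List the partitions Hive created, one per distinct tradedate value.
show partitions t_rp;

-- Inspect the table directory from the Hive CLI.
-- Assumption: default warehouse location; each partition appears as a
-- subdirectory named tradedate=<value>.
dfs -ls /user/hive/warehouse/t_rp;

-- Filtering on the partition column lets Hive read only the matching directory.
-- The date value below is hypothetical.
select count(*) from t_rp where tradedate = '2015-01-05';
```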