经测试可执行sqoop脚本如下:
sqoop import --connect jdbc:oracle:thin:@IP:PORT:SCHEMA --username username -password=password --table XXX --columns "columns" --where " c1>=to_date('2015-01-01 00:00:00','yyyy-mm-dd hh24:mi:ss') and c1<=to_date('2015-02-01 00:00:00','yyyy-mm-dd hh24:mi:ss') " -m 8 --split-by ID --fields-terminated-by '^' --target-dir /importdata/XXX/
原测试sqoop脚本如下:
sqoop import --connect jdbc:oracle:thin:@IP:PORT:SCHEMA --username username -password=password --query ‘select columns from talbename where 1=1 and $CONDTIONS' --where " c1>=to_date('2015-01-0100:00:00','yyyy-mm-dd hh24:mi:ss') and c1<=to_date('2015-02-01 00:00:00','yyyy-mm-dd hh24:mi:ss') " -m 8 --split-by ID --fields-terminated-by '^' --target-dir /importdata/XXX/
或测试脚本为:
sqoop import --connect jdbc:oracle:thin:@IP:PORT:SCHEMA --username username -password=password --query ‘select columns from talbename where 1=1 and c1>=to_date('2015-01-01 00:00:00','yyyy-mm-dd hh24:mi:ss') and c1<=to_date('2015-02-01 00:00:00','yyyy-mm-dd hh24:mi:ss') and $CONDTIONS' -m 8 --split-by ID --fields-terminated-by '^' --target-dir /importdata/XXX/
或测试脚本为:
sqoop import --connect jdbc:oracle:thin:@IP:PORT:SCHEMA --username username -password=password --query ‘select columns from talbename where 1=1 and $CONTIONS and c1>=to_date('2015-01-01 00:00:00','yyyy-mm-dd hh24:mi:ss') and c1<=to_date('2015-02-01 00:00:00','yyyy-mm-dd hh24:mi:ss') ' -m 8 --split-by ID --fields-terminated-by '^' --target-dir /importdata/XXX/
或者其他一些测试脚本,在使用时有心得如下;
--query似乎并不能支持复杂where条件筛选,但必须有where语句与$CONDTIONS,--query即使后面加--where 参数也还是无法起到作用,现在测试通过的情况仅有--columns与--where合用时是满足要求的!
sqoop --query参数中where语句与$CONDTIONS占位符合用是将由--split-by参数划分出的每个map执行SQL补全,如:
a. 通过split-by id指定 map数据划分所依据的列
b. sqoop运行 select max(id), min(id) from
xxx 获得数据范围,比如max(id) = 100, min(id) = 1
c. sqoop若指定使用4个map进行数据的导入,因此在情况下,可将id的区间分为4段,通过不同的map进行导入,分别为:
1 <= id and id <= 25
26 <= id and id <= 50
51 <= id and id <= 75
76 <= id and id <= 100
d. 每个map将SQL补全,比如 select * from xxx where (1 <= id and id <= 25). 用(1 <= id and id <= 25) 替代了--query中的占位符 $CONDITIONS