Based on Apache Sqoop 1.4.6
I. MySQL to Hive
1. 2017-10-30 20:19:44,317 [myid:] - WARN [main:TableDefWriter@205] - Column register_date had to be cast to a less precise type in Hive
2017-10-30 20:19:44,317 [myid:] - WARN [main:TableDefWriter@205] - Column request_time had to be cast to a less precise type in Hive
2017-10-30 20:19:44,317 [myid:] - WARN [main:TableDefWriter@205] - Column loan_init_prin had to be cast to a less precise type in Hive
MySQL DATE, TIMESTAMP, and floating-point columns all trigger this warning, but verification showed it has no effect on the data.
2. 2017-10-31 11:00:03,406 [myid:] - ERROR [main:ImportTool@627] - Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://bdphdp/user/user01/tm_loan already exists
If --target-dir is not specified, files are imported to the default directory /user/{user}/{table}, so concurrent imports collide. Set a distinct --target-dir per job; otherwise, when the directory already exists, Sqoop refuses to import.
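A minimal sketch of a collision-free import; the connection string, credentials path, and table name are hypothetical. --delete-target-dir additionally clears a leftover directory from a previous failed run:

```shell
# Hypothetical names throughout; each concurrent job writes to its own --target-dir.
sqoop import \
  --connect jdbc:mysql://db-host:3306/loans \
  --username user01 \
  --password-file /user/user01/.pw \
  --table tm_loan \
  --target-dir /user/user01/sqoop/tm_loan \
  --delete-target-dir \
  -m 4
```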
3. Append mode for hive imports is not yet supported. Please remove the parameter --append-mode
Hive imports do not support --append mode.
4. Starting multiple Sqoop processes that import into HDFS in --query mode fails:
Could not remove existing jar file: \tmp\sqoop-lishengping2\compile\75fb523acb119757a9ad13fd7834ac74\QueryResult.jar
Could not load jar QueryResult.jar into JVM. (Could not find class QueryResult.)
Fix:
--class-name was not specified during the Sqoop import, so the default class name QueryResult was generated for every process. Give each process its own --class-name.
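A hedged sketch of one of several concurrent --query imports; the connection details and query are illustrative. Giving each process its own --class-name (and, optionally, its own --bindir/--outdir) keeps the generated classes and jars from clobbering each other:

```shell
# Illustrative command for process 1; vary the numeric suffix per process.
sqoop import \
  --connect jdbc:mysql://db-host:3306/loans \
  --username user01 \
  --password-file /user/user01/.pw \
  --query 'SELECT * FROM tm_loan WHERE $CONDITIONS' \
  --split-by id \
  --target-dir /user/user01/sqoop/tm_loan_1 \
  --class-name QueryResult1 \
  --bindir /tmp/sqoop-bin/1 \
  --outdir /tmp/sqoop-src/1
```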
5. Type-conversion error
20/07/03 19:05:11 ERROR orm.ClassWriter: No Java type for SQL type 2011 for column ORGAN_DESC
20/07/03 19:05:11 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException
java.lang.NullPointerException
at org.apache.sqoop.orm.ClassWriter.parseNullVal(ClassWriter.java:1389)
at org.apache.sqoop.orm.ClassWriter.parseColumn(ClassWriter.java:1414)
at org.apache.sqoop.orm.ClassWriter.myGenerateParser(ClassWriter.java:1540)
at org.apache.sqoop.orm.ClassWriter.generateParser(ClassWriter.java:1503)
at org.apache.sqoop.orm.ClassWriter.generateClassForColumns(ClassWriter.java:1958)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1749)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:106)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:494)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at cn.huaan.ods.sync.SqoopSync$sqoopSyncTask.run(SqoopSync.java:198)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Fix:
For Oracle or MySQL types that Hive does not support (SQL type 2011 here is java.sql.Types.NCLOB), specify the column type mapping manually, as follows:
# Import to HDFS, comma-separated
--map-column-java ORGAN_DESC=String,ORGAN_DESC2=String
# Import to Hive, comma-separated
--map-column-hive ORGAN_DESC=String,ORGAN_DESC2=String
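Put together, a hedged example of a full import using the mapping above; the connection string, table name, and Hive target are hypothetical:

```shell
# Map the unsupported column to String on both the Java and Hive sides.
sqoop import \
  --connect jdbc:oracle:thin:@db-host:1521:ORCL \
  --username user01 \
  --password-file /user/user01/.pw \
  --table T_ORGAN \
  --map-column-java ORGAN_DESC=String \
  --map-column-hive ORGAN_DESC=String \
  --hive-import \
  --hive-table ods.t_organ \
  -m 1
```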
6. Exporting from Hive to MySQL, with a Hive STRING column mapped to a MySQL DATE column, fails on type conversion:
2020-10-10 14:29:53,919 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.IllegalArgumentException
at java.sql.Date.valueOf(Date.java:143)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatExportHelper.convertStringTypes(SqoopHCatExportHelper.java:284)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatExportHelper.convertToSqoop(SqoopHCatExportHelper.java:215)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatExportHelper.convertToSqoopRecord(SqoopHCatExportHelper.java:138)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatExportMapper.map(SqoopHCatExportMapper.java:56)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatExportMapper.map(SqoopHCatExportMapper.java:35)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Fix: the stack trace shows that java.sql.Date.valueOf performs the conversion. Per its source, the only accepted format is yyyy-[m]m-[d]d, and the year, month, and day must all be valid values, so clean up the data format before exporting to MySQL.
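One workaround is to normalize the string column in Hive to yyyy-MM-dd before exporting. A sketch assuming a yyyyMMdd source format and hypothetical table/column names; the export goes through the HCatalog path seen in the stack trace:

```shell
# Stage a copy whose date strings already match yyyy-[m]m-[d]d ...
hive -e "
CREATE TABLE tm_loan_export AS
SELECT id,
       from_unixtime(unix_timestamp(register_date, 'yyyyMMdd'), 'yyyy-MM-dd') AS register_date
FROM tm_loan;
"
# ... then export the staged table instead of the original.
sqoop export \
  --connect jdbc:mysql://db-host:3306/loans \
  --username user01 \
  --password-file /user/user01/.pw \
  --table tm_loan \
  --hcatalog-database default \
  --hcatalog-table tm_loan_export
```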