I've been learning Hadoop recently. Yesterday I ran into quite a few problems while doing data extraction, so I took notes and am writing them up in this post for my own future reference, and hopefully they can help anyone hitting the same issues.
Import commands:
Importing an Oracle table into Hive:
sqoop import --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username UAP --password uap --table YKBZ_DETAIL --split-by yongdianlb --hive-import
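A side note of my own (not related to the errors below): putting --password on the command line leaves the password in the shell history, so Sqoop also accepts -P to prompt for it interactively, e.g.:

sqoop import --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username UAP -P --table YKBZ_DETAIL --split-by yongdianlb --hive-import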
Problems encountered:
1. java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
Problem description: the network adapter could not reach the Oracle listener.
Offending command:
sqoop import --connect jdbc:oracle:thin://192.168.1.136:1521/jist --username UAP --password uap --table YKBZ_DETAIL --hive-import
Solution: check that the connection string is correct. In my case the @ had been replaced with //, so the listener could not be found; changing // back to @ was all it took.
Fixed command:
sqoop import --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username UAP --password uap --table YKBZ_DETAIL --hive-import
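For reference, my understanding of the Oracle thin driver is that it accepts both a SID form and a service-name form of the URL; which one applies to jist here depends on how that database is defined, so treat the examples as assumptions:

jdbc:oracle:thin:@192.168.1.136:1521:jist (if jist is a SID)
jdbc:oracle:thin:@//192.168.1.136:1521/jist (if jist is a service name)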
2.
ERROR tool.ImportTool: Imported Failed: Attempted to generate class with no columns!
Problem description: no columns could be found for the table (my own interpretation; my English isn't great, sorry Σ( ° △ °|||)︴).
Offending command:
sqoop import --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username uap --password uap --table YKBZ_DETAIL --hive-import
Solution: change the username to uppercase.
Fixed command:
sqoop import --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username UAP --password uap --table YKBZ_DETAIL --hive-import
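My guess at the cause (an assumption, not something the error message states): Oracle stores unquoted identifiers in uppercase, and Sqoop appears to look up the table metadata with the exact strings it is given, so a lowercase username finds nothing and the generated class ends up with no columns. A quick way to check what the user can actually see, using the same connection arguments:

sqoop list-tables --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username UAP --password uap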
3.
14/12/23 04:54:32 ERROR tool.ImportTool: Error during import: No primary key could be found for table YKBZ_DETAIL. Please specify one with --split-by or perform a sequential import with '-m 1'.
Problem description: (omitted)
Offending command: (omitted)
Solution: I suggest reading this blogger's post; both the problem description and the fix are explained there in detail:
http://blog.sina.com.cn/s/blog_6a67b5c501010gd9.html
sqoop import --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username UAP --password uap --table YKBZ_DETAIL --split-by yongdianlb --hive-import
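As the error message itself suggests, if the table has no primary key and no obvious split column, a sequential import with a single mapper also works (slower, but no --split-by needed):

sqoop import --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username UAP --password uap --table YKBZ_DETAIL -m 1 --hive-import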
4. PriviledgedActionException as:root (auth:SIMPLE)
Problem description: (omitted here; the full log is further down)
Offending command: (omitted)
Solution:
Add the following property to mapred-site.xml:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
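One note from my own setup (an assumption, not something the log above asks for): running MapReduce on YARN usually also needs the shuffle aux-service declared in yarn-site.xml, roughly like this:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>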
Problem description (the job itself finishes, but the client then cannot reach the job history server):
14/12/23 05:49:58 INFO mapreduce.Job: Job job_1419337232716_0001 completed successfully
14/12/23 05:50:06 INFO mapreduce.Job: Counters: 27
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=367120
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=503
HDFS: Number of bytes written=112626288
HDFS: Number of read operations=16
HDFS: Number of large read operations=0
HDFS: Number of write operations=8
Job Counters
Launched map tasks=4
Other local map tasks=4
Total time spent by all maps in occupied slots (ms)=1558361
Total time spent by all reduces in occupied slots (ms)=0
Map-Reduce Framework
Map input records=158424
Map output records=158424
Input split bytes=503
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=83987
CPU time spent (ms)=56180
Physical memory (bytes) snapshot=315813888
Virtual memory (bytes) snapshot=3356368896
Total committed heap usage (bytes)=179679232
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=112626288
14/12/23 05:50:07 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
14/12/23 05:50:08 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/12/23 05:50:38 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/12/23 05:50:38 ERROR security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.IOException: java.net.ConnectException: Call From localhost.localdomain/127.0.0.1 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
14/12/23 05:50:38 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: java.net.ConnectException: Call From localhost.localdomain/127.0.0.1 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:331)
at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:416)
Offending command: (omitted)
Solution:
./mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR
Ran the import again and it succeeded.
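A further thought of my own (an assumption, since starting the daemon was enough in my case): the client was retrying 0.0.0.0:10020, which is the default value of mapreduce.jobhistory.address, so pointing mapred-site.xml at the host that actually runs the history server should avoid that bad default. The hostname below is only a placeholder:

<property>
  <name>mapreduce.jobhistory.address</name>
  <!-- replace "master" with the host running the job history server -->
  <value>master:10020</value>
</property>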
Other solutions found online (I have not tested these myself):
First run hadoop dfsadmin -report to check the datanode status and hostnames
http://blog.youkuaiyun.com/shirdrn/article/details/6562292
Change the hostname on the datanode:
/etc/sysconfig/network
Then restart the network service:
/etc/rc.d/init.d/network restart
Run the import again, and it should be OK.
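For completeness, on CentOS/RHEL the file edited above normally looks like the following; the hostname value here is just a placeholder:

NETWORKING=yes
HOSTNAME=datanode1.example.com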