I've been learning Hadoop recently. Yesterday I ran into quite a few problems while doing data extraction, so I took notes and am writing them up in this post for my own future reference, and hopefully they can help anyone hitting the same issues.
Import commands:
Importing an Oracle table into Hive:
sqoop import --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username UAP --password uap --table YKBZ_DETAIL --split-by yongdianlb --hive-import
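A side note of my own (not related to the errors below): putting --password on the command line leaves the password in the shell history, so Sqoop also accepts -P to prompt for it interactively, e.g.:

sqoop import --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username UAP -P --table YKBZ_DETAIL --split-by yongdianlb --hive-import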
Problems encountered:
1. java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
Problem description: the network adapter could not reach the Oracle listener.
Offending command:
sqoop import --connect jdbc:oracle:thin://192.168.1.136:1521/jist --username UAP --password uap --table YKBZ_DETAIL --hive-import
Solution: check that the connection string is correct. In my case the @ had been replaced with //, so the listener could not be found; changing // back to @ was all it took.
Fixed command:
sqoop import --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username UAP --password uap --table YKBZ_DETAIL --hive-import
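For reference, my understanding of the Oracle thin driver is that it accepts both a SID form and a service-name form of the URL; which one applies to jist here depends on how that database is defined, so treat the examples as assumptions:

jdbc:oracle:thin:@192.168.1.136:1521:jist (if jist is a SID)
jdbc:oracle:thin:@//192.168.1.136:1521/jist (if jist is a service name)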
2.
ERROR tool.ImportTool: Imported Failed: Attempted to generate class with no columns!
Problem description: no columns could be found for the table (my own interpretation; my English isn't great, sorry Σ( ° △ °|||)︴).
Offending command:
sqoop import --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username uap --password uap --table YKBZ_DETAIL --hive-import
Solution: change the username to uppercase.
Fixed command:
sqoop import --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username UAP --password uap --table YKBZ_DETAIL --hive-import
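My guess at the cause (an assumption, not something the error message states): Oracle stores unquoted identifiers in uppercase, and Sqoop appears to look up the table metadata with the exact strings it is given, so a lowercase username finds nothing and the generated class ends up with no columns. A quick way to check what the user can actually see, using the same connection arguments:

sqoop list-tables --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username UAP --password uap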
3.
14/12/23 04:54:32 ERROR tool.ImportTool: Error during import: No primary key could be found for table YKBZ_DETAIL. Please specify one with --split-by or perform a sequential import with '-m 1'.
Problem description: (omitted)
Offending command: (omitted)
Solution: I suggest reading this blogger's post; both the problem description and the fix are explained there in detail:
http://blog.sina.com.cn/s/blog_6a67b5c501010gd9.html
sqoop import --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username UAP --password uap --table YKBZ_DETAIL --split-by yongdianlb --hive-import
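As the error message itself suggests, if the table has no primary key and no obvious split column, a sequential import with a single mapper also works (slower, but no --split-by needed):

sqoop import --connect jdbc:oracle:thin:@192.168.1.136:1521/jist --username UAP --password uap --table YKBZ_DETAIL -m 1 --hive-import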
4. PriviledgedActionException as:root (auth:SIMPLE)
Problem description: (omitted here; the full log is further down)
Offending command: (omitted)
Solution:
Add the following property to mapred-site.xml:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
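One note from my own setup (an assumption, not something the log above asks for): running MapReduce on YARN usually also needs the shuffle aux-service declared in yarn-site.xml, roughly like this:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>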
Problem description (the job itself finishes, but the client then cannot reach the job history server):
14/12/23 05:49:58 INFO mapreduce.Job: Job job_1419337232716_0001 completed successfully
14/12/23 05:50:06 INFO mapreduce.Job: Counters: 27
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=367120
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=503
HDFS: Number of bytes written=112626288
HDFS: Number of read operations=16
HDFS: Number of large read operations=0
HDFS: Number of write operations=8
Job Counters
Launched map tasks=4
Other local map tasks=4
Total time spent by all maps in occupied slots (ms)=1558361
Total time spent by all reduces in occupied slots (ms)=0
Map-Reduce Framework
Map input records=158424
Map output records=158424
Input split bytes=503
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=83987
CPU time spent (ms)=56180
Physical memory (bytes) snapshot=315813888
Virtual memory (bytes) snapshot=3356368896
Total committed heap usage (bytes)=179679232
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=112626288
14/12/23 05:50:07 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
14/12/23 05:50:08 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/12/23 05:50:38 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/12/23 05:50:38 ERROR security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.IOException: java.net.ConnectException: Call From localhost.localdomain/127.0.0.1 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
14/12/23 05:50:38 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: java.net.ConnectException: Call From localhost.localdomain/127.0.0.1 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:331)
at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:416)
Offending command: (omitted)
Solution:
./mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR
Ran the import again and it succeeded.
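A further thought of my own (an assumption, since starting the daemon was enough in my case): the client was retrying 0.0.0.0:10020, which is the default value of mapreduce.jobhistory.address, so pointing mapred-site.xml at the host that actually runs the history server should avoid that bad default. The hostname below is only a placeholder:

<property>
  <name>mapreduce.jobhistory.address</name>
  <!-- replace "master" with the host running the job history server -->
  <value>master:10020</value>
</property>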
Other solutions found online (I have not tested these myself):
First run hadoop dfsadmin -report to check the datanode status and hostnames
http://blog.youkuaiyun.com/shirdrn/article/details/6562292
Change the hostname on the datanode:
/etc/sysconfig/network
Then restart the network service:
/etc/rc.d/init.d/network restart
Run the import again, and it should be OK.
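For completeness, on CentOS/RHEL the file edited above normally looks like the following; the hostname value here is just a placeholder:

NETWORKING=yes
HOSTNAME=datanode1.example.com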