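I wanted to write a standalone class that reads the contents of an HDFS file and imports them into MySQL, that is, a plain Java program with a main method that uses the Java API.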
Configuration conf = new Configuration(true);
conf.set("fs.default.name", "hdfs://cluster2");
conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
FileSystem fs = null;
try {
    fs = FileSystem.get(conf);
} catch (Exception e) {
    LOG.error("getFileSystem failed :" + e.getMessage());
}
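However, the code above fails with java.net.UnknownHostException: hdfs://cluster2
Since the cluster runs Hadoop YARN 2.2, I added more properties to conf, following the configuration in the post at http://www.oschina.net/code/snippet_121248_34430 .
The revised code is as follows: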
conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://cluster2");
conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
conf.set("ha.zookeeper.quorum",
"xx:2181,xx:2181,xx:2181");
conf.set("dfs.nameservices", "cluster2");
conf.set("dfs.ha.namenodes.cluster2", "nn1,nn2");
conf.set("dfs.namenode.rpc-address.cluster2.nn1", "xx:8020");
conf.set("dfs.namenode.rpc-address.cluster2.nn2", "xx:8020");
conf.set("dfs.client.failover.proxy.provider.cluster2",
"org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
conf.set("hadoop.security.authentication", "kerberos");
conf.set("yarn.resourcemanager.scheduler.address", "xx:8030");
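Most of the HA-related property names above follow a fixed pattern keyed on the nameservice ID, so they can be generated instead of typed by hand. What follows is only a minimal sketch under stated assumptions, not code from the original program: the class `HaClientProps` and its `build` method are hypothetical helpers, the host names in `main` are placeholders, and it needs Java 8 or later. It uses only the JDK and returns a plain map; each entry would then be handed to `conf.set(key, value)`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Hadoop): assembles the nameservice-keyed
// client HA properties that the snippet above sets one by one.
final class HaClientProps {

    // namenodes maps a NameNode ID (e.g. "nn1") to its RPC address "host:port".
    static Map<String, String> build(String nameservice, Map<String, String> namenodes) {
        if (namenodes.isEmpty()) {
            throw new IllegalArgumentException("at least one NameNode is required");
        }
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.defaultFS", "hdfs://" + nameservice);
        props.put("dfs.nameservices", nameservice);
        props.put("dfs.ha.namenodes." + nameservice, String.join(",", namenodes.keySet()));
        for (Map.Entry<String, String> nn : namenodes.entrySet()) {
            props.put("dfs.namenode.rpc-address." + nameservice + "." + nn.getKey(), nn.getValue());
        }
        props.put("dfs.client.failover.proxy.provider." + nameservice,
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> namenodes = new LinkedHashMap<>();
        namenodes.put("nn1", "namenode1.example.com:8020"); // placeholder host
        namenodes.put("nn2", "namenode2.example.com:8020"); // placeholder host
        // In a real client each pair would go into conf.set(key, value).
        build("cluster2", namenodes).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Generating the keys this way mainly guards against a mismatch between the nameservice ID in `fs.defaultFS` and the one embedded in the other property names. It does not cover the ZooKeeper, Kerberos, or YARN settings, which do not depend on the nameservice ID.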
The error message finally changed, but I could not resolve this new error either.
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:534)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1681)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1397)
at com.netease.weblogOffline.exp.mysql.OrgMediaSQL.main(OrgMediaSQL.java:126)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
at org.apache.hadoop.ipc.Client.call(Client.java:1347)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy7.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy8.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1679)
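With my limited ability I could not find a working solution, so I had to fall back to the original approach:
use the hadoop command to run an empty job whose main work is reading the HDFS file contents.
In the end I reasoned that the Configuration assembled under a plain java -classpath launch certainly carries fewer properties than the configuration files under the Hadoop installation provide.
So I'll just go through hadoop the honest way, and I'm at peace with that.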
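The guess that a hand-built Configuration carries fewer properties than the cluster's configuration files can be checked directly. The following is a rough sketch, not part of the original program: `SiteXmlDiff` and its methods are hypothetical names, and the XML in `main` is a made-up sample standing in for a real `hdfs-site.xml`. It uses only the JDK to list the property names in a Hadoop-style `*-site.xml` and report which ones the client code never set.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical diagnostic helper: compares the keys in a site file with the
// keys that were set by hand through conf.set(...).
final class SiteXmlDiff {

    // Collects every <property><name> value from a Hadoop-style *-site.xml stream.
    static Set<String> propertyNames(InputStream xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
            NodeList props = doc.getElementsByTagName("property");
            Set<String> names = new TreeSet<>();
            for (int i = 0; i < props.getLength(); i++) {
                NodeList nameNodes = ((Element) props.item(i)).getElementsByTagName("name");
                if (nameNodes.getLength() > 0) {
                    names.add(nameNodes.item(0).getTextContent().trim());
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse site XML", e);
        }
    }

    // Keys present in the site file but never set by hand in the client code.
    static Set<String> missing(Set<String> inSiteFile, Set<String> setByHand) {
        Set<String> result = new TreeSet<>(inSiteFile);
        result.removeAll(setByHand);
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample; a real run would open the cluster's hdfs-site.xml instead.
        String sample = "<configuration>"
                + "<property><name>dfs.nameservices</name><value>cluster2</value></property>"
                + "<property><name>dfs.namenode.kerberos.principal</name>"
                + "<value>hdfs/_HOST@EXAMPLE.COM</value></property>"
                + "</configuration>";
        Set<String> inFile = propertyNames(
                new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8)));
        Set<String> byHand = new TreeSet<>(Arrays.asList("fs.defaultFS", "dfs.nameservices"));
        System.out.println("Not set by hand: " + missing(inFile, byHand));
    }
}
```

Running it against the real `core-site.xml` and `hdfs-site.xml` from the cluster's configuration directory would show which settings the `hadoop` command picks up that a bare `java -classpath` launch does not.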