I installed Pentaho Data Integration 4.4.
My Hadoop version is CDH3u0.
I created a new job to copy a file from the local machine into HDFS.
It failed with the following error:
ERROR 10-04 10:51:15,842 - Hadoop Copy Files - Can not copy file/folder [file://weblogs_rebuild/weblogs_rebuild.txt] to [hdfs://localhost:9000/pentaho/test-weblog]. Exception : [
Unable to get VFS File object for filename 'hdfs://localhost:9000/pentaho/test-weblog' : Could not resolve file "hdfs://localhost/pentaho/test-weblog".
]
ERROR 10-04 10:51:15,842 - Hadoop Copy Files - org.pentaho.di.core.exception.KettleFileException:
Unable to get VFS File object for filename 'hdfs://localhost:9000/pentaho/test-weblog' : Could not resolve file "hdfs://localhost/pentaho/test-weblog".
at org.pentaho.di.core.vfs.KettleVFS.getFileObject(KettleVFS.java:161)
at org.pentaho.di.core.vfs.KettleVFS.getFileObject(KettleVFS.java:104)
at org.pentaho.di.job.entries.copyfiles.JobEntryCopyFiles.ProcessFileFolder(JobEntryCopyFiles.java:376)
at org.pentaho.di.job.entries.copyfiles.JobEntryCopyFiles.execute(JobEntryCopyFiles.java:324)
at org.pentaho.di.job.Job.execute(Job.java:589)
at org.pentaho.di.job.Job.execute(Job.java:728)
at org.pentaho.di.job.Job.execute(Job.java:443)
at org.pentaho.di.job.Job.run(Job.java:363)
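According to the Pentaho website, it should be enough to copy the hadoop-core jar into the $PDI_HOME/libext/bigdata directory.
That still produced the same error. Copying commons-configuration-1.7.jar and guava-10.0.1.jar as well did not help either.
In the end I found the directory $PDI_HOME/plugins/pentaho-big-data-plugin/hadoop-configurations/hadoop-20/lib, deleted the hadoop-core 0.20.2 jar from it, and put the CDH3u0 hadoop-core jar in its place. After that the job ran successfully.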
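The jar swap described above can be scripted. The following is only a minimal sketch, not an official Pentaho procedure: the function name swap_hadoop_core_jar and the backup directory are hypothetical, and instead of deleting the old jar it moves it aside so the change can be undone.

```python
import shutil
from pathlib import Path


def swap_hadoop_core_jar(lib_dir, replacement_jar, backup_dir):
    """Move every hadoop-core*.jar out of lib_dir, then copy replacement_jar in.

    Hypothetical helper: lib_dir is assumed to be the hadoop-20/lib folder of
    the big data plugin, replacement_jar the hadoop-core jar taken from the
    cluster's own Hadoop distribution (it must not already live in lib_dir).
    Returns the names of the jars that were moved to backup_dir.
    """
    lib_dir = Path(lib_dir)
    replacement_jar = Path(replacement_jar)
    backup_dir = Path(backup_dir)

    # Fail before touching anything if the replacement jar is missing.
    if not replacement_jar.is_file():
        raise FileNotFoundError(replacement_jar)

    backup_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    for old_jar in sorted(lib_dir.glob("hadoop-core*.jar")):
        # Keep the old jar as a backup rather than deleting it outright.
        shutil.move(str(old_jar), str(backup_dir / old_jar.name))
        moved.append(old_jar.name)

    # Put the cluster's hadoop-core jar where the plugin will load it from.
    shutil.copy2(replacement_jar, lib_dir / replacement_jar.name)
    return moved
```

It would be called with the hadoop-20/lib path from this post as lib_dir and the path to the CDH3u0 hadoop-core jar as replacement_jar. The exact jar file names depend on the installation, so none are hard-coded here. Restart PDI afterwards so the new jar is picked up.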
There was one more message, this time a warning:
WARN 10-04 11:28:48,188 - Unable to load Hadoop Configuration from "file:///pentaho/data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/mapr". For more information enable debug logging.
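I don't know why this happens. The configuration file sets
active.hadoop.configuration=hadoop-20
yet the message refers to the mapr folder. The job still completes correctly and the data reaches HDFS as expected. To be on the safe side, I copied the hadoop-core jar into the $PDI_HOME/libext/pentaho directory, and the WARN message went away.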
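To double-check which configuration is actually selected, the property can be read straight from the file. This is a rough sketch under two assumptions: that the setting lives in a plain Java-style properties file (in my understanding plugin.properties in the pentaho-big-data-plugin folder, but verify this for your install), and that a simple key=value parser is good enough. The function name is hypothetical.

```python
from pathlib import Path


def read_active_hadoop_configuration(properties_path):
    """Return the value of active.hadoop.configuration, or None if not set.

    Simplified parser: handles key=value lines and skips blank lines and
    comments starting with '#' or '!'. It does not handle line continuations
    or ':' separators, which full Java properties files allow.
    """
    text = Path(properties_path).read_text(encoding="utf-8")
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "active.hadoop.configuration":
            return value.strip()
    return None
```

If this returns hadoop-20 while the warning still mentions mapr, the setting itself is as described above and the warning is likely coming from somewhere other than that selection, which matches what I observed.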




