1.1.2 Running crawl fails with "Job failed"
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
Solution:
This is usually caused by an incorrect edit of the MY.DOMAIN.NAME placeholder in crawl-urlfilter.txt.
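One way to sanity-check the edited rule is to test it against a seed URL outside of Nutch. The sketch below is only an illustration: `UrlFilterCheck` and `accepts` are hypothetical names, `example.com` is a stand-in domain, and it assumes the stock rule has the form `+^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/` and that the filter accepts a URL when the regex is found anywhere in it. If no seed URL passes the filter, nothing gets fetched, which is consistent with a later step such as dedup failing.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: checks a URL against the regex of
// one "+" rule written in the style of crawl-urlfilter.txt.
class UrlFilterCheck {

    // ruleRegex is the rule text without its leading '+'.
    // Returns true if the regex is found somewhere in the URL.
    static boolean accepts(String ruleRegex, String url) {
        return Pattern.compile(ruleRegex).matcher(url).find();
    }

    public static void main(String[] args) {
        // Assumed rule after replacing MY.DOMAIN.NAME with example.com.
        String edited = "^http://([a-z0-9]*\\.)*example.com/";
        // Assumed rule as shipped, with the placeholder left in.
        String placeholder = "^http://([a-z0-9]*\\.)*MY.DOMAIN.NAME/";

        String seed = "http://www.example.com/";
        System.out.println(accepts(edited, seed));      // true
        System.out.println(accepts(placeholder, seed)); // false
    }
}
```

If the edited rule prints false for your own seed URL, the domain part of the rule is the first thing to recheck.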
1.1.3 Another "Job failed"
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
Solution:
Again, this is usually caused by an incorrect edit of the MY.DOMAIN.NAME placeholder in crawl-urlfilter.txt.
1.1.4 Running nutch in Eclipse: "Job failed"
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
at org.apache.nutch.crawl.Injector.inject(Injector.java:162)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:115)
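Solution:
This is caused by the Java version configured in Eclipse. To fix it:
If the project was using Java 1.4, change it to 1.6:
Project -> Properties -> Java Compiler
In the "JDK Compliance" panel on the right,
set "Compiler compliance level" to 6.0.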
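To confirm which Java version an Eclipse launch actually runs on, a small check like the following can be run as a Java application from the same project. This is a rough sketch: `JavaVersionCheck` and `majorVersion` are hypothetical names, and it only reports the runtime version from the standard `java.specification.version` system property; the compiler compliance level itself still has to be set in the dialog described above.

```java
// Hypothetical helper: reports the Java version of the running JVM.
class JavaVersionCheck {

    // Converts a specification version string to its major number,
    // e.g. "1.4" -> 4, "1.6" -> 6, "11" -> 11.
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        if (parts.length > 1 && parts[0].equals("1")) {
            return Integer.parseInt(parts[1]);
        }
        return Integer.parseInt(parts[0]);
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        int major = majorVersion(spec);
        System.out.println("Running on Java specification version " + spec);
        if (major < 6) {
            System.out.println("Older than 1.6: switch the project to a 1.6 JDK.");
        }
    }
}
```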
Common errors after configuring nutch