No SOLRURL specified. Skipping indexing.
Injecting seed URLs
/engine/nutch/runtime/local/bin/nutch inject urls -crawlId firstcrawl
InjectorJob: starting at 2017-04-12 08:55:47
InjectorJob: Injecting urlDir: urls
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:114)
at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:78)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:218)
at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:275)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:284)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 10 more
Error running:
/engine/nutch/runtime/local/bin/nutch inject urls -crawlId firstcrawl
Failed with exit value 1
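The error above appeared when running nohup bin/crawl urls firstcrawl 4 & from the $NUTCH_HOME/runtime/local directory.
Cause:
The Linux system shipped with a preinstalled JDK 1.6, which conflicted with the JDK 1.8 installed afterwards.
Solution:
Run
rpm -qa | grep gcj
to list the preinstalled JDK packages, then remove them with yum -y remove.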
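
The cleanup can be sketched as a small shell helper. This is only a rough sketch under assumptions: the function names show_active_java, list_gcj_packages and print_removal_commands are made up for illustration, and the script prints the yum commands instead of running them, so the package list can be reviewed first.

```shell
# Hypothetical helper names; only rpm, grep, yum and java are real commands.

# Show which java binary is first on PATH and its version, if any.
show_active_java() {
    if command -v java >/dev/null 2>&1; then
        command -v java
        java -version 2>&1
    else
        echo "no java on PATH"
    fi
}

# List installed packages whose name contains "gcj" (the preinstalled JDK).
# "|| true" keeps the exit status clean when nothing matches.
list_gcj_packages() {
    rpm -qa | grep gcj || true
}

# Print one "yum -y remove" command per matching package.
# Nothing is removed here; copy the lines you agree with and run them as root.
print_removal_commands() {
    list_gcj_packages | while read -r pkg; do
        if [ -n "$pkg" ]; then
            echo "yum -y remove $pkg"
        fi
    done
}
```

After sourcing the snippet, calling print_removal_commands shows what would be removed, and show_active_java can be run before and after the cleanup to check which JDK the shell picks up.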