After fixing the missing py4j.zip and pyspark.zip problem, the job still returned exit code 1:
at com.twitter.chill.KryoBase.setInstantiatorStrategy(KryoBase.scala:86)
at com.twitter.chill.EmptyScalaKryoInstantiator.newKryo(ScalaKryoInstantiator.scala:59)
at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:84)
at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:273)
at org.apache.spark.serializer.KryoSerializerInstance.<init>(KryoSerializer.scala:258)
at org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:174)
…… …… ……
at com.twitter.chill.KryoBase.setInstantiatorStrategy(KryoBase.scala:86)
at com.twitter.chill.EmptyScalaKryoInstantiator.newKryo(ScalaKryoInstantiator.scala:59)
at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:84)
at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:273)
at org.apache.spark.serializer.KryoSerializerInstance.<init>(KryoSerializer.scala:258)
at org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:174)
at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:201)
at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:102)
…… …… ……
[Compiled Code]
[Compiled Code]
at com.esotericsoftware.kryo.util.DefaultClassResolver.register(DefaultClassResolver.
at com.esotericsoftware.kryo.Kryo.register(Kryo.
at com.esotericsoftware.kryo.Kryo.<init>(Kryo.
at com.esotericsoftware.kryo.Kryo.<init>(Kryo.
at com.twitter.chill.KryoBase.<init>(KryoBase.scala:32)
at com.twitter.chill.EmptyScalaKryoInstantiator.newKryo(ScalaKryoInstantiator.scala:57)
at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:84)
at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:273)
[Back to the exit code 1 problem]
The explanation at http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/SparkStreaming-ExitCodeException-exitCode-13/m-p/32832 says:
I got the solution.
In my Spark Streaming application I had set SparkConf.setMaster("local[*]") and in spark-submit I was providing --master yarn-cluster.
So there was conflict in both the masters and it was remaining in ACCEPTED state and exiting.
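The conflict described in that quote can be looked for before a job is submitted. The following is only a minimal sketch: `find_hardcoded_masters` is a hypothetical helper (not part of Spark or oozie), and its regex only catches `setMaster` calls whose argument is a string literal.

```python
import re

# Matches calls such as .setMaster("local[*]") with a string-literal argument.
_SET_MASTER = re.compile(r"""\.setMaster\(\s*["']([^"']+)["']\s*\)""")


def find_hardcoded_masters(source):
    """Return (line_number, master) pairs for every literal setMaster call."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in _SET_MASTER.finditer(line):
            hits.append((number, match.group(1)))
    return hits
```

Passing it the text of the driver script lists every line that pins a master in code, which is what would clash with a `--master` given on the command line.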
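[Attempt 1] Remove setMaster("local[*]") and upload the job to oozie again
[Still fails with the same error]
[Attempt 2: try deleting the conflicting javax.servlet jar]
First, search under the oozie directory: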
$ find oozie -name javax*.*
oozie/hadooplib/share/hadoop/mapreduce/lib/javax.inject-1.jar
oozie/hadooplib/share/hadoop/yarn/lib/javax.inject-1.jar
oozie/oozie-4.3.0/oozie-server/webapps/oozie/WEB-INF/lib/javax.inject-1.jar
oozie/oozie-4.3.0/libext/javax.inject-1.jar
oozie/oozie-4.3.0/share/lib/hive/javax.inject-1.jar
oozie/oozie-4.3.0/share/lib/hive2/javax.inject-1.jar
oozie/oozie-4.3.0/share/lib/spark/javax.servlet-3.0.0.v201112011016.jar
oozie/oozie-4.3.0/lib/javax.inject-1.jar
Then search under the spark and hadoop directories as well:
$ find hadoop-2.7.2 -name javax*.*
hadoop-2.7.2/share/hadoop/mapreduce/lib/javax.inject-1.jar
hadoop-2.7.2/share/hadoop/yarn/lib/javax.inject-1.jar
[Under spark, nothing was found at all]
$ find spark-1.6.2-bin-hadoop2.6 -name javax*.*
$
[Following a tutorial, first delete javax.servlet-3.0.0.v201112011016.jar from the oozie sharelib on hdfs] http://blog.youkuaiyun.com/shuxue051/article/details/47256171
Restart oozie and run the job again
[Now the error is that javax.servlet cannot be found]
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NoClassDefFoundError: javax/servlet/FilterRegistration
[Search all of /usr/share again for javax.servlet]
share]$ find -name javax.servlet*.*
./presto-server-0.152/lib/javax.servlet-api-3.1.0.jar
./apache-hive-2.1.0-bin/lib/javax.servlet-3.0.0.v201112011016.jar
./sqoop-1.99.7-bin-hadoop200/server/lib/javax.servlet-api-3.1.0.jar
./oozie/oozie-4.3.0/share/lib/spark/javax.servlet-3.0.0.v201112011016.jar
./spark-2.0.0/jars/javax.servlet-api-3.1.0.jar
./apache-drill-1.8.0/jars/classb/javax.servlet-api-3.1.0.jar
Two versions show up:
javax.servlet-api-3.1.0.jar
javax.servlet-3.0.0.v201112011016.jar
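Searching by file name only finds jars that happen to be called javax.servlet. A jar with a different name can carry the same class, so it can help to search by class instead. A minimal sketch, assuming the jars are ordinary zip archives on the local file system; `find_jars_with_class` is a hypothetical helper name:

```python
import os
import zipfile


def find_jars_with_class(root, class_name):
    """Return sorted paths of jars under root that contain class_name."""
    # javax.servlet.FilterRegistration -> javax/servlet/FilterRegistration.class
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(".jar"):
                continue
            path = os.path.join(dirpath, filename)
            try:
                with zipfile.ZipFile(path) as jar:
                    if entry in jar.namelist():
                        matches.append(path)
            except zipfile.BadZipFile:
                # Skip files that are named *.jar but are not valid archives.
                continue
    return sorted(matches)
```

For example, `find_jars_with_class("/usr/share", "javax.servlet.FilterRegistration")` would list every jar under that tree that ships the class named in the error.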
[Copy the javax.servlet-api-3.1.0.jar version into the oozie sharelib on hdfs]
The program kept running and never stopped. After killing it, the following error turned up:
Exception in thread "dag-scheduler-event-loop" java.lang.NoSuchMethodError: com.esotericsoftware.kryo.Kryo.setInstantiatorStrategy(Lorg/objenesis/strategy/InstantiatorStrategy;)V
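According to the explanation at https://github.com/twitter/chill/issues/209 , this may be a version conflict.
[So javax.servlet-api-3.1.0.jar is probably not right either]
[Final solution]
Exit code 1 can come from countless different problems; to track it down you have to go to yarn and read the logs for the applicationId of the run.
For the problem hit this time, the yarn log showed: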
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.SecurityException: class "javax.servlet.FilterRegistration"'s signer information does not match signer information of other classes in the same package
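This is a version problem with javax.servlet.FilterRegistration: classes of the javax.servlet package are being loaded from jars whose signer information differs.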
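A signer mismatch of this kind typically shows up when one jar supplying a package is signed and another jar supplying the same package is not, or is signed by someone else. A minimal sketch for telling which jars carry signature files, assuming the usual layout where they sit directly under META-INF/; `signature_entries` is a hypothetical helper name:

```python
import zipfile

# Signature-related files that a signed jar keeps directly under META-INF/.
_SIGNATURE_SUFFIXES = (".SF", ".RSA", ".DSA", ".EC")


def signature_entries(jar_path):
    """Return the sorted META-INF signature file names found in a jar."""
    with zipfile.ZipFile(jar_path) as jar:
        names = jar.namelist()
    return sorted(
        name
        for name in names
        if name.upper().startswith("META-INF/")
        and name.count("/") == 1
        and name.upper().endswith(_SIGNATURE_SUFFIXES)
    )
```

An empty list means the jar is unsigned. Running it over the two javax.servlet jars found earlier is one way to check whether they differ in this respect.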
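[Step 1: replace javax.servlet-3.0.0.v201112011016.jar in the oozie sharelib on hdfs with javax.servlet-api-3.1.0.jar]
[After that the job runs without stopping, and the yarn log shows no problem (because the application is still running); what eventually surfaces is a problem with the Kryo package]
One option is to kill the application and then read the log.
Another is to open the yarn web UI and click through each running container to see its log page:
It shows: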
Exception in thread "dag-scheduler-event-loop" java.lang.NoSuchMethodError: com.esotericsoftware.kryo.Kryo.setInstantiatorStrategy(Lorg/objenesis/strategy/InstantiatorStrategy;)V
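The kill-then-read-the-log route mentioned above can be sketched as follows. `yarn application -kill` and `yarn logs -applicationId` are the standard YARN CLI commands; the three helper functions and the application id used in the usage note are hypothetical, and `yarn logs` is assumed to have aggregated logs available.

```python
import re

# YARN application ids look like application_<clusterTimestamp>_<sequence>.
_APP_ID = re.compile(r"^application_\d+_\d+$")
_JAVA_FAILURE = re.compile(r"(Exception|Error)\b")


def _check(application_id):
    if not _APP_ID.match(application_id):
        raise ValueError("not a YARN application id: %r" % application_id)
    return application_id


def yarn_kill_command(application_id):
    """Build the argv that kills a running application."""
    return ["yarn", "application", "-kill", _check(application_id)]


def yarn_logs_command(application_id):
    """Build the argv that fetches the aggregated logs of an application."""
    return ["yarn", "logs", "-applicationId", _check(application_id)]


def exception_lines(log_text):
    """Keep only the log lines that mention a Java exception or error class."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if _JAVA_FAILURE.search(line)
    ]
```

The argv lists can be passed to `subprocess.run(..., capture_output=True, text=True)`, for instance with an id such as `application_1_0001`, and the captured output fed to `exception_lines` to cut the log down to the lines worth reading.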
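[This one is truly maddening]
Looking at the oozie sharelib on hdfs, it does contain kryo-2.22.jar. Loading that jar into netbeans to inspect it,
the setInstantiatorStrategy method in the com.esotericsoftware.kryo.Kryo class looks like this: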
public void setInstantiatorStrategy(InstantiatorStrategy strategy) {
}
Then I downloaded a kryo-2.24.0.jar from the internet.
There, the setInstantiatorStrategy method in the com.esotericsoftware.kryo.Kryo class looks like this:
public void setInstantiatorStrategy(org.objenesis.strategy.InstantiatorStrategy strategy) {
}
WTF?!?!?!?!??!
InstantiatorStrategy???!?!?
org.objenesis.strategy.InstantiatorStrategy?!?!!?
And you can't recognize that as the same thing?!?!?!?
Then look at the imports at the top of the 2.22 version:
package com.esotericsoftware.kryo;
import com.esotericsoftware.kryo.factories.SerializerFactory;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import com.esotericsoftware.kryo.util.FastestStreamFactory;
import com.esotericsoftware.kryo.util.IdentityMap;
import com.esotericsoftware.kryo.util.IntArray;
import com.esotericsoftware.kryo.util.ObjectMap;
import com.esotericsoftware.shaded.org.objenesis.instantiator.ObjectInstantiator;
import com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy;
import java.util.ArrayList;
while the 2.24.0 version starts with these imports:
package com.esotericsoftware.kryo;
import com.esotericsoftware.kryo.factories.SerializerFactory;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import com.esotericsoftware.kryo.util.IdentityMap;
import com.esotericsoftware.kryo.util.IntArray;
import com.esotericsoftware.kryo.util.ObjectMap;
import java.util.ArrayList;
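The two import lists explain the NoSuchMethodError. The JVM looks a method up by name plus a descriptor built from the fully qualified parameter types, so a method taking the shaded InstantiatorStrategy is a different method from one taking org.objenesis.strategy.InstantiatorStrategy. A minimal sketch of how such a descriptor is formed, covering only primitives and plain class types (arrays are left out); the helper names are hypothetical:

```python
# JVM descriptor codes for void and the primitive types.
_PRIMITIVES = {
    "void": "V", "boolean": "Z", "byte": "B", "char": "C", "short": "S",
    "int": "I", "long": "J", "float": "F", "double": "D",
}


def type_descriptor(java_type):
    """Return the JVM descriptor of a primitive or fully qualified class."""
    if java_type in _PRIMITIVES:
        return _PRIMITIVES[java_type]
    return "L" + java_type.replace(".", "/") + ";"


def method_descriptor(parameter_types, return_type="void"):
    """Return the JVM descriptor of a method signature."""
    params = "".join(type_descriptor(t) for t in parameter_types)
    return "(" + params + ")" + type_descriptor(return_type)
```

`method_descriptor(["org.objenesis.strategy.InstantiatorStrategy"])` gives `(Lorg/objenesis/strategy/InstantiatorStrategy;)V`, which is the descriptor in the error message. The same call with `com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy` gives a different string, so the lookup fails against the 2.22 class.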
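?!@?#!?@¥?#@¥?@#¥?%¥#%…… Damn!!!!!
So 2.22 uses the shaded copy, com.esotericsoftware.shaded.org.objenesis.strategy.InstantiatorStrategy, while 2.24.0 uses org.objenesis.strategy.InstantiatorStrategy directly. Fine, so will swapping the jar fix it?
[And so, when you happily delete kryo-2.22.jar and upload kryo-2.24.0.jar in its place]
restart oozie and run the job once more, it reports: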
Exception in thread "dag-scheduler-event-loop" java.lang.NoClassDefFoundError: com/esotericsoftware/minlog/Log
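[Argh! The com.esotericsoftware.minlog.Log class exists only inside kryo-2.22.jar!]
So both kryo jars have to sit in the sharelib; one does not replace the other!
Should I be speechless? Or speechless? Or speechless?
It is like releasing a new version and removing a feature the old version had.
Then someone else builds something that needs features from both your new version and your old one.
For the features that exist in both, which one should the user pick? Are they identical?
Will they conflict?
And once a conflict does happen, how far will the problem spread?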
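One way to put numbers on those questions is to list the classes that more than one jar provides. A minimal sketch, assuming local copies of the jars; `classes_in_jar` and `duplicated_classes` are hypothetical helper names:

```python
import zipfile
from collections import defaultdict


def classes_in_jar(jar_path):
    """Return the set of fully qualified class names stored in a jar."""
    with zipfile.ZipFile(jar_path) as jar:
        return {
            name[: -len(".class")].replace("/", ".")
            for name in jar.namelist()
            if name.endswith(".class")
        }


def duplicated_classes(jar_paths):
    """Map each class provided by more than one jar to the jars providing it."""
    providers = defaultdict(list)
    for jar_path in jar_paths:
        for class_name in classes_in_jar(jar_path):
            providers[class_name].append(jar_path)
    return {
        class_name: sorted(paths)
        for class_name, paths in providers.items()
        if len(paths) > 1
    }
```

Calling `duplicated_classes(["kryo-2.22.jar", "kryo-2.24.0.jar"])` on local copies would show which classes both jars define. For each of those, which copy gets loaded depends on classpath order.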
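[Final solution]
1. Replace javax.servlet-3.0.0.v201112011016.jar in the oozie sharelib on hdfs with javax.servlet-api-3.1.0.jar.
2. Make sure the oozie spark sharelib on HDFS contains both kryo-2.22.jar and kryo-2.24.0.jar
(in this post that means: /user/oozie/share/lib/spark)
2, corrected. I was badly misleading people with the step above, so I am taking it back. Running the py job again did fail: because the two jars conflict, the Kryo class could not be found. The right answer is to download Minlog from the Kryo project page and put it into hdfs (instead of keeping kryo-2.22.jar):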
https://github.com/EsotericSoftware/kryo/blob/master/build/minlog-1.2.jar
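Put together, the corrected steps amount to two removals and three uploads. A minimal sketch that only builds the `hdfs dfs` command lines and runs nothing; `sharelib_fix_commands` is a hypothetical helper, the sharelib path is the one named in this post, and the three jars to upload are assumed to sit in the current local directory:

```python
# Sharelib directory as given in this post; adjust for other clusters.
SHARELIB = "/user/oozie/share/lib/spark"

REMOVE = ["javax.servlet-3.0.0.v201112011016.jar", "kryo-2.22.jar"]
ADD = ["javax.servlet-api-3.1.0.jar", "kryo-2.24.0.jar", "minlog-1.2.jar"]


def sharelib_fix_commands(sharelib=SHARELIB):
    """Return the hdfs dfs argv lists for the corrected fix, removals first."""
    commands = [["hdfs", "dfs", "-rm", sharelib + "/" + name] for name in REMOVE]
    commands += [["hdfs", "dfs", "-put", name, sharelib + "/"] for name in ADD]
    return commands
```

Printing `" ".join(cmd)` for each entry gives a script to review before running anything. A `-rm` entry fails if the file is already gone and a `-put` entry fails if the target already exists, so entries that have already been applied by hand can be dropped. As in the earlier attempts, oozie is restarted afterwards.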