Resolving jar conflicts in Spark jobs

This post describes an UnsatisfiedLinkError caused by a Snappy version conflict on a Spark cluster, and shows how to avoid the conflict by adjusting the classpath.

Recently, while deploying a program on Spark that uses logback to send logs to graylog2, I ran into the following exception:

java.lang.UnsatisfiedLinkError: org.xerial.snappy.SnappyNative.uncompressedLength(Ljava/lang/Object;II)I
	at org.xerial.snappy.SnappyNative.uncompressedLength(Native Method)
	at org.xerial.snappy.Snappy.uncompressedLength(Snappy.java:541)
	at org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:350)
	at org.xerial.snappy.SnappyInputStream.rawRead(SnappyInputStream.java:158)
	at org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:142)
	at com.esotericsoftware.kryo.io.Input.fill(Input.java:140)
	at com.esotericsoftware.kryo.io.Input.require(Input.java:155)
	at com.esotericsoftware.kryo.io.Input.readInt(Input.java:337)
	at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:109)
	at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721)
	at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:228)
	at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:217)
	at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:178)
	at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1206)
	at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:165)
	at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
	at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
	at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:88)
	at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)

A second exception:

java.lang.UnsatisfiedLinkError: org.xerial.snappy.SnappyNative.maxCompressedLength(I)I
	at org.xerial.snappy.SnappyNative.maxCompressedLength(Native Method)
	at org.xerial.snappy.Snappy.maxCompressedLength(Snappy.java:316)
	at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:79)
	at org.apache.spark.io.SnappyCompressionCodec.compressedOutputStream(CompressionCodec.scala:156)
	at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$4.apply(TorrentBroadcast.scala:200)
	at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$4.apply(TorrentBroadcast.scala:200)
	at scala.Option.map(Option.scala:145)
	at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:200)
	at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:102)
	at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:85)
	at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
	at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63)
	at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1334)

 

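Cause:

The application depends on snappy-java 1.1.1.6, but the jar that gets loaded by default is the copy bundled with the cluster's Spark installation:

CDH-5.10.2-1.cdh5.10.2.p0.5/lib/hadoop/lib/snappy-java-1.0.4.1.jar

Version 1.0.4.1 does not contain the methods named in the stack traces above, so the calls fail with UnsatisfiedLinkError.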
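
To confirm that more than one snappy-java version is visible on a node, it can help to list the copies under the relevant directories. The following is a minimal sketch, assuming a POSIX shell with `find`. The function name `list_snappy_jars` is illustrative, and the directories in the usage note are placeholders, not paths taken from the report above.

```shell
#!/bin/sh
# Hypothetical helper: list every snappy-java jar under the given directories.
list_snappy_jars() {
    # No -type filter, so symlinked jars are reported as well.
    # sort keeps the output order stable across runs.
    find "$@" -name 'snappy-java-*.jar' 2>/dev/null | sort
}
```

For example, `list_snappy_jars /path/to/cluster/lib ./lib` prints one path per line. If the output shows both a 1.0.x and a 1.1.x jar, the application is a candidate for the conflict described here.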

 

Tips:

To check which version is actually loaded, add the following option to spark-submit:

--driver-java-options -verbose:class
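
The `-verbose:class` output is long, so filtering it down to the Snappy classes makes the source jar easier to spot. The following is a minimal sketch. The function name `snappy_load_lines` is illustrative, and the exact line format depends on the JVM version, so the filter matches only on the package name.

```shell
#!/bin/sh
# Hypothetical helper: keep only the class-loading lines that mention snappy-java.
# Reads the JVM's -verbose:class output on stdin.
snappy_load_lines() {
    grep 'org\.xerial\.snappy'
}
```

A possible invocation is `spark-submit --driver-java-options -verbose:class <your other options> 2>&1 | snappy_load_lines`, where `<your other options>` stands for the rest of the command. On a Java 8 JVM, each matching line names the jar the class was loaded from. The flag above covers the driver only. The same JVM flag can be passed to executors through `spark.executor.extraJavaOptions`, in which case the output ends up in the executor logs.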

 

Solution:

Add the following options to spark-submit:

--jars yourSnappyJar   \
--conf "spark.driver.extraClassPath=snappy-java-version.jar" \
--conf "spark.executor.extraClassPath=snappy-java-version.jar" \

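Setting spark.{driver,executor}.extraClassPath explicitly places the required jar at the front of the classpath, which resolves the conflict when the Spark classpath and the user classpath contain the same library.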
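
The three options can be wrapped in a small helper so that the jar path is written only once. This is a sketch under stated assumptions, not a definitive recipe. The function name `submit_with_snappy` and the `SPARK_SUBMIT` override variable are hypothetical. Like the options above, it puts only the jar's file name in extraClassPath; whether a bare file name resolves depends on the deploy mode and working directory, which this post does not specify.

```shell
#!/bin/sh
# Hypothetical wrapper: run spark-submit with the given snappy-java jar
# shipped via --jars and prepended to the driver and executor classpaths.
# Usage: submit_with_snappy <path-to-snappy-jar> <remaining spark-submit args>
submit_with_snappy() {
    snappy_jar=$1
    shift
    # Use only the file name in extraClassPath, as in the options above.
    snappy_name=$(basename "$snappy_jar")
    # SPARK_SUBMIT is an assumed override for a differently named launcher.
    "${SPARK_SUBMIT:-spark-submit}" \
        --jars "$snappy_jar" \
        --conf "spark.driver.extraClassPath=$snappy_name" \
        --conf "spark.executor.extraClassPath=$snappy_name" \
        "$@"
}
```

For example, `submit_with_snappy ./lib/snappy-java-1.1.1.6.jar --class com.example.Main app.jar`, where the class and application jar names are placeholders.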

 

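The method in the second reference below also solves the problem, but it changes Spark's global environment and can easily break other applications, so it is not recommended.

For a summary of Spark's package dependency relationships, see: https://blog.youkuaiyun.com/adorechen/article/details/80110272#summary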

 

References:

https://github.com/broadinstitute/gatk/issues/1873

http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Override-libraries-for-spark/td-p/32125
