Spark - Fixing a Serialization Problem

This post looks at a serialization problem hit in a Spark project: serializing an ArrayList of Unit objects that were not marked serializable raised a NotSerializableException. Switching to a broadcast variable solved the problem, and the broadcast data was then serialized without issue.


In the project, one class had a serialization problem:

Caused by: java.io.NotSerializableException: entity.Unit
Serialization stack:
	- object not serializable (class: entity.Unit, value: Unit(id=1, name=m2))
	- writeObject data (class: java.util.ArrayList)
	- object (class java.util.ArrayList, [Unit(id=1, name=m2), Unit(id=7, name=m3), Unit(id=8, name=kg), Unit(id=37, name=m)])
	- field (class: service.impl.ORToBSMap, name: units, type: interface java.util.List)
	- object (class service.impl.ORToBSMap, tz.lion.service.impl.ORToBSMap@74130456)
	- field (class: org.apache.spark.sql.execution.MapElementsExec, name: func, type: class java.lang.Object)
	- object (class org.apache.spark.sql.execution.MapElementsExec, MapElements tz.lion.service.impl.ORToBSMap@74130456, obj#37: tz.lion.entity.BuildingStruct
+- DeserializeToObject createexternalrow(floor#21.toString, struct#24.toString, StructField(floor,StringType,true), StructField(struct,StringType,true)), obj#36: org.apache.spark.sql.Row
   +- HashAggregate(keys=[floor#21, struct#24], functions=[], output=[floor#21, struct#24])
      +- Exchange hashpartitioning(floor#21, struct#24, 200)
         +- *(1) HashAggregate(keys=[floor#21, struct#24], functions=[], output=[floor#21, struct#24])
            +- *(1) Project [floor#21, struct#24]
               +- *(1) Sort [id#22L ASC NULLS FIRST], true, 0
                  +- Exchange rangepartitioning(id#22L ASC NULLS FIRST, 200)
                     +- LocalTableScan [floor#21, id#22L, struct#24]
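
Judging from the stack trace, the map function service.impl.ORToBSMap holds a units field of type List<Unit>, and entity.Unit does not implement java.io.Serializable, so Java serialization of the function object fails. Below is a minimal sketch of what that failing pattern likely looks like; the MapFunction signature, constructor, and call body are assumptions reconstructed from the stack trace, and Unit and BuildingStruct stand for the project's own entity classes:

import java.util.List;

import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Row;

// Hypothetical reconstruction of the failing pattern.
public class ORToBSMap implements MapFunction<Row, BuildingStruct> {

    // Captured as a plain field: when Spark ships this function to the executors it
    // walks the object graph with Java serialization and fails on the Unit elements.
    private final List<Unit> units;

    public ORToBSMap(List<Unit> units) {
        this.units = units;
    }

    @Override
    public BuildingStruct call(Row row) {
        // ... build a BuildingStruct from the row, looking units up in the list ...
        return new BuildingStruct();
    }
}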

The SparkConf configuration was correct:

.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
.registerKryoClasses(new Class[]{Unit.class})

Tracing the Kryo log showed the class was registered successfully:

00:04 TRACE: [kryo] Register class ID 39: entity.Unit (com.esotericsoftware.kryo.serializers.FieldSerializer)

It turned out that broadcasting the data made the error go away. The reason is that Spark always serializes the task closure itself with plain Java serialization, regardless of the spark.serializer setting, so a List<Unit> held as a field of the map function fails with java.io.NotSerializableException even though Unit is registered with Kryo; the value of a broadcast variable, on the other hand, is serialized with the configured Kryo serializer.

        final JavaSparkContext sc = new JavaSparkContext(session.sparkContext());

        final Broadcast<List<Unit>> bcUnit = sc.broadcast(unitRepository.loadAll());
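
With the list broadcast, the function only needs to hold the Broadcast handle and call value() on the executor side. Here is a minimal sketch of how ORToBSMap could be rewritten around the broadcast; the constructor and call body are again assumptions, not the project's actual code:

import java.util.List;

import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.broadcast.Broadcast;
import org.apache.spark.sql.Row;

public class ORToBSMap implements MapFunction<Row, BuildingStruct> {

    // The Broadcast handle itself is serializable; the unit list travels as broadcast
    // data and is (de)serialized with the configured Kryo serializer.
    private final Broadcast<List<Unit>> bcUnit;

    public ORToBSMap(Broadcast<List<Unit>> bcUnit) {
        this.bcUnit = bcUnit;
    }

    @Override
    public BuildingStruct call(Row row) {
        List<Unit> units = bcUnit.value(); // materialized locally on the executor
        // ... map the row to a BuildingStruct using the unit list ...
        return new BuildingStruct();
    }
}

The function is then passed to Dataset.map as before, for example ds.map(new ORToBSMap(bcUnit), Encoders.bean(BuildingStruct.class)).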

Tracing the Kryo log again, serialization now went through normally:

00:04 DEBUG: [kryo] Write: Unit(id=1, name=m2)
00:04 TRACE: [kryo] FieldSerializer.write fields of class: entity.Unit
00:04 TRACE: [kryo] Write field: id (entity.Unit) pos=25
00:04 TRACE: [kryo] Write: 1
00:04 TRACE: [kryo] Write field: name (entity.Unit) pos=27
00:04 TRACE: [kryo] Write initial object reference 2: m2
00:04 TRACE: [kryo] Write: m2