Serialization buffer overflow
Error: Kryo serialization failed with a buffer overflow (available: 0, required: 21). To avoid it, increase spark.kryoserializer.buffer.max. The full exception:
Caused by: org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 21. To avoid this, increase spark.kryoserializer.buffer.max value.
    at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:299)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:240)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
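import org.apache.spark.SparkConf
// Raise the upper limit of the Kryo buffer; "128m" states the unit (MiB) explicitly
val sparkConf = new SparkConf().setAppName(Constants.SPARK_NAME_APP)
  .set("spark.kryoserializer.buffer.max", "128m")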
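Cause analysis: RDD extends scala.AnyRef with scala.Serializable, and the data a job produces is serialized as it moves between executors and the driver. Kryo raises this error when a single object it has to serialize does not fit into its buffer, whose upper limit is spark.kryoserializer.buffer.max (documented default 64m, must stay below 2048m). In the stack trace above the failure happens inside Executor$TaskRunner.run. So when a job uses textFile, reads table data, or otherwise creates many new RDDs, DataFrames, and Datasets that carry large records, remember to increase this value.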
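The fix can be sketched as a small, self-contained configuration object. This is only an illustration under assumptions: the object name KryoBufferConfig, the helper buildConf, and the application name "kryo-buffer-demo" are hypothetical (the last stands in for Constants.SPARK_NAME_APP), and 128m is an example value, not a recommendation. It also sets spark.serializer explicitly, on the assumption that the job is meant to use Kryo throughout.

```scala
import org.apache.spark.SparkConf

object KryoBufferConfig {
  // Hypothetical application name, standing in for Constants.SPARK_NAME_APP.
  val AppName = "kryo-buffer-demo"

  // Builds a SparkConf that uses Kryo and raises the buffer ceiling.
  // maxBuffer must be larger than the largest single object that gets
  // serialized, and must stay below 2048m.
  def buildConf(maxBuffer: String = "128m"): SparkConf =
    new SparkConf()
      .setAppName(AppName)
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryoserializer.buffer.max", maxBuffer)

  def main(args: Array[String]): Unit = {
    val conf = buildConf()
    // Print the effective setting to confirm it was applied.
    println(conf.get("spark.kryoserializer.buffer.max"))
  }
}
```

The same property can also be supplied at submit time with spark-submit --conf spark.kryoserializer.buffer.max=128m, which avoids hard-coding the value in the application.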