Spark: scala.util.control.BreakControl

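My Spark application failed with org.apache.spark.SparkException: a task in a stage failed, and the root cause was scala.util.control.BreakControl. The trace pointed to a `break` call in code that ran as part of a task. `break` is implemented by throwing BreakControl, and only an enclosing `breakable` on the same call stack catches it. When `break` runs inside a task closure without one, the exception escapes and Spark reports it as a task failure. It took a full day of troubleshooting to trace the problem back to that `break`.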

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 12.0 failed 1 times, most recent failure: Lost task 0.0 in stage 12.0 (TID 18, localhost, executor driver): scala.util.control.BreakControl

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1703)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1691)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1690)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1690)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:873)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:873)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:873)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1924)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1873)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1862)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:682)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2047)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2068)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2087)
    at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1368)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:367)
    at org.apache.spark.rdd.RDD.take(RDD.scala:1341)
    at com.kingyea.datac.process.Explorations$$anonfun$main$2.apply(Explorations.scala:99)
    at com.kingyea.datac.process.Explorations$$anonfun$main$2.apply(Explorations.scala:83)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at com.kingyea.datac.process.Explorations$.main(Explorations.scala:83)
    at com.kingyea.datac.process.Explorations.main(Explorations.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:920)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:195)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:220)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:140)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: scala.util.control.BreakControl

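I went through a lot of material while debugging this.

At first I suspected the action itself, on the theory that it was pulling all of the data back to the Driver. After checking, that was not what caused the error.

Other sources blamed an oversized data volume. I ruled that out in the program as well.

The actual cause was a loop exit in my code, namely a call to break().

My first guess was that break simply cannot decide where to exit once the work is distributed across a cluster. The more likely explanation is narrower: break() works by throwing scala.util.control.BreakControl, and that exception is only caught by a breakable block that encloses the call on the same thread's call stack. A break() inside a closure that Spark runs as a task presumably has no such block around it at the point where the task executes, so the exception propagates out and fails the task. The trace above shows `executor driver` on `localhost`, so this happens in local mode too, not only on a cluster. Finding this took me an entire day!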
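
To make the mechanism concrete, here is a minimal sketch in plain Scala, with no Spark dependency. The names `BreakControlDemo`, `runAsTask`, `breakAcrossThreads`, `breakInsideTask`, and `withoutBreak` are made up for illustration, and `runAsTask` is only a stand-in for a Spark task (it runs a closure on another thread). It is not the code from my job, which I have not reproduced here.

```scala
import scala.util.control.Breaks.{break, breakable}

object BreakControlDemo {

  // Hypothetical stand-in for a Spark task: runs `body` on its own thread
  // and returns whatever Throwable escaped from it, if any.
  def runAsTask(body: () => Unit): Option[Throwable] = {
    var failure: Option[Throwable] = None
    val worker = new Thread(new Runnable {
      def run(): Unit =
        try body() catch { case e: Throwable => failure = Some(e) }
    })
    worker.start()
    worker.join() // join() makes the worker's write to `failure` visible here
    failure
  }

  // breakable sits on the caller's thread, break() fires on the task thread:
  // nothing on the task's call stack catches BreakControl, so the task fails.
  def breakAcrossThreads(data: Seq[Int]): Option[Throwable] = {
    var failure: Option[Throwable] = None
    breakable {
      failure = runAsTask(() => data.foreach(x => if (x > 2) break()))
    }
    failure
  }

  // breakable and break() live in the same closure, so the exception is
  // caught on the thread that threw it and the task finishes normally.
  def breakInsideTask(data: Seq[Int]): Option[Throwable] =
    runAsTask(() => breakable { data.foreach(x => if (x > 2) break()) })

  // No break at all: keep elements until the first one fails the predicate.
  def withoutBreak(data: Seq[Int]): Seq[Int] = data.takeWhile(_ <= 2)

  def main(args: Array[String]): Unit = {
    val data = Seq(1, 2, 3, 4, 5)
    println(breakAcrossThreads(data).map(_.getClass.getName))
    println(breakInsideTask(data))
    println(withoutBreak(data))
  }
}
```

In a real job the break-free form is usually the easiest fix: inside an RDD closure the same idea can be written with iterator methods, for example `rdd.mapPartitions(_.takeWhile(...))`, or with `find` / `exists` when only the first match matters. If `break` has to stay, both `breakable` and `break()` should sit inside the same closure.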
