spark-submit timeout: Executor heartbeat timed out after 123574 ms

Resolving Executor heartbeat timeouts on a Spark cluster

License: CC 4.0 BY-SA

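When a job submitted to a Spark cluster fails with an Executor heartbeat timeout, a heavy computational load may be the cause. One workaround is to extend the timeout by adding the configuration parameter `--conf spark.network.timeout=10000000` at submission, following a solution posted on Stack Overflow.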

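Jobs submitted to the Spark cluster kept timing out. The likely cause is that the computation is too heavy, so the executor does not report a heartbeat in time: the 123574 ms in the log is just over Spark's default `spark.network.timeout` of 120s.
Fix: add the parameter `--conf spark.network.timeout=10000000` when submitting. Spark reads a value without a unit suffix as seconds, so this setting is far longer than any job should need and effectively disables the timeout.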

spark-submit \
  --conf spark.network.timeout=10000000 \
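
A more conservative variant is sketched below: it gives the timeout an explicit unit and sets `spark.executor.heartbeatInterval` alongside it, since the Spark documentation asks for the heartbeat interval to stay well below `spark.network.timeout`. The values `600s` and `60s` and the application name `your_app.py` are assumptions chosen for illustration, not taken from the original post; tune them for your workload.

```shell
# Hypothetical values for illustration only.
NETWORK_TIMEOUT=600s      # timeout for network interactions, including executor heartbeats
HEARTBEAT_INTERVAL=60s    # how often executors report to the driver; keep well below the timeout

# Collect the timeout-related flags in one place so they are easy to review.
TIMEOUT_CONF="--conf spark.network.timeout=${NETWORK_TIMEOUT} --conf spark.executor.heartbeatInterval=${HEARTBEAT_INTERVAL}"

# Print the command for review; remove the leading `echo` to actually submit.
# ${TIMEOUT_CONF} is left unquoted on purpose so it splits into separate arguments.
# your_app.py is a placeholder for the real application.
echo spark-submit ${TIMEOUT_CONF} your_app.py
```

If the timeouts persist with a generous but finite value, the job is more likely short of memory or CPU than short of time, and raising the timeout further only hides that.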

Reference:
https://stackoverflow.com/questions/37260230/spark-cluster-full-of-heartbeat-timeouts-executors-exiting-on-their-own

2018-10-29 18:07:02 ERROR TaskSchedulerImpl:70 - Lost executor driver on localhost: Executor heartbeat timed out after 123574 ms
2018-10-29 18:07:02 ERROR TaskSetManager:70 - Task 3 in stage 110.0 failed 1 times; aborting job
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 110.0 failed 1 times, most recent failure: Lost task 3.0 in stage 110.0 (TID 496, localhost, executor driver): ExecutorLostFailure (executor driver exited caused by one of the running tasks) Reason: Executor
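
To see how far past the limit the executor was, the elapsed time can be pulled out of the log line and compared with the timeout. This is a minimal sketch that assumes the default `spark.network.timeout` of 120s was in effect (the post does not say which value was configured); the variable names are made up for the example.

```shell
# Log line copied from the error output above.
LOG_LINE='2018-10-29 18:07:02 ERROR TaskSchedulerImpl:70 - Lost executor driver on localhost: Executor heartbeat timed out after 123574 ms'

# Extract the number of milliseconds that follows "timed out after".
ELAPSED_MS=$(printf '%s\n' "$LOG_LINE" | sed -n 's/.*timed out after \([0-9][0-9]*\) ms.*/\1/p')

# Assumption: the default timeout of 120s (120000 ms) was in effect.
DEFAULT_TIMEOUT_MS=120000

# How long past the threshold the last heartbeat was when the driver noticed.
OVERSHOOT_MS=$((ELAPSED_MS - DEFAULT_TIMEOUT_MS))
echo "last heartbeat was ${ELAPSED_MS} ms ago, ${OVERSHOOT_MS} ms past the timeout"
```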