Spark: Failed to create local dir

License: CC 4.0 BY-SA

Original link: https://www.jianshu.com/p/e87d2d3354bd

I recently ran into the following exception out of the blue:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 271.0 failed 1 times, most recent failure: Lost task 0.0 in stage 271.0 (TID 544, localhost): java.io.IOException: Failed to create local dir in /tmp/blockmgr-4223dca8-7355-4ab2-98b9-87e763c7becd/1d.
        at org.apache.spark.storage.DiskBlockManager.getFile(DiskBlockManager.scala:87)
        at org.apache.spark.storage.DiskBlockManager.getFile(DiskBlockManager.scala:97)
        at org.apache.spark.shuffle.IndexShuffleBlockResolver.getIndexFile(IndexShuffleBlockResolver.scala:58)
        at org.apache.spark.shuffle.IndexShuffleBlockResolver.writeIndexFileAndCommit(IndexShuffleBlockResolver.scala:140)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:127)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:87)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
        at org.apache.spark.scheduler.Task.run(Task.scala:107)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:277)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

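Cause analysis:
1 "Failed to create local dir": when does Spark create temporary files?
During a shuffle, the map output is written to the local disk through DiskBlockManager. Cached blocks go to the memory store first and are written to temporary files on disk when the memory store runs out of space. In both cases the files live in a two-level directory, such as blockmgr-4223dca8-7355-4ab2-98b9-87e763c7becd/1d in the exception above.
2 What is a shuffle?
Spark is a parallel computing framework: one job is split into many tasks that run on multiple nodes. The input of a reduce task may be spread across several nodes, so a shuffle is needed to bring all the input for each reduce task together.
3 How large is the memory store, and when does Spark fall back to the disk store?
The size of the memory store depends on spark.executor.memory; by default it is roughly spark.executor.memory * 0.6. The exact fraction depends on the Spark version and its memory settings (for example spark.memory.fraction). Data that no longer fits in the memory store goes to the disk store.
4 Temporary files are created under /tmp by default. How can this be changed?
Set SPARK_LOCAL_DIRS in conf/spark-env.sh, or set spark.local.dir in the program. Several paths can be given, separated by commas; putting them on different disks improves I/O throughput.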

SPARK_LOCAL_DIRS:
Directory to use for "scratch" space in Spark, including map output files and RDDs that get stored on disk. This should be on a fast, local disk in your system. It can also be a comma-separated list of multiple directories on different disks.
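
To make the two-level directory and the benefit of several scratch directories more concrete, here is a minimal Python sketch of how a block file name might be mapped to a path. It is modelled on my reading of DiskBlockManager.getFile and is not Spark's actual code: the function names are hypothetical, the value 64 for sub-directories per local dir is an assumption (I believe it is the default of spark.diskStore.subDirectories), and the hash only matches Java's String.hashCode for ASCII names.

```python
import os
import tempfile

# Assumption: number of sub-directories Spark creates under each local dir.
SUB_DIRS_PER_LOCAL_DIR = 64


def java_string_hash(s):
    """Emulate Java's String.hashCode() as a signed 32-bit int (ASCII names only)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - (1 << 32) if h >= (1 << 31) else h


def non_negative_hash(s):
    """Map the hash to a non-negative value; the minimum 32-bit int becomes 0."""
    h = java_string_hash(s)
    return 0 if h == -(1 << 31) else abs(h)


def get_file(local_dirs, filename):
    """Pick a local dir and a two-hex-digit sub-directory for a block file."""
    h = non_negative_hash(filename)
    # The hash spreads files over all configured local dirs ...
    dir_id = h % len(local_dirs)
    # ... and then over the sub-directories inside the chosen one (e.g. "1d").
    sub_dir_id = (h // len(local_dirs)) % SUB_DIRS_PER_LOCAL_DIR
    sub_dir = os.path.join(local_dirs[dir_id], "%02x" % sub_dir_id)
    if not os.path.isdir(sub_dir):
        try:
            # Plain mkdir: this fails if the parent directory has disappeared,
            # is not writable, or the disk is full.
            os.mkdir(sub_dir)
        except OSError as exc:
            raise IOError("Failed to create local dir in %s." % sub_dir) from exc
    return os.path.join(sub_dir, filename)


if __name__ == "__main__":
    # Two throwaway directories stand in for two entries of SPARK_LOCAL_DIRS.
    roots = [tempfile.mkdtemp(prefix="blockmgr-") for _ in range(2)]
    print(get_file(roots, "shuffle_0_0_0.index"))
```

Under these assumptions, the exception appears as soon as the sub-directory cannot be created, which is why a missing parent directory, a full disk, or wrong permissions all surface as the same message.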

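5 Make sure the disk has enough free space and that Spark has read and write permission on the directory. Size the disk according to your workload.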
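
The space and permission checks can be automated before a job is submitted. The following is a rough sketch using only the Python standard library; the function names and the 1 GiB default threshold are hypothetical choices for illustration, not part of Spark.

```python
import os
import shutil
import tempfile


def usable_local_dirs(candidates, min_free_bytes):
    """Return the candidates that exist, are writable, and have enough free space."""
    usable = []
    for path in candidates:
        if not os.path.isdir(path):
            continue
        if shutil.disk_usage(path).free < min_free_bytes:
            continue
        try:
            # Creating a real file tests write permission more reliably than os.access.
            with tempfile.TemporaryFile(dir=path):
                pass
        except OSError:
            continue
        usable.append(path)
    return usable


def local_dirs_setting(candidates, min_free_bytes=1 << 30):
    """Build the comma-separated value for SPARK_LOCAL_DIRS / spark.local.dir."""
    dirs = usable_local_dirs(candidates, min_free_bytes)
    if not dirs:
        raise RuntimeError("no usable scratch directory among: " + ", ".join(candidates))
    return ",".join(dirs)
```

The returned string could be exported as SPARK_LOCAL_DIRS in conf/spark-env.sh or passed as spark.local.dir, for example with SparkConf().set("spark.local.dir", value) in PySpark. Note that the cluster manager may override spark.local.dir with its own environment settings.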