Spark job fails: storage.DiskBlockObjectWriter: Uncaught exception while reverting partial writes

Posted 2019-05-07 00:41

I have a Spark (1.4.1) application, running on Yarn, that fails with the following executor log entry:

16/07/21 23:09:08 ERROR executor.CoarseGrainedExecutorBackend: Driver 9.4.136.20:55995 disassociated! Shutting down.
16/07/21 23:09:08 ERROR storage.DiskBlockObjectWriter: Uncaught exception while reverting partial writes to file /dfs1/hadoop/yarn/local/usercache/mitchus/appcache/application_1465987751317_1172/blockmgr-f367f43b-f4c8-4faf-a829-530da30fb040/1c/temp_shuffle_581adb36-1561-4db8-a556-c4ac0e6400ed
java.io.FileNotFoundException: /dfs1/hadoop/yarn/local/usercache/mitchus/appcache/application_1465987751317_1172/blockmgr-f367f43b-f4c8-4faf-a829-530da30fb040/1c/temp_shuffle_581adb36-1561-4db8-a556-c4ac0e6400ed (No such file or directory)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(FileOutputStream.java:270)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
    at org.apache.spark.storage.DiskBlockObjectWriter.revertPartialWritesAndClose(BlockObjectWriter.scala:189)
    at org.apache.spark.util.collection.ExternalSorter.spillToMergeableFile(ExternalSorter.scala:328)
    at org.apache.spark.util.collection.ExternalSorter.spill(ExternalSorter.scala:257)
    at org.apache.spark.util.collection.ExternalSorter.spill(ExternalSorter.scala:95)
    at org.apache.spark.util.collection.Spillable$class.maybeSpill(Spillable.scala:83)
    at org.apache.spark.util.collection.ExternalSorter.maybeSpill(ExternalSorter.scala:95)
    at org.apache.spark.util.collection.ExternalSorter.maybeSpillCollection(ExternalSorter.scala:240)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:220)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Any clues as to what might have gone wrong?

2 Answers
SAY GOODBYE
Answered 2019-05-07 01:16

The error is caused by the temp shuffle file being deleted. There are several possible reasons; the one I ran into was another executor being killed by YARN. When an executor is killed, a SHUT_DOWN signal is sent to the other executors, and the ShutdownHookManager then deletes all the temp files that were registered with it. That's why you see this error. So you may need to check the executor logs for ShutdownHookManager entries.
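
In miniature, the mechanism looks roughly like the sketch below (this is purely illustrative, not Spark's actual ShutdownHookManager code): a JVM shutdown hook deletes every registered temp file, so if it fires while a task is still spilling, the later write or revert hits the FileNotFoundException shown above.

```scala
import java.io.File
import scala.collection.mutable

object TempFileCleanupSketch {
  private val registered = mutable.Set.empty[File]

  // Each spill/temp file is registered for cleanup when it is created.
  def register(f: File): Unit = synchronized { registered += f }

  // Installed once per JVM; fires when the executor is told to shut down,
  // deleting temp files even if a task still holds a writer on them.
  sys.addShutdownHook {
    synchronized { registered.foreach(_.delete()) }
  }
}
```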

祖国的老花朵
Answered 2019-05-07 01:19

You can try increasing spark.yarn.executor.memoryOverhead.
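
Presumably the idea is that executors exceeding their YARN container memory limit get killed by YARN, which can trigger exactly the temp-file cleanup described in the other answer; raising the overhead gives each container more headroom. A minimal sketch of setting it in code (the app name and the 2048 MB value are placeholders to tune for your workload; the setting is in megabytes):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: raise the per-executor off-heap overhead that YARN accounts for.
val conf = new SparkConf()
  .setAppName("my-app")                                // placeholder app name
  .set("spark.yarn.executor.memoryOverhead", "2048")   // value in MB

val sc = new SparkContext(conf)
```

The same setting can also be passed at submit time with `--conf spark.yarn.executor.memoryOverhead=2048`.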
