Spark: java.io.IOException: No space left on device

Posted 2019-02-13 23:46

I am learning how to use Spark. I have a piece of code that inverts a matrix. It works when the order of the matrix is small, such as 100, but when the order is large, such as 2000, I get an exception like this:

15/05/10 20:31:00 ERROR DiskBlockObjectWriter: Uncaught exception while reverting partial writes to file /tmp/spark-local-20150510200122-effa/28/temp_shuffle_6ba230c3-afed-489b-87aa-91c046cadb22

java.io.IOException: No space left on device

In my program I have lots of lines like this:

val result1 = matrix.map(...).reduce(...)
val result2 = result1.map(...).reduce(...)
val result3 = matrix.map(...)

(Sorry, the full code is too long to post here.)

So I think that each of these statements makes Spark create new RDDs, and that my program creates so many RDDs that it runs out of space and throws the exception. I am not sure whether this is correct.

How can I delete the RDDs that I no longer need, such as result1 and result2?

I have tried rdd.unpersist(), but it doesn't help.

Tags: apache-spark rdd

2 answers

ら.Afraid

#2 · 2019-02-14 00:12

According to the error message you provided, there is no space left on your hard drive. However, this is not caused by RDD persistence but by the shuffle that you implicitly triggered when calling reduce.

Therefore, you should clean up your drive and free more space for your tmp folder.
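
To see how much room is actually left, a quick check along these lines may help. This is only a sketch: free_kb is a made-up helper name (not part of Spark), and it assumes the scratch files live under /tmp, as the path in the log above suggests.

```shell
# Print the free space, in KiB, on the filesystem that holds the given directory.
# free_kb is a hypothetical helper name, not part of Spark.
free_kb() {
  df -Pk "$1" | awk 'NR==2 {print $4}'
}

# /tmp is where the log above shows Spark writing its temp_shuffle files.
echo "Free space under /tmp: $(free_kb /tmp) KiB"

# List the largest leftover Spark scratch directories, if there are any.
du -sk /tmp/spark-* 2>/dev/null | sort -n | tail -n 5 || true
```

If the free space is small compared with the data being shuffled, stale spark-* directories left by earlier runs are a reasonable first thing to clean up.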


等我变得足够好

#3 · 2019-02-14 00:31

This happens because Spark creates temporary shuffle files under the /tmp directory of your local system. You can avoid the issue by setting the properties below in your Spark configuration files.

Set this property in spark-env.sh:

SPARK_JAVA_OPTS+=" -Dspark.local.dir=/mnt/spark,/mnt2/spark -Dhadoop.tmp.dir=/mnt/ephemeral-hdfs"

export SPARK_JAVA_OPTS
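
On Spark 1.0 and later, SPARK_JAVA_OPTS is deprecated, so the same idea is probably better expressed through SPARK_LOCAL_DIRS in spark-env.sh or through the spark.local.dir property. The sketch below reuses the directories from the answer above; it assumes they exist and are writable, and my-app.jar is only a placeholder name. On YARN the cluster manager's own local-directory setting takes precedence, so this may have no effect there.

```shell
# Assumption: /mnt/spark and /mnt2/spark (the paths from the answer above) exist,
# are writable by the Spark user, and sit on disks with enough free space.
export SPARK_LOCAL_DIRS=/mnt/spark,/mnt2/spark

# Alternatively, set it per application (my-app.jar is a placeholder):
#   spark-submit --conf spark.local.dir=/mnt/spark,/mnt2/spark my-app.jar
```

Listing several directories on different disks spreads the shuffle files across them.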

