I am learning how to use Spark. I have a piece of code that inverts a matrix, and it works when the order of the matrix is small, like 100. But when the order of the matrix is large, like 2000, I get an exception like this:
15/05/10 20:31:00 ERROR DiskBlockObjectWriter: Uncaught exception while reverting partial writes to file /tmp/spark-local-20150510200122-effa/28/temp_shuffle_6ba230c3-afed-489b-87aa-91c046cadb22
java.io.IOException: No space left on device
In my program I have lots of lines like this:
val result1=matrix.map(...).reduce(...)
val result2=result1.map(...).reduce(...)
val result3=matrix.map(...)
(Sorry, the full code is too long to post here.)
So I think that when I do this, Spark creates new RDDs, and my program creates so many RDDs that I get the exception. I am not sure whether this reasoning is correct.
How can I delete the RDDs that I no longer need, such as result1 and result2?
I have tried rdd.unpersist(), but it doesn't work.
According to the error message you provided, your hard drive has run out of disk space. However, this is not caused by RDD persistence but by the shuffle that you implicitly trigger when calling reduce. Therefore, you should free up space on the drive that holds your tmp folder.
This is because Spark creates temporary shuffle files under the /tmp directory of your local system. You can avoid this issue by setting the property below in your Spark conf files.
Set this property in spark-env.sh.
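The property itself is not shown in this answer. As a rough sketch only, and assuming the goal is to move Spark's scratch space off /tmp, one setting that controls where shuffle files are written is spark.local.dir, which can also be set from application code through SparkConf. The path /data/spark-tmp, the app name, and the local[*] master below are hypothetical placeholders, not values from the original post.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object LocalDirExample {
  def main(args: Array[String]): Unit = {
    // Hypothetical path: use a directory on a volume with enough free space.
    val scratchDir = "/data/spark-tmp"

    val conf = new SparkConf()
      .setAppName("matrix-inverse")       // placeholder application name
      .setMaster("local[*]")              // assumption: running in local mode
      .set("spark.local.dir", scratchDir) // scratch space for shuffle and spill files

    val sc = new SparkContext(conf)
    try {
      // reduceByKey forces a shuffle, so its temporary files should be
      // written under scratchDir instead of /tmp.
      val counts = sc.parallelize(1 to 1000)
        .map(i => (i % 10, 1))
        .reduceByKey(_ + _)
        .collect()
      println(s"distinct keys: ${counts.length}")
    } finally {
      sc.stop()
    }
  }
}
```

Note that on a cluster this setting can be overridden by environment variables set by the cluster manager, such as SPARK_LOCAL_DIRS in standalone mode or LOCAL_DIRS on YARN, so the spark-env.sh route mentioned above may still be needed there.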