In brief:
- How can I find out which Java/Scala thread has locked a file? I know that some class/thread in the JVM has locked a concrete file (it overlaps a region of the file), but I don't know how. Is it possible to find out which class/thread is doing this when I stop the application at a breakpoint?
The following code throws an OverlappingFileLockException:
FileChannel.open(Paths.get("thisfile"), StandardOpenOption.APPEND).tryLock().isValid();
FileChannel.open(Paths.get("thisfile"), StandardOpenOption.APPEND).tryLock().isShared();
- How can Java/Scala (in this case Spark) lock this file? I know how to lock files using java.nio.channels, but I did not find the corresponding invocation in Spark's GitHub repository. A minimal sketch of what such an invocation looks like follows this list.
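For reference, this is my own illustration (not code taken from Spark or Hive) of the kind of java.nio.channels call that would produce such a lock; the file name someFile.lock is just a placeholder:

import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class LockExample {
    public static void main(String[] args) throws Exception {
        // Open the file for writing and take an exclusive lock on it.
        // As long as 'lock' is not released and 'channel' is not closed,
        // Windows will refuse to delete the file.
        FileChannel channel = FileChannel.open(
                Paths.get("someFile.lock"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock lock = channel.lock();
        System.out.println("locked: " + lock.isValid());
    }
}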
More about my problem: when I run Spark on Windows with Hive, it works correctly, but every time Spark shuts down it cannot delete one temporary directory (the other temporary directory before it is deleted correctly) and outputs the following exception:
2015-12-11 15:04:36 [Thread-13] INFO org.apache.spark.SparkContext - Successfully stopped SparkContext
2015-12-11 15:04:36 [Thread-13] INFO o.a.spark.util.ShutdownHookManager - Shutdown hook called
2015-12-11 15:04:36 [Thread-13] INFO o.a.spark.util.ShutdownHookManager - Deleting directory C:\Users\MyUser\AppData\Local\Temp\spark-9d564520-5370-4834-9946-ac5af3954032
2015-12-11 15:04:36 [Thread-13] INFO o.a.spark.util.ShutdownHookManager - Deleting directory C:\Users\MyUser\AppData\Local\Temp\spark-42b70530-30d2-41dc-aff5-8d01aba38041
2015-12-11 15:04:36 [Thread-13] ERROR o.a.spark.util.ShutdownHookManager - Exception while deleting Spark temp dir: C:\Users\MyUser\AppData\Local\Temp\spark-42b70530-30d2-41dc-aff5-8d01aba38041
java.io.IOException: Failed to delete: C:\Users\MyUser\AppData\Local\Temp\spark-42b70530-30d2-41dc-aff5-8d01aba38041
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:884) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:63) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:60) [spark-core_2.11-1.5.0.jar:1.5.0]
at scala.collection.mutable.HashSet.foreach(HashSet.scala:78) [scala-library-2.11.6.jar:na]
at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:60) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:264) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
at scala.util.Try$.apply(Try.scala:191) [scala-library-2.11.6.jar:na]
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:216) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54) [hadoop-common-2.4.1.jar:na]
I searched the internet, but found only in-progress issues in Spark (one user tried to submit a patch, but it is not working, if I understand the comments on that pull request correctly) and some unanswered questions on SO.
It looks like the problem is in the deleteRecursively() method of Utils.scala. I set a breakpoint in that method and rewrote it in Java:
import java.io.File;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Java transcription of the deleteRecursively() logic from Spark's Utils.scala
public class Test {
    public static void deleteRecursively(File file) {
        if (file != null) {
            try {
                if (file.isDirectory()) {
                    for (File child : listFilesSafely(file)) {
                        deleteRecursively(child);
                    }
                    //ShutdownHookManager.removeShutdownDeleteDir(file)
                }
            } finally {
                if (!file.delete()) {
                    if (file.exists()) {
                        throw new RuntimeException("Failed to delete: " + file.getAbsolutePath());
                    }
                }
            }
        }
    }

    private static List<File> listFilesSafely(File file) {
        if (file.exists()) {
            File[] files = file.listFiles();
            if (files == null) {
                throw new RuntimeException("Failed to list files for dir: " + file);
            }
            return Arrays.asList(files);
        } else {
            return Collections.emptyList();
        }
    }

    public static void main(String[] args) {
        deleteRecursively(new File("C:\\Users\\MyUser\\AppData\\Local\\Temp\\spark-9ba0bb0c-1e20-455d-bc1f-86c696661ba3"));
    }
}
When Spark was stopped at a breakpoint in this method, I found out that one thread of Spark's JVM had locked the file "C:\Users\MyUser\AppData\Local\Temp\spark-9ba0bb0c-1e20-455d-bc1f-86c696661ba3\metastore\db.lck". Windows Process Explorer also shows that Java is locking this file, and FileChannel shows that the file is locked inside the JVM (a small probe that distinguishes the two cases is sketched below).
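This is my own probe, not Spark code; evaluated inside the suspended Spark JVM (for example via the debugger's "Evaluate Expression"), tryLock() separates the two cases: it returns null when another process holds the lock and throws OverlappingFileLockException when the lock is held inside the current JVM.

import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class LockProbe {
    // Call with the full path to db.lck while the Spark JVM is suspended.
    public static String probe(String path) {
        try (FileChannel channel = FileChannel.open(Paths.get(path), StandardOpenOption.WRITE)) {
            FileLock lock = channel.tryLock();
            if (lock == null) {
                return "locked by another process";
            }
            lock.release();
            return "not locked at all";
        } catch (OverlappingFileLockException e) {
            return "locked by a thread inside this JVM";
        } catch (IOException e) {
            return "could not open: " + e;
        }
    }
}

Getting an OverlappingFileLockException here matches what the tryLock() calls at the top of the question already showed: the lock is held by a thread inside the Spark JVM.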
Now, I have to:
- Find out which thread/class has locked this file
- Find out which mechanism Spark uses to lock "metastore\db.lck", which class does it, and how to unlock the file before shutdown
- Submit a pull request to Spark or Hive that unlocks this file ("metastore\db.lck") before the call to deleteRecursively(), or at least leave a comment about the problem
If you need any other information, please ask in the comments.
I had the same problem and found the following solution: every locked object can be seen, at a minimum, in the owning thread's Thread.threadLocals field. For example, suppose a file named newFile.lock is locked exclusively through FileChannel/FileLock.
In Thread.threadLocals you can then see a sun.nio.fs.NativeBuffer instance whose owner field is ".../newFile.lock". So you can try the approach sketched below, which returns every thread together with the classes held in its threadLocals; you then need to find the threads that hold a NativeBuffer, or Spark/Hive objects and so on (and afterwards inspect those threads' threadLocals in Eclipse or IDEA debug mode).
Alternatively, you can use the second sketch below, which finds only NativeBuffer instances via their owner field (so it does not work in every case).
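Again my own sketch, relying on the undocumented owner field of sun.nio.fs.NativeBuffer mentioned above and reusing dumpThreadLocals() from the previous sketch:

import java.lang.reflect.Field;
import java.util.List;
import java.util.Map;

public class NativeBufferFinder {
    // Prints "thread -> path" for every sun.nio.fs.NativeBuffer found in a
    // thread's threadLocals; the buffer's 'owner' field holds the file path.
    public static void printNativeBufferOwners() throws Exception {
        for (Map.Entry<Thread, List<Object>> e : ThreadLocalsDump.dumpThreadLocals().entrySet()) {
            for (Object value : e.getValue()) {
                if (value == null) {
                    continue;
                }
                // NativeBuffers may be cached per thread as an array of buffers.
                Object[] candidates = value instanceof Object[] ? (Object[]) value : new Object[] { value };
                for (Object candidate : candidates) {
                    if (candidate != null
                            && "sun.nio.fs.NativeBuffer".equals(candidate.getClass().getName())) {
                        Field ownerField = candidate.getClass().getDeclaredField("owner");
                        ownerField.setAccessible(true);
                        System.out.println(e.getKey().getName() + " -> " + ownerField.get(candidate));
                    }
                }
            }
        }
    }
}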
Here is what I found out about my own question with the help of the other answers (thank you very much, Basilevs and tploter); maybe it will help someone in the same situation. Every time a JVM thread locks a file exclusively, the JVM also keeps some Java object for that lock, so you just need to find this locked Java object and analyse it, and you will see which thread locked your file. I am not sure this works when a file is merely open (without being locked exclusively), but I am sure it works when the file is locked exclusively by a thread (using java.nio.channels.FileLock, java.nio.channels.FileChannel and so on). For example, in my case I found
Hive objects (org.apache.hadoop.hive.ql.metadata.Hive, org.apache.hadoop.hive.metastore.ObjectStore, org.apache.hadoop.hive.ql.session.SessionState and so on) still alive while Spark was trying to delete db.lck, which means Spark had not shut Hive down correctly before it tried to delete Hive's files. Fortunately, this problem does not occur on Linux (perhaps because Linux allows deleting locked files).
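One hedged idea for the "how to unlock it" part (my own suggestion, not something I found in Spark's or Hive's code): db.lck looks like the lock file of the embedded Derby database that backs the Hive metastore, and Derby releases it when the engine is shut down, which can be requested over JDBC before the temporary directory is deleted:

import java.sql.DriverManager;
import java.sql.SQLException;

public class DerbyShutdown {
    // Ask the embedded Derby engine to shut down so that it releases
    // metastore\db.lck. Derby signals a successful full shutdown by
    // throwing an SQLException with SQLState "XJ015".
    public static void shutdownEmbeddedDerby() {
        try {
            DriverManager.getConnection("jdbc:derby:;shutdown=true");
        } catch (SQLException e) {
            if (!"XJ015".equals(e.getSQLState())) {
                throw new RuntimeException("Derby did not shut down cleanly", e);
            }
        }
    }
}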
See How to find out which thread is locking a file in java?
Files are locked by the Windows process. Threads may open files for reading and writing, but the class that holds a reference to the file handle is responsible for closing it. Therefore you should look for an object, not a thread.
See How can I figure out what is holding on to unfreed objects? to find out how.
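To illustrate that point, a minimal sketch (my own, not Spark code) of keeping the handle's lifetime explicit with try-with-resources, so any lock on the file is released before anything tries to delete it:

import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class HandleOwner {
    public static void writeLockFile(String path) throws Exception {
        // try-with-resources guarantees the handle (and any lock on it) is released,
        // so a later deleteRecursively() on Windows will not fail on this file.
        try (FileChannel channel = FileChannel.open(Paths.get(path),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = channel.lock()) {
            // ... use the locked file ...
        }
    }
}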