In brief:
- How can I find out which Java/Scala thread has locked a file? I know that some class/thread in the JVM has locked a concrete file (its lock overlaps a region of the file), but I don't know how. Is it possible to find out which class/thread is doing this when I stop the application at a breakpoint?
The following code throws an OverlappingFileLockException:
FileChannel.open(Paths.get("thisfile"), StandardOpenOption.APPEND).tryLock().isValid();
FileChannel.open(Paths.get("thisfile"), StandardOpenOption.APPEND).tryLock().isShared();
- How does Java/Scala (here, Spark) lock this file in the first place? I know how to lock files using java.nio.channels (a minimal sketch of that kind of locking is shown right after this list), but I couldn't find the corresponding call in Spark's GitHub repository.
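For reference, by "locking files using java.nio.channels" I mean a pattern like the one below. It is only a minimal sketch, and the file name example.lock is just a placeholder:

import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class NioFileLockExample {
    public static void main(String[] args) throws IOException {
        // Open (or create) the file for writing and take an exclusive lock on it.
        try (FileChannel channel = FileChannel.open(
                Paths.get("example.lock"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock lock = channel.tryLock();
            if (lock != null) {
                System.out.println("shared=" + lock.isShared() + ", valid=" + lock.isValid());
                lock.release(); // also released implicitly when the channel is closed
            }
        }
    }
}

As far as I understand, the lock is released either explicitly via release() or when the channel (or the whole JVM) is closed, and on Windows the file cannot be deleted as long as the channel that holds it is still open.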
More about my problem: When I run Spark with Hive on Windows it works correctly, but every time Spark shuts down it cannot delete one of its temporary directories (the other temporary directory before it is deleted correctly) and it outputs the following exception:
2015-12-11 15:04:36 [Thread-13] INFO org.apache.spark.SparkContext - Successfully stopped SparkContext
2015-12-11 15:04:36 [Thread-13] INFO o.a.spark.util.ShutdownHookManager - Shutdown hook called
2015-12-11 15:04:36 [Thread-13] INFO o.a.spark.util.ShutdownHookManager - Deleting directory C:\Users\MyUser\AppData\Local\Temp\spark-9d564520-5370-4834-9946-ac5af3954032
2015-12-11 15:04:36 [Thread-13] INFO o.a.spark.util.ShutdownHookManager - Deleting directory C:\Users\MyUser\AppData\Local\Temp\spark-42b70530-30d2-41dc-aff5-8d01aba38041
2015-12-11 15:04:36 [Thread-13] ERROR o.a.spark.util.ShutdownHookManager - Exception while deleting Spark temp dir: C:\Users\MyUser\AppData\Local\Temp\spark-42b70530-30d2-41dc-aff5-8d01aba38041
java.io.IOException: Failed to delete: C:\Users\MyUser\AppData\Local\Temp\spark-42b70530-30d2-41dc-aff5-8d01aba38041
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:884) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:63) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:60) [spark-core_2.11-1.5.0.jar:1.5.0]
at scala.collection.mutable.HashSet.foreach(HashSet.scala:78) [scala-library-2.11.6.jar:na]
at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:60) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:264) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
at scala.util.Try$.apply(Try.scala:191) [scala-library-2.11.6.jar:na]
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:216) [spark-core_2.11-1.5.0.jar:1.5.0]
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54) [hadoop-common-2.4.1.jar:na]
I have searched the internet, but found only in-progress issues in Spark (one user tried to submit a patch, but it isn't working, if I understand the comments on that pull request correctly) and some unanswered questions on SO.
It looks like the problem is in the deleteRecursively() method of the Utils.scala class. I set a breakpoint on this method and rewrote it in Java:
import java.io.File;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class Test {

    // Java rewrite of org.apache.spark.util.Utils.deleteRecursively()
    public static void deleteRecursively(File file) {
        if (file != null) {
            try {
                if (file.isDirectory()) {
                    for (File child : listFilesSafely(file)) {
                        deleteRecursively(child);
                    }
                    //ShutdownHookManager.removeShutdownDeleteDir(file)
                }
            } finally {
                if (!file.delete()) {
                    if (file.exists()) {
                        throw new RuntimeException("Failed to delete: " + file.getAbsolutePath());
                    }
                }
            }
        }
    }

    private static List<File> listFilesSafely(File file) {
        if (file.exists()) {
            File[] files = file.listFiles();
            if (files == null) {
                throw new RuntimeException("Failed to list files for dir: " + file);
            }
            return Arrays.asList(files);
        } else {
            return Collections.emptyList();
        }
    }

    public static void main(String[] arg) {
        deleteRecursively(new File("C:\\Users\\MyUser\\AppData\\Local\\Temp\\spark-9ba0bb0c-1e20-455d-bc1f-86c696661ba3"));
    }
}
When Spark is stopped at the breakpoint in this method, I found that the JVM of one of Spark's threads has locked the file "C:\Users\MyUser\AppData\Local\Temp\spark-9ba0bb0c-1e20-455d-bc1f-86c696661ba3\metastore\db.lck"; Windows Process Explorer also shows that Java is holding this file, and a FileChannel check shows that the file is locked from within the JVM (a sketch of that check follows below).
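The FileChannel check I mean is a sketch along these lines (the path is the one from my run above); when I evaluate it from the breakpoint inside the Spark JVM, it ends up in the OverlappingFileLockException branch:

import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class LockProbe {
    public static void main(String[] args) throws IOException {
        probe("C:\\Users\\MyUser\\AppData\\Local\\Temp\\spark-9ba0bb0c-1e20-455d-bc1f-86c696661ba3\\metastore\\db.lck");
    }

    static void probe(String path) throws IOException {
        try (FileChannel channel = FileChannel.open(Paths.get(path), StandardOpenOption.WRITE)) {
            FileLock lock = channel.tryLock();
            if (lock == null) {
                // tryLock() returns null when another process holds a conflicting lock
                System.out.println("Locked by another process");
            } else {
                System.out.println("Not locked - we just acquired it ourselves");
                lock.release();
            }
        } catch (OverlappingFileLockException e) {
            // Thrown when an overlapping region is already locked from within this same JVM
            System.out.println("Locked by this JVM");
        }
    }
}

The probe only tells me that the lock lives in this JVM, not which thread or class acquired it, which is exactly what I am trying to find out.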
Now, I have to:
Find out which thread/class has locked this file
Find out which file-locking mechanism Spark uses to lock "metastore\db.lck", which class does it, and how to unlock the file before shutdown (one hedged guess is sketched after this list)
Submit a pull request to Spark or Hive that unlocks this file ("metastore\db.lck") before the call to the deleteRecursively() method, or at least leave a comment about the problem
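For the second item, my working assumption (not something I have confirmed in the Spark or Hive sources) is that metastore\db.lck belongs to the embedded Derby database that Hive uses as its default local metastore. If that is right, something along the following lines, called before the shutdown hook reaches deleteRecursively(), might release the lock; the JDBC URL is the standard embedded-Derby shutdown URL, and where exactly such a call would have to be wired into Spark/Hive is precisely what I do not know:

import java.sql.DriverManager;
import java.sql.SQLException;

public class DerbyShutdownSketch {
    // Ask the embedded Derby engine to shut down, which should make it release
    // its db.lck file. Derby signals a successful system shutdown by throwing
    // an SQLException with SQLState "XJ015".
    public static void shutdownEmbeddedDerby() {
        try {
            DriverManager.getConnection("jdbc:derby:;shutdown=true");
        } catch (SQLException e) {
            if (!"XJ015".equals(e.getSQLState())) {
                throw new RuntimeException("Derby shutdown failed", e);
            }
        }
    }
}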
If you need any other information, please ask in the comments.