How to find which Java/Scala thread has locked a file

Published 2019-04-04 06:36

In brief:

  1. How can I find which Java/Scala thread has locked a file? I know that some class/thread in the JVM has locked a concrete file (it overlaps a region of the file), but I don't know how. Is it possible to find out which class/thread is doing this when I stop the application at a breakpoint?

The following code throws OverlappingFileLockException:

FileChannel.open(Paths.get("thisfile"), StandardOpenOption.APPEND).tryLock().isValid();
FileChannel.open(Paths.get("thisfile"), StandardOpenOption.APPEND).tryLock().isShared();
  2. How can Java/Scala (Spark) lock this file? I know how to lock files using java.nio.channels, but I couldn't find the corresponding invocation in Spark's GitHub repository (a minimal locking sketch follows below).
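
To make the second point concrete, here is a minimal sketch of how a file is typically locked with java.nio.channels and why the snippet above throws: once one channel in the JVM holds a lock on a region, a second tryLock() on an overlapping region from the same JVM fails with OverlappingFileLockException. This is only an illustration, not Spark's actual code.

    import java.nio.channels.FileChannel;
    import java.nio.channels.FileLock;
    import java.nio.channels.OverlappingFileLockException;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class LockDemo {
        public static void main(String[] args) throws Exception {
            // First channel takes the lock on the whole file.
            FileChannel first = FileChannel.open(Paths.get("thisfile"),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            FileLock lock = first.tryLock();   // non-null: lock acquired

            // A second channel in the same JVM cannot lock an overlapping region.
            FileChannel second = FileChannel.open(Paths.get("thisfile"),
                    StandardOpenOption.APPEND);
            try {
                second.tryLock();              // throws OverlappingFileLockException
            } catch (OverlappingFileLockException e) {
                System.out.println("Already locked inside this JVM");
            }

            lock.release();
            first.close();
            second.close();
        }
    }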


More about my problem: 1. When I run Spark on Windows with Hive it works correctly; however, every time Spark shuts down it cannot delete one of its temporary directories (the other temporary directory before it is deleted correctly) and it logs the following exception:

2015-12-11 15:04:36 [Thread-13] INFO  org.apache.spark.SparkContext - Successfully stopped SparkContext
2015-12-11 15:04:36 [Thread-13] INFO  o.a.spark.util.ShutdownHookManager - Shutdown hook called
2015-12-11 15:04:36 [Thread-13] INFO  o.a.spark.util.ShutdownHookManager - Deleting directory C:\Users\MyUser\AppData\Local\Temp\spark-9d564520-5370-4834-9946-ac5af3954032
2015-12-11 15:04:36 [Thread-13] INFO  o.a.spark.util.ShutdownHookManager - Deleting directory C:\Users\MyUser\AppData\Local\Temp\spark-42b70530-30d2-41dc-aff5-8d01aba38041
2015-12-11 15:04:36 [Thread-13] ERROR o.a.spark.util.ShutdownHookManager - Exception while deleting Spark temp dir: C:\Users\MyUser\AppData\Local\Temp\spark-42b70530-30d2-41dc-aff5-8d01aba38041
java.io.IOException: Failed to delete: C:\Users\MyUser\AppData\Local\Temp\spark-42b70530-30d2-41dc-aff5-8d01aba38041
    at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:884) [spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:63) [spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:60) [spark-core_2.11-1.5.0.jar:1.5.0]
    at scala.collection.mutable.HashSet.foreach(HashSet.scala:78) [scala-library-2.11.6.jar:na]
    at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:60) [spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:264) [spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699) [spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
    at scala.util.Try$.apply(Try.scala:191) [scala-library-2.11.6.jar:na]
    at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:216) [spark-core_2.11-1.5.0.jar:1.5.0]
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54) [hadoop-common-2.4.1.jar:na]

I tried searching the internet, but found only in-progress issues in Spark (one user tried to make a patch, but it isn't working, if I understand the comments on that pull request correctly) and some unanswered questions on SO.

It looks like the problem is in the deleteRecursively() method of the Utils.scala class. I set a breakpoint on this method and rewrote it in Java:

import java.io.File;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class Test {
    public static void deleteRecursively(File file) {
        if (file != null) {
            try {
                if (file.isDirectory()) {
                    for (File child : listFilesSafely(file)) {
                        deleteRecursively(child);
                    }
                    //ShutdownHookManager.removeShutdownDeleteDir(file)
                }
            } finally {
                // Delete the file/directory itself; on Windows this fails while
                // another handle (e.g. Hive's db.lck) is still open.
                if (!file.delete()) {
                    if (file.exists()) {
                        throw new RuntimeException("Failed to delete: " + file.getAbsolutePath());
                    }
                }
            }
        }
    }

    private static List<File> listFilesSafely(File file) {
        if (file.exists()) {
            File[] files = file.listFiles();
            if (files == null) {
                throw new RuntimeException("Failed to list files for dir: " + file);
            }
            return Arrays.asList(files);
        } else {
            return Collections.emptyList();
        }
    }

    public static void main(String[] arg) {
        deleteRecursively(new File("C:\\Users\\MyUser\\AppData\\Local\\Temp\\spark-9ba0bb0c-1e20-455d-bc1f-86c696661ba3"));
    }
}
When Spark is stopped at a breakpoint in this method, I found that one of Spark's JVM threads has locked the file "C:\Users\MyUser\AppData\Local\Temp\spark-9ba0bb0c-1e20-455d-bc1f-86c696661ba3\metastore\db.lck". Windows Process Explorer also shows that Java locks this file, and a FileChannel check shows that the file is locked from inside the JVM.
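
The FileChannel check I mean is roughly the following sketch (not the exact code I ran): tryLock() throwing OverlappingFileLockException means the lock is held inside this JVM, while tryLock() returning null means another process holds it. It has to be called, or evaluated in the debugger, inside the Spark JVM itself, e.g. with the db.lck path above.

    import java.nio.channels.FileChannel;
    import java.nio.channels.FileLock;
    import java.nio.channels.OverlappingFileLockException;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class LockProbe {
        // Call (or evaluate in the debugger) inside the JVM you suspect holds the lock.
        public static String probe(String path) throws Exception {
            try (FileChannel ch = FileChannel.open(Paths.get(path), StandardOpenOption.WRITE)) {
                FileLock lock = ch.tryLock();
                if (lock == null) {
                    return "locked by another process";
                }
                lock.release();
                return "not locked (the probe just acquired and released it)";
            } catch (OverlappingFileLockException e) {
                return "locked by code inside this JVM";
            }
        }
    }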

Now, I have to:

  1. Find out what thread/class has locked this file

  2. Find out how Spark locks "metastore\db.lck" (which class does it) and how to unlock it before shutdown

  3. Submit a pull request to Spark or Hive to unlock this file ("metastore\db.lck") before deleteRecursively() is called, or at least leave a comment about the problem

If you need any other information, please ask in the comments.

3 Answers

Answer from 放荡不羁爱自由, 2019-04-04 06:40

  1. How to find which Java/Scala thread has locked a file?

I had a similar problem and found this solution: all the locked objects can be seen, at least, in the Thread.threadLocals field.

If a file is locked by the following code:

    File newFile = new File("newFile.lock");
    newFile.createNewFile();
    FileLock fileLock = FileChannel.open(Paths.get(newFile.getAbsolutePath()), StandardOpenOption.APPEND).tryLock();

then in Thread.threadLocals you can see a sun.nio.fs.NativeBuffer instance whose owner field is ".../newFile.lock".

So you can try the following code, which returns every thread together with every object in its threadLocals; you then need to find which threads hold a NativeBuffer, Spark/Hive objects, and so on (and afterwards inspect those threads' threadLocals in Eclipse or IDEA debug mode):

private static String getThreadsLockFile() {
    Set<Thread> threads = Thread.getAllStackTraces().keySet();
    StringBuilder builder = new StringBuilder();
    for (Thread thread : threads) {
        builder.append(getThreadsLockFile(thread));
    }
    return builder.toString();
}

private static String getThreadsLockFile(Thread thread) {
    StringBuffer stringBuffer = new StringBuffer();
    try {
        // Read the private Thread.threadLocals field via reflection (works on Java 8;
        // newer JDKs may need --add-opens java.base/java.lang=ALL-UNNAMED).
        Field field = thread.getClass().getDeclaredField("threadLocals");
        field.setAccessible(true);
        Object map = field.get(thread);
        // ThreadLocalMap.table is the Entry[] that holds the actual thread-local values.
        Field table = Class.forName("java.lang.ThreadLocal$ThreadLocalMap").getDeclaredField("table");
        table.setAccessible(true);
        Object tbl = table.get(map);
        int length = Array.getLength(tbl);
        for (int i = 0; i < length; i++) {
            try {
                Object entry = Array.get(tbl, i);
                if (entry != null) {
                    Field valueField = Class.forName("java.lang.ThreadLocal$ThreadLocalMap$Entry").getDeclaredField("value");
                    valueField.setAccessible(true);
                    Object value = valueField.get(entry);
                    if (value != null) {
                        stringBuffer.append(thread.getName()).append(" : ").append(value.getClass()).
                                append(" ").append(value).append("\n");
                    }
                }
            } catch (Exception exp) {
                // skip, do nothing
            }
        }
    } catch (Exception exp) {
        // skip, do nothing
    }
    return stringBuffer.toString();
}

Or you can try the following code, but it only finds NativeBuffer instances with an owner field, so it doesn't work in every case:

private static String getThreadsLockFile(String fileName) {
    Set<Thread> threads = Thread.getAllStackTraces().keySet();
    StringBuilder builder = new StringBuilder();
    for (Thread thread : threads) {
        builder.append(getThreadsLockFile(thread, fileName));
    }
    return builder.toString();
}

private static String getThreadsLockFile(Thread thread, String fileName) {
    StringBuffer stringBuffer = new StringBuffer();
    try {
        Field field = thread.getClass().getDeclaredField("threadLocals");
        field.setAccessible(true);
        Object map = field.get(thread);
        Field table = Class.forName("java.lang.ThreadLocal$ThreadLocalMap").getDeclaredField("table");
        table.setAccessible(true);
        Object tbl = table.get(map);
        int length = Array.getLength(tbl);
        for (int i = 0; i < length; i++) {
            try {
                Object entry = Array.get(tbl, i);
                if (entry != null) {
                    Field valueField = Class.forName("java.lang.ThreadLocal$ThreadLocalMap$Entry").getDeclaredField("value");
                    valueField.setAccessible(true);
                    Object value = valueField.get(entry);
                    if (value != null) {
                        // The thread-local value may be an array (e.g. the NativeBuffer
                        // cache kept by sun.nio.fs); scan it for a matching "owner" path.
                        int length1 = Array.getLength(value);
                        for (int j = 0; j < length1; j++) {
                            try {
                                Object entry1 = Array.get(value, j);
                                Field ownerField = Class.forName("sun.nio.fs.NativeBuffer").getDeclaredField("owner");
                                ownerField.setAccessible(true);
                                String owner = ownerField.get(entry1).toString();
                                if (owner.contains(fileName)) {
                                    stringBuffer.append(thread.getName());
                                }
                            } catch (Exception exp) {
                                // skip, do nothing
                            }
                        }
                    }
                }
            } catch (Exception exp) {
                // skip, do nothing
            }
        }
    } catch (Exception exp) {
        // skip, do nothing
    }
    return stringBuffer.toString();
}
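
For example, evaluated at the breakpoint in deleteRecursively() (or anywhere inside the stuck JVM), a call like the following prints the names of the threads whose thread-locals reference the lock file; the file name here is just the one from the question:

    // Hypothetical usage of the helper above, e.g. in the debugger's "Evaluate" window:
    System.out.println(getThreadsLockFile("db.lck"));
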
Answer from 霸刀☆藐视天下, 2019-04-04 06:44

Here is what I found out about my own question using the other answers (thank you very much, Basilevs and tploter); maybe it will help someone in the same situation:

  1. Every time a JVM thread locks a file exclusively, the JVM also holds some Java objects for it; for example, in my case I found:

    • sun.nio.fs.NativeBuffer
    • sun.nio.ch.Util$BufferCache

    So you just need to find these held Java objects and analyze them, and you will find which thread locked your file.

I'm not sure this works if a file is merely open (not locked exclusively), but I am sure it works if the file is locked exclusively by a thread (using java.nio.channels.FileLock, java.nio.channels.FileChannel and so on).
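
As a concrete illustration of such an exclusive lock, and of what releasing it before deletion looks like (a sketch with a made-up file name, not Hive's or Derby's actual code):

    import java.io.File;
    import java.io.RandomAccessFile;
    import java.nio.channels.FileChannel;
    import java.nio.channels.FileLock;

    public class ReleaseBeforeDelete {
        public static void main(String[] args) throws Exception {
            File f = new File("demo.lck");              // made-up file name
            RandomAccessFile raf = new RandomAccessFile(f, "rw");
            FileChannel channel = raf.getChannel();
            FileLock lock = channel.lock();             // exclusive lock on the whole file

            // While the lock and the open handle exist, f.delete() fails on Windows.
            lock.release();                             // release the exclusive lock
            raf.close();                                // close the handle that pins the file
            System.out.println("Deleted: " + f.delete());
        }
    }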

  2. Unfortunately, regarding Spark, I found many other held Hive objects (org.apache.hadoop.hive.ql.metadata.Hive, org.apache.hadoop.hive.metastore.ObjectStore, org.apache.hadoop.hive.ql.session.SessionState and so on) at the moment Spark tries to delete db.lck, which means Spark had not shut down Hive correctly before it tried to delete Hive's files. Fortunately, this problem is absent on Linux (Linux apparently allows deleting locked files).
Answer from Explosion°爆炸, 2019-04-04 07:03

See How to find out which thread is locking a file in java?

Files are locked by the Windows process, not by individual threads. Threads may open files for reading or writing, but the class that holds a reference to the file handle is responsible for closing it. Therefore you should look for an object, not a thread.
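
A small sketch of that point (file name made up): the lock belongs to the channel/handle object and to the process, not to the thread that created it, so any code holding the reference can release it, even from a different thread.

    import java.io.RandomAccessFile;
    import java.nio.channels.FileChannel;
    import java.nio.channels.FileLock;

    public class HandleOwnerDemo {
        public static void main(String[] args) throws Exception {
            RandomAccessFile raf = new RandomAccessFile("demo.lck", "rw"); // hypothetical file
            FileChannel channel = raf.getChannel();
            FileLock lock = channel.lock();   // held by the object/process, not by this thread

            // Another thread that merely holds the references can release the lock
            // and close the handle, which is what actually frees the file on Windows.
            Thread closer = new Thread(() -> {
                try {
                    lock.release();
                    raf.close();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
            closer.start();
            closer.join();
            System.out.println("Lock still valid: " + lock.isValid()); // false
        }
    }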

See How can I figure out what is holding on to unfreed objects? to find out how.
