Multiple threads acquiring the same monitor?

Posted 2019-08-29 08:37

Question:

This question follows on from the discussion "Multiple Java threads seemingly locking same monitor". Our application has a similar problem: it sometimes runs extremely slowly. We captured several thread dumps, and they indicate that two or three threads have acquired the same lock object at the same time and are in the BLOCKED state. Other threads (10 to 20, varying from dump to dump) are BLOCKED while waiting for the very same lock object. A simplified thread dump looks like this:

"MyThread-91" prio=3 tid=0x07552800 nid=0xc7 waiting for monitor entry [0xc4dff000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.myCompany.abc.util.MySharedLinkedList$MySharedIterator.hasNext(MySharedLinkedList.java:177)
    - locked <0xce1fb810> (a com.myCompany.abc.util.MySharedLinkedList)
    at com.myCompany.abc.util.MyEventProcessor.notifyListeners(MyEventProcessor.java:2644)
    ...............................................................................................

"MyThread-2" prio=3 tid=0x07146400 nid=0x6e waiting for monitor entry [0xc6aef000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.myCompany.abc.util.MySharedLinkedList$MySharedIterator.hasNext(MySharedLinkedList.java:177)
    - locked <0xce1fb810> (a com.myCompany.abc.util.MySharedLinkedList)
    at com.myCompany.abc.util.MyEventProcessor.notifyListeners(MyEventProcessor.java:2644)
    ................................................................................................

"MyThread-14" prio=3 tid=0x074b9400 nid=0x7a waiting for monitor entry [0xc64ef000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.myCompany.abc.util.MySharedLinkedList$MySharedIterator.next(MySharedLinkedList.java:194)
    - waiting to lock <0xce1fb810> (a com.myCompany.abc.util.MySharedLinkedList)
    at com.myCompany.abc.util.MyEventProcessor.notifyListeners(MyEventProcessor.java:2646)
    ................................................................................................

MyThread-91 and MyThread-2 are BLOCKED while holding the lock on <0xce1fb810>. MyThread-14 is BLOCKED while waiting for the same lock <0xce1fb810>.

We are definitely not running into a thread deadlock here. Note that the threads that are BLOCKED on the lock (0xce1fb810) at any given time do release it subsequently, but other threads then show up as BLOCKED after acquiring the same lock object. According to the discussion mentioned above (and the sample code provided by Gray), this could be caused by invoking wait() within a synchronized block. However, we inspected our code and found no wait() being invoked within the synchronized block. In our case, the lock belongs to an in-house linked list implementation with an inner class that implements the iterator. The iterator's next() and hasNext() methods lock on the same instance of the outer class, i.e., the instance of the custom linked list. When multiple threads invoke next() and hasNext(), they move into the BLOCKED state after "acquiring" the same lock.

Here is the pseudocode:

public final class MySharedLinkedList<E> implements Collection<E> {
    /**
     * Represents an entry in the list. 
     */
    private static final class Entry<E> {
        //Instance variables and methods for Entry goes here.
    }

    /**
     * Non fail-fast implementation of iterator for this list.
     */
    public final class MySharedIterator implements Iterator<E> {

        public boolean hasNext() {
            //Some code goes here.
            synchronized (MySharedLinkedList.this) {
                //Some code goes here.
            }
        }

        public E next() {
            //Some code goes here.
            synchronized (MySharedLinkedList.this) {
                //Some code goes here.
            }
        }
    }

    public synchronized Iterator<E> iterator() {
        //Returns a new instance of the iterator.
    }
}

/**
 * Singleton Instance
 */
public class MyEventProcessor {

    //listeners contains a number of Node objects
    private final MySharedLinkedList<Node> listeners = new MySharedLinkedList<Node>();

    private void notifyListeners() {

        final Iterator<Node> iterator = listeners.iterator();
        try {
            while (iterator.hasNext()) {
                final Node node = iterator.next();
                //Lots of other things go here
            }
        } catch (Exception e) {
            //Handle the exception
        }
    }
}

So, the question is what else (other than wait()) might lead to this scenario?

This blog talks about a similar situation (under the section "Example 2: When the Processing Performance is Abnormally Slow"), but I am not sure whether something similar is happening here.

Please don't close this thread as a duplicate of this or this. As mentioned, the behavior is similar, but I suspect the root cause is not.

Thoughts??

Answer 1:

You shouldn't be hitting one lock so hard. I would restructure your program so that typically no more than one thread is accessing the lock. Such heavy lock contention suggests you shouldn't have so many threads in the first place: you are not using them effectively, and you would be better off with fewer threads, possibly only one (because one thread doesn't need a lock).

I suggest you start with one thread and only add threads when you know it helps performance. Don't assume more threads will help because you can have code like yours where the overhead of having to use locks exceeds any gain you might have.
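One way to read the "start with one thread" advice is to confine the listener list to a single dispatcher thread, so that no lock is needed at all. The following is only a minimal sketch under that assumption; the class SingleThreadDispatcher, its method names, and the use of Consumer<String> as the listener type are hypothetical and do not come from the question's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical illustration: all names here are assumptions, not the poster's classes.
final class SingleThreadDispatcher {

    // Only ever touched from the dispatcher thread, so it needs no lock.
    private final List<Consumer<String>> listeners = new ArrayList<>();
    private final ExecutorService dispatcher = Executors.newSingleThreadExecutor();

    // Registration is queued as a task, so it also runs on the dispatcher thread.
    void addListener(Consumer<String> listener) {
        dispatcher.execute(() -> listeners.add(listener));
    }

    // Events are delivered one at a time, in submission order.
    void publish(String event) {
        dispatcher.execute(() -> {
            for (Consumer<String> listener : listeners) {
                listener.accept(event);
            }
        });
    }

    // Stops accepting tasks and waits for the queued ones to finish.
    boolean shutdownAndWait(long timeoutMillis) {
        dispatcher.shutdown();
        try {
            return dispatcher.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        SingleThreadDispatcher d = new SingleThreadDispatcher();
        d.addListener(e -> System.out.println("got " + e));
        d.publish("a");
        d.publish("b");
        d.shutdownAndWait(5000);
    }
}
```

Whether this is faster than the locked version depends on how much work each listener does, so it is worth measuring before and after.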

BTW how many cores do you have?



Answer 2:

You provided the vital information in a comment below Peter's answer: your code notifies registered listeners. This means it cedes control to alien code while holding a lock, which is a known bad practice, as documented in Effective Java, Item 67.

Rework your code so you first make a safe copy of the listener list while holding a lock, then release the lock, and only then call into alien code.
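The copy-then-notify pattern described above can be sketched roughly as follows. This is an illustration only: SnapshotNotifier, its Listener interface, and the String event type are made-up names, and the poster's MySharedLinkedList is replaced by a plain ArrayList guarded by a private lock.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: all names here are assumptions, not the poster's classes.
final class SnapshotNotifier {

    /** Hypothetical listener callback type. */
    interface Listener {
        void onEvent(String event);
    }

    private final Object lock = new Object();
    private final List<Listener> listeners = new ArrayList<>(); // guarded by lock

    void addListener(Listener listener) {
        synchronized (lock) {
            listeners.add(listener);
        }
    }

    void notifyListeners(String event) {
        final List<Listener> snapshot;
        synchronized (lock) {
            // Hold the lock only long enough to copy the list.
            snapshot = new ArrayList<>(listeners);
        }
        // The lock is released here, so slow or re-entrant listener code
        // cannot block other threads that need the list.
        for (Listener listener : snapshot) {
            try {
                listener.onEvent(event);
            } catch (RuntimeException e) {
                // One failing listener should not stop the others.
                System.err.println("Listener failed: " + e);
            }
        }
    }

    public static void main(String[] args) {
        SnapshotNotifier notifier = new SnapshotNotifier();
        List<String> received = new ArrayList<>();
        notifier.addListener(received::add);
        // This listener registers another listener from inside the callback,
        // which is safe because iteration runs over the snapshot.
        notifier.addListener(e -> notifier.addListener(received::add));
        notifier.notifyListeners("first");
        notifier.notifyListeners("second");
        System.out.println(received); // prints [first, second, second]
    }
}
```

If registrations are rare compared with notifications, java.util.concurrent.CopyOnWriteArrayList gives a similar effect without an explicit copy, at the cost of copying on every write.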



Answer 3:

The thread dumps indicate that 2/3 threads have acquired the same lock object at the same point of time and are in the BLOCKED state.

This means that they are ready to run but are blocked waiting to get the lock. This is lock contention and, as @Peter mentioned, you should shrink your synchronized section of code or lock on different objects. For example, be sure to move logging and other I/O outside of the synchronized block.

Other threads (10 to 20 in number at different point of time) are blocked while WAITING for the very same lock object.

This means that they are waiting for the object to be notified by some other thread. This is not a problem.

But, some other threads are getting blocked after acquiring the same lock object.

This is not technically possible. BLOCKED means that they are trying to lock the object. Only one thread can lock a specific object at one time. All other threads trying to lock it are BLOCKED.
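A small experiment can make the BLOCKED state concrete. In the sketch below (the class and method names are hypothetical), the calling thread holds a monitor while a second thread tries to enter it, and the second thread's state is read through Thread.getState():

```java
// Hypothetical illustration of a thread that is BLOCKED on a monitor held by another thread.
final class BlockedStateDemo {

    /** Returns the state observed for a thread trying to enter a monitor the caller holds. */
    static Thread.State observeContender() {
        final Object lock = new Object();
        Thread contender = new Thread(() -> {
            synchronized (lock) {
                // Reached only after the owner releases the monitor.
            }
        }, "contender");
        try {
            Thread.State observed;
            synchronized (lock) {
                contender.start();
                observed = contender.getState();
                // Poll until the contender is actually waiting for monitor entry.
                while (observed != Thread.State.BLOCKED) {
                    Thread.sleep(10);
                    observed = contender.getState();
                }
            }
            contender.join();
            return observed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("contender state while lock is held: " + observeContender());
    }
}
```

The contender never gets past the synchronized statement while the owner is inside it, which is why a BLOCKED thread cannot genuinely be holding the monitor it is blocked on.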

One of the important points in the other discussion that you reference is that when you call synchronized (obj) { obj.wait(); }, the thread acquires the lock and then releases it until it gets notified (or the wait times out or the thread gets interrupted). Even though the stack trace shows locked, the lock is released when wait() puts the thread into the WAITING state.
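The effect of wait() on the monitor can be checked with a sketch like the one below, where a second thread enters the same synchronized region while the first is still parked inside wait(). The class name and the released flag are assumptions made for this illustration.

```java
// Hypothetical illustration: a thread inside wait() does not hold the monitor.
final class WaitReleasesLockDemo {

    private final Object lock = new Object();
    private boolean released; // guarded by lock

    /** Returns true if the caller entered the monitor while the waiter was still parked in wait(). */
    boolean lockIsFreeDuringWait() {
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                while (!released) { // loop guards against spurious wakeups
                    try {
                        lock.wait(); // gives up the monitor until notified
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        }, "waiter");
        waiter.start();
        try {
            // Wait until the waiter is parked inside wait().
            while (waiter.getState() != Thread.State.WAITING) {
                Thread.sleep(10);
            }
            boolean enteredWhileWaiterAlive;
            synchronized (lock) {
                // We are inside the monitor although the waiter has not left its synchronized block.
                enteredWhileWaiterAlive = waiter.isAlive();
                released = true;
                lock.notifyAll();
            }
            waiter.join();
            return enteredWhileWaiterAlive;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("entered monitor during wait(): "
                + new WaitReleasesLockDemo().lockIsFreeDuringWait());
    }
}
```

A thread dump taken while the waiter is parked would typically still list the monitor as locked in the waiter's stack, alongside the WAITING state.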

We inspected our code and we don't see any wait() getting invoked within the synchronized block...

Huh. My quick answer is that if a thread is in the WAITING state, it has to have called wait() on something. To quote from the javadocs on Thread state:

A thread in the waiting state is waiting for another thread to perform a particular action. For example, a thread that has called Object.wait() on an object is waiting for another thread to call Object.notify() or Object.notifyAll() on that object. A thread that has called Thread.join() is waiting for a specified thread to terminate.

Could it be that there is an internal call to wait when you are accessing another object? Could the wait be on another object and not your list?



Answer 4:

Assuming you are using OpenJDK or Oracle's HotSpot, it looks to me like you are running into this cosmetic bug. The symptom is that multiple RUNNABLE or BLOCKED threads may incorrectly report having obtained the same monitor, which is not possible. To understand what is happening, mentally replace - locked <0xce1fb810> by - waiting to lock <0xce1fb810> wherever the thread header is in the waiting for monitor entry state.

(Multiple WAITING threads may report having the lock; this means that they successfully acquired the lock but then gave it up to enter the waiting state, and will attempt to reacquire it on exiting the waiting state.)
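To cross-check what a textual dump says about monitor ownership, the same information can be read in-process through the standard java.lang.management API. The sketch below is one possible way to do that; the class MonitorOwnerReport and its output format are made up for illustration, and it is an assumption (not something established in this thread) that this view is unaffected by the reporting issue described above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MonitorInfo;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: prints, per thread, the lock it waits on, that lock's owner,
// and the monitors the thread itself holds.
final class MonitorOwnerReport {

    static String describe(ThreadInfo info) {
        StringBuilder sb = new StringBuilder();
        sb.append('"').append(info.getThreadName()).append("\" ").append(info.getThreadState());
        if (info.getLockName() != null) {
            sb.append(" waiting on ").append(info.getLockName());
            if (info.getLockOwnerName() != null) {
                sb.append(" owned by \"").append(info.getLockOwnerName()).append('"');
            }
        }
        for (MonitorInfo held : info.getLockedMonitors()) {
            sb.append(System.lineSeparator()).append("    holds ").append(held);
        }
        return sb.toString();
    }

    static String report() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Request monitor and synchronizer details only where the JVM supports them.
        boolean monitors = threads.isObjectMonitorUsageSupported();
        boolean synchronizers = threads.isSynchronizerUsageSupported();
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : threads.dumpAllThreads(monitors, synchronizers)) {
            sb.append(describe(info)).append(System.lineSeparator());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

For a BLOCKED thread, the "owned by" part names the single thread that currently owns the contended monitor.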