Multiple mutex locking strategies and why librarie

2019-01-22 19:46发布

问题:

There is a widely known way of locking multiple locks, which relies on choosing fixed linear ordering and aquiring locks according to this ordering.

That was proposed, for example, in the answer for "Acquire a lock on two mutexes and avoid deadlock". Especially, the solution based on address comparison seems to be quite elegant and obvious.

When I tried to check how it is actually implemented, I've found, to my surprise, that this solution in not widely used.

To quote the Kernel Docs - Unreliable Guide To Locking:

Textbooks will tell you that if you always lock in the same order, you will never get this kind of deadlock. Practice will tell you that this approach doesn't scale: when I create a new lock, I don't understand enough of the kernel to figure out where in the 5000 lock hierarchy it will fit.

PThreads doesn't seem to have such a mechanism built in at all.

Boost.Thread came up with completely different solution, lock() for multiple (2 to 5) mutexes is based on trying and locking as many mutexes as it is possible at the moment.

This is the fragment of the Boost.Thread source code (Boost 1.48.0, boost/thread/locks.hpp:1291):

template<typename MutexType1,typename MutexType2,typename MutexType3>
void lock(MutexType1& m1,MutexType2& m2,MutexType3& m3)
{    
    unsigned const lock_count=3;
    unsigned lock_first=0;
    for(;;)
    {    
        switch(lock_first)
        {    
        case 0:
            lock_first=detail::lock_helper(m1,m2,m3);
            if(!lock_first)
                return;
            break;
        case 1:
            lock_first=detail::lock_helper(m2,m3,m1);
            if(!lock_first)
                return;
            lock_first=(lock_first+1)%lock_count;
            break;
        case 2:
            lock_first=detail::lock_helper(m3,m1,m2);
            if(!lock_first)
                return;
            lock_first=(lock_first+2)%lock_count;
            break;
        }    
    }    
}    

where lock_helper returns 0 on success and number of mutexes that weren't successfully locked otherwise.

Why is this solution better, than comparing addresses or any other kind of ids? I don't see any problems with pointer comparison, which can be avoided using this kind of "blind" locking.

Are there any other ideas on how to solve this problem on a library level?

回答1:

From the bounty text:

I'm not even sure if I can prove correctness of the presented Boost solution, which seems more tricky than the one with linear order.

The Boost solution cannot deadlock because it never waits while already holding a lock. All locks but the first are acquired with try_lock. If any try_lock call fails to acquire its lock, all previously acquired locks are freed. Also, in the Boost implementation the new attempt will start from the lock failed to acquire the previous time, and will first wait till it is available; it's a smart design decision.

As a general rule, it's always better to avoid blocking calls while holding a lock. Therefore, the solution with try-lock, if possible, is preferred (in my opinion). As a particular consequence, in case of lock ordering, the system at whole might get stuck. Imagine the very last lock (e.g. the one with the biggest address) was acquired by a thread which was then blocked. Now imagine some other thread needs the last lock and another lock, and due to ordering it will first get the other one and will wait on the last lock. Same can happen with all other locks, and the whole system makes no progress until the last lock is released. Of course it's an extreme and rather unlikely case, but it illustrates the inherent problem with lock ordering: the higher a lock number the more indirect impact the lock has when acquired.

The shortcoming of the try-lock-based solution is that it can cause livelock, and in extreme cases the whole system might also get stuck for at least some time. Therefore it is important to have some back-off schema that make pauses between locking attempts longer with time, and perhaps randomized.



回答2:

Sometimes, lock A needs to be acquired before lock B does. Lock B might have either a lower or a higher address, so you can't use address comparison in this case.

Example: When you have a tree data-structure, and threads try to read and update nodes, you can protect the tree using a reader-writer lock per node. This only works if your threads always acquire locks top-down root-to-leave. The address of the locks does not matter in this case.

You can only use address comparison if it does not matter at all which lock gets acquired first. If this is the case, address comparison is a good solution. But if this is not the case you can't do it.

I guess the Linux kernel requires certain subsystems to be locked before others are. This cannot be done using address comparison.



回答3:

The "address comparison" and similar approaches, although used quite often, are special cases. They works fine if you have

  1. a lock-free mechanism to get
  2. two (or more) "items" of the same kind or hierarchy level
  3. any stable ordering schema between those items

For example: You have a mechanism to get two "accounts" from a list. Assume that the access to the list is lock-free. Now you have pointers to both items and want to lock them. Since they are "siblings" you have to choose which one to lock first. Here the approach using addresses (or any other stable ordering schema like "account id") is OK.

But the linked Linux text talks about "lock hierarchies". This means locking not between "siblings" (of the same kind) but between "parent" and "children" which might be from different types. This may happen in actual tree structures as well in other scenarios. Contrived example: To load a program you must

  1. lock the file inode,
  2. lock the process table
  3. lock the destination memory

These three locks are not "siblings" not in a clear hierarchy. The locks are also not taken directly one after the other - each subsystem will take the locks at free will. If you consider all usecases where those three (and more) subsystems interact you see, that there is no clear, stable ordering you can think of.

The Boost library is in the same situation: It strives to provide generic solutions. So they cannot assume the points from above and must fall back to a more complicated strategy.



回答4:

One scenario when address compare will fail is if you use the proxy pattern. You can delegate the locks to the same object and the addresses will be different.

Consider the following example

template<typename MutexType>
class MutexHelper
{
    MutexHelper(MutexType &m) : _m(m) {}
    void lock() 
    {
     std::cout <<"locking ";
     m.lock();
    }

    void unlock() 
    {
     std::cout <<"unlocking ";
     m.unlock();
    }

    MutexType &_m;
};

if the function

template<typename MutexType1,typename MutexType2,typename MutexType3>
void lock(MutexType1& m1,MutexType2& m2,MutexType3& m3);

will actually use address compare the following code ca produce a deadlock

Mutex m1;
Mutex m1;

thread1

MutexHelper hm1(m1);
MutexHelper hm2(m2);

lock(hm1, hm2);

thread2:

MutexHelper hm2(m2);
MutexHelper hm1(m1);
lock(hm1, hm2);

EDIT:

this is an interesting thread that share some light on boost::lock implementation thread-best-practice-to-lock-multiple-mutexes

Address compare does not work for inter-process shared mutexes (named synchronization objects).