Why is the setArray() method call required in CopyOnWriteArrayList?

Posted 2020-02-23 03:18

Question:

In CopyOnWriteArrayList.java, in the method set(int index, E element) below

public E set(int index, E element) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        Object oldValue = elements[index];

        if (oldValue != element) {
            int len = elements.length;
            Object[] newElements = Arrays.copyOf(elements, len);
            newElements[index] = element;
            setArray(newElements);
        } else {
            // Not quite a no-op; ensures volatile write semantics
            setArray(elements); // <---- Why is this call required?
        }
        return (E)oldValue;
    } finally {
        lock.unlock();
    }
}

Why is the call to setArray() required? I couldn't understand the comment written above that method call. Is it because we are not using a synchronized block, so we have to manually flush all the variables we use? The method above uses a ReentrantLock. If it had used a synchronized statement, would it still need to call setArray()? I think not.

Question 2: If we end up in the else branch, it means we didn't modify the elements array, so why do we need to flush the value of the array variable?

Answer 1:

This code uses deep Java Memory Model voodoo, as it mixes both locks and volatiles.

The lock usage in this code is easy to dispense with, though. Locking provides memory ordering among threads that use the same lock. Specifically, the unlock at the end of this method provides happens-before semantics with other threads that acquire the same lock. Other code paths through this class, though, don't use this lock at all. Therefore, the memory model implications for the lock are irrelevant to those code paths.
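To make the "same lock" point concrete, here is a minimal sketch. The class name LockOrderingDemo and its fields are hypothetical and exist only for illustration. Two threads increment a plain (non-volatile) counter, and every increment is guarded by the same ReentrantLock, so each unlock happens-before the next lock and no update is lost.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class; the names are illustrative, not taken from the JDK.
class LockOrderingDemo {
    private static final ReentrantLock lock = new ReentrantLock();
    private static int counter = 0; // deliberately NOT volatile

    // Runs two threads that each increment the counter perThread times.
    static int run(int perThread) {
        counter = 0;
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                lock.lock();
                try {
                    counter++; // ordered only with threads that take the same lock
                } finally {
                    lock.unlock();
                }
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(run(100_000)); // prints 200000
    }
}
```

A thread that read counter without acquiring the lock would get no such ordering from the lock, which is the situation the lock-free read paths of CopyOnWriteArrayList are in.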

Those other code paths do use volatile reads and writes, specifically to the array field. The getArray method does a volatile read of this field, and the setArray method does a volatile write of this field.
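The shape of that design can be sketched in a few lines. TinyCowList below is a hypothetical, stripped-down stand-in written for this explanation, not the JDK class; it only assumes what the question's listing shows (a lock on the write path, and getArray/setArray wrapping a volatile field).

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, stripped-down stand-in for CopyOnWriteArrayList (illustration only).
class TinyCowList<E> {
    private final ReentrantLock lock = new ReentrantLock();
    private volatile Object[] array;

    @SafeVarargs
    TinyCowList(E... initial) {
        array = Arrays.copyOf(initial, initial.length, Object[].class);
    }

    final Object[] getArray() { return array; }     // volatile read
    final void setArray(Object[] a) { array = a; }  // volatile write

    // Read path: no lock, just one volatile read of the array field.
    @SuppressWarnings("unchecked")
    E get(int index) {
        return (E) getArray()[index];
    }

    // Write path: lock, copy, then publish with a volatile write.
    @SuppressWarnings("unchecked")
    E set(int index, E element) {
        lock.lock();
        try {
            Object[] elements = getArray();
            Object oldValue = elements[index];
            if (oldValue != element) {
                Object[] newElements = Arrays.copyOf(elements, elements.length);
                newElements[index] = element;
                setArray(newElements);
            } else {
                setArray(elements); // same reference, but still a volatile write
            }
            return (E) oldValue;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        TinyCowList<String> list = new TinyCowList<>("a", "b");
        System.out.println(list.set(0, "x")); // prints a
        System.out.println(list.get(0));      // prints x
    }
}
```

Because get() never touches the lock, the only synchronization edge between a writer and a reader is the volatile field itself.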

The reason this code calls setArray even when it's apparently unnecessary is so that it establishes an invariant for this method that it always performs a volatile write to this array. This establishes happens-before semantics with other threads that perform volatile reads from this array. This is important because the volatile write-read semantics apply to reads and writes other than those of the volatile field itself. Specifically, writes to other (non-volatile) fields before a volatile write happen-before reads from those other fields after a volatile read of the same volatile variable. See the JMM FAQ for an explanation.

Here's an example:

// initial conditions
int nonVolatileField = 0;
CopyOnWriteArrayList<String> list = /* a single String */;

// Thread 1
nonVolatileField = 1;                 // (1)
list.set(0, "x");                     // (2)

// Thread 2
String s = list.get(0);               // (3)
if (s == "x") {
    int localVar = nonVolatileField;  // (4)
}

Let's assume that line (3) gets the value set by line (2), the interned string "x". (For the sake of this example we use identity semantics of interned strings.) Assuming this is true, then the memory model guarantees that the value read at line (4) will be 1 as set by line (1). This is because the volatile write at (2), and every earlier write, happen-before the volatile read at line (3), and every subsequent read.

Now, suppose that the initial condition were that the list already contained a single element, the interned string "x". And further suppose that the set() method's else clause didn't make the setArray call. Now, depending on the initial contents of the list, the list.set() call at line (2) might or might not perform a volatile write, therefore the read at line (4) might or might not have any visibility guarantees!

Clearly you don't want these memory visibility guarantees to depend upon the current contents of the list. To establish the guarantee in all cases, set() needs to do a volatile write in all cases, and that's why it calls setArray() even if it didn't do any writing itself.
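For readers who want to run the scenario, here is a minimal sketch of the first case (the list initially holds a different element, so set() replaces the array). The class name CowVisibilityDemo and the initial element "a" are assumptions made for this example. It uses equals() rather than identity comparison, which does not change the memory-model argument.

```java
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class; names are illustrative.
class CowVisibilityDemo {
    static int nonVolatileField = 0; // plain field, no volatile

    static int run() {
        nonVolatileField = 0;
        CopyOnWriteArrayList<String> list =
                new CopyOnWriteArrayList<>(new String[] {"a"});
        int[] observed = {-1};

        // Thread 2: wait until the element written at (2) is visible.
        Thread reader = new Thread(() -> {
            while (!"x".equals(list.get(0))) {   // (3) volatile read inside get()
                Thread.yield();
            }
            observed[0] = nonVolatileField;      // (4)
        });
        reader.start();

        // Thread 1 (the calling thread).
        nonVolatileField = 1;                    // (1)
        list.set(0, "x");                        // (2) volatile write inside set()

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 1
    }
}
```

The second case (the list already contains "x") cannot be demonstrated this way, because the reader has no value change to wait for; that case is purely about what the memory model guarantees.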



Answer 2:

TL;DR: The call to setArray is required to provide the guarantee specified in the Javadoc of CopyOnWriteArrayList (even when the contents of the list are not changed).


CopyOnWriteArrayList has a memory-consistency guarantee specified in the Javadoc:

Memory consistency effects: As with other concurrent collections, actions in a thread prior to placing an object into a CopyOnWriteArrayList happen-before actions subsequent to the access or removal of that element from the CopyOnWriteArrayList in another thread.

The call to setArray is necessary to enforce this guarantee.

As the Java Memory Model specification in the JLS states:

A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.

So the write to array (using the setArray method) is necessary to ensure that other threads reading from the list have a happens-before (or rather, happens-after) relationship with the thread that called the set method, even when the element passed to set was already identical (using ==) to the element at that position in the list.
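The JLS rule quoted above can be seen in isolation with a plain volatile flag. This is a minimal sketch; VolatilePublishDemo, data, and ready are hypothetical names chosen for the example. The write to the non-volatile data field comes before the volatile write to ready, so a thread that reads ready as true is guaranteed to see it.

```java
// Hypothetical demo class; names are illustrative.
class VolatilePublishDemo {
    static int data = 0;                   // plain field
    static volatile boolean ready = false; // volatile flag

    static int run() {
        data = 0;
        ready = false;
        int[] seen = {-1};

        Thread reader = new Thread(() -> {
            while (!ready) {       // volatile read
                Thread.yield();
            }
            seen[0] = data;        // ordered after the volatile read
        });
        reader.start();

        data = 42;                 // plain write, before the volatile write
        ready = true;              // volatile write

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 42
    }
}
```

In CopyOnWriteArrayList the array field plays the role of ready: setArray is the volatile write and getArray is the volatile read.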

Updated explanation

Going back to the guarantee in the Javadoc, there is this order of things (assuming an access, not a removal, as the last action; a removal is already taken care of because it uses the lock, but an access doesn't use the lock):

  1. An action in thread A prior to placing an object into the CopyOnWriteArrayList
  2. Placing an object into the CopyOnWriteArrayList (presumably on thread A, although the Javadoc could be clearer about this)
  3. Accessing [reading] an element from the CopyOnWriteArrayList on thread B

Assuming that step 2 places an element into the list that was already there, we see that the code goes into this branch:

} else {
    // Not quite a no-op; ensures volatile write semantics
    setArray(elements);
}

This call to setArray ensures a volatile write on field array from thread A. Since thread B will do a volatile read on field array, a happens-before relationship is created between thread A and thread B, which wouldn't have been created if the else-branch wasn't there.



Answer 3:

I believe it is because other methods that read the array do not obtain the lock, so there is no guarantee of happens-before ordering. The way to preserve such ordering is to update the volatile field, which does guarantee it. (These are the write semantics that the comment is referring to.)



Answer 4:

In JDK 11 this useless operation has already been removed from the source code. See the code below.

//code from JDK 11.0.1
public E set(int index, E element) {
    synchronized (lock) {
        Object[] es = getArray();
        E oldValue = elementAt(es, index);

        if (oldValue != element) {
            es = es.clone();
            es[index] = element;
            setArray(es);
        }
        return oldValue;
    }
}


Answer 5:

It is not required AFAICS. This is for two reasons.

  • write semantics are only needed if you perform a write, and this branch doesn't.
  • the lock.unlock() call performs write semantics in a finally block, unavoidably.

The method

lock.unlock()

always calls through to

private volatile int state;

protected final void setState(int newState) {
    state = newState;
}

and this gives the same happens-before semantics as setArray(), making the setArray() call redundant. You might claim that you don't want to depend on the implementation of ReentrantLock, but if you are worried that a future version of ReentrantLock will not be thread safe, you have bigger problems.