A sensitive operation in my lab today went completely wrong. An actuator on an electron microscope went over its boundary, and after a chain of events I lost $12 million of equipment. I've narrowed down over 40K lines in the faulty module to this:
import java.util.*;
class A {
static Point currentPos = new Point(1,2);
static class Point {
int x;
int y;
Point(int x, int y) {
this.x = x;
this.y = y;
}
}
public static void main(String[] args) {
new Thread() {
void f(Point p) {
synchronized(this) {}
if (p.x+1 != p.y) {
System.out.println(p.x+" "+p.y);
System.exit(1);
}
}
@Override
public void run() {
while (currentPos == null);
while (true)
f(currentPos);
}
}.start();
while (true)
currentPos = new Point(currentPos.x+1, currentPos.y+1);
}
}
Some samples of the output I'm getting:
$ java A
145281 145282
$ java A
141373 141374
$ java A
49251 49252
$ java A
47007 47008
$ java A
47427 47428
$ java A
154800 154801
$ java A
34822 34823
$ java A
127271 127272
$ java A
63650 63651
Since there isn't any floating point arithmetic here, and we all know signed integers behave well on overflow in Java, I'd think there's nothing wrong with this code. However, despite the output indicating that the program didn't reach the exit condition, it reached the exit condition (it was both reached and not reached?). Why?
I've noticed this doesn't happen in some environments. I'm on OpenJDK 6 on 64-bit Linux.
currentPos = new Point(currentPos.x+1, currentPos.y+1);
does a few things, including writing default values tox
andy
(0) and then writing their initial values in the constructor. Since your object is not safely published those 4 write operations can be freely reordered by the compiler / JVM.So from the perspective of the reading thread, it is a legal execution to read
x
with its new value buty
with its default value of 0 for example. By the time you reach theprintln
statement (which by the way is synchronized and therefore does influence the read operations), the variables have their initial values and the program prints the expected values.Marking
currentPos
asvolatile
will ensure safe publication since your object is effectively immutable - if in your real use case the object is mutated after construction,volatile
guarantees won't be enough and you could see an inconsistent object again.Alternatively, you can make the
Point
immutable which will also ensure safe publication, even without usingvolatile
. To achieve immutability, you simply need to markx
andy
final.As a side note and as already mentioned,
synchronized(this) {}
can be treated as a no-op by the JVM (I understand you included it to reproduce the behaviour).You have ordinary memory, the 'currentpos' reference and the Point object and its fields behind it, shared between 2 threads, without synchronisation. Thus, there is no defined ordering between the writes that happen to this memory in the main thread and the reads in the created thread (call it T).
Main thread is doing the following writes (ignoring the initial setup of point, will result in p.x and p.y having default values):
Because there is nothing special about these writes in terms of synchronisation/barriers, the runtime is free to allow the T thread see them occur in any order (the main thread of course always sees writes and reads ordered according to programme order), and occur at any point between the reads in T.
So T is doing:
Given there's no ordering relationships between the writes in main, and the reads in T, there are clearly several ways this can produce your result, as T may see main's write to currentpos before the writes to currentpos.y or currentpos.x:
and so on... There are a number of data races here.
I suspect the flawed assumption here is thinking that the writes that result from this line are made visible across all the threads in the programme order of the thread executing it:
Java makes no such guarantee (it'd be terrible for performance). Something more must be added if your programme needs a guaranteed ordering of the writes relative to reads in other threads. Others have suggested making the x,y fields final, or alternatively making currentpos volatile.
Using final has the advantage that it makes the fields immutable, and thus allows the values to be cached. Using volatile leads to synchronisation on every write and read of currentpos, which might hurt performance.
See chapter 17 of the Java Language Spec for the gory details:http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html
(Initial answer assumed a weaker memory model, as I was not sure the JLS guaranteed volatile was sufficient. Answer edited to reflect comment from assylias, pointing out the Java model is stronger - happens-before is transitive - and so volatile on currentpos also suffices).
You are accessing currentPos twice, and providing no guarantee that it is not updated in between those two accesses.
For example:
You are essentially comparing two different Points.
Note that even making currentPos volatile won't protect you from this, since it's two separate reads by the worker thread.
Add an
method to your points class. This will ensure that only one value of currentPos is used when checking x+1 == y.
You could use an object to synchronize the writes and reads. Otherwise, as others said before, a write to currentPos will occur in the middle of the two reads p.x+1 and p.y.
Since
currentPos
is being changed outside of the thread it should be marked asvolatile
:Without volatile the thread is not guaranteed to read in updates to currentPos that are being made in the main thread. So new values continue to be written for currentPos but thread is continuing to use the previous cached versions for performance reasons. Since only one thread modifies currentPos you can get away without locks which will improve performance.
The results look much different if you read the values only a single time within the thread for use in the comparison and subsequent displaying of them. When I do the following
x
always displays as1
andy
varies between0
and some large integer. I think the behavior of it at this point is somewhat undefined without thevolatile
keyword and it's possible that the JIT compilation of the code is contributing to it acting like this. Also if I comment out the emptysynchronized(this) {}
block then the code works as well and I suspect it is because the locking causes sufficient delay thatcurrentPos
and its fields are reread rather than used from the cache.