I get a NullPointerException in a piece of code that cannot throw it, and I am starting to think I have found a bug in the JRE. I compile with javac 1.8.0_51, and the problem occurs both on JRE 1.8.0_45 and on the latest 1.8.0_60.
The line throwing the exception is inside a loop, which is inside a lambda (closure). We run this closure in Spark 1.4. The line is executed 1-2 million times, and the error is not deterministic: with the same input, it appears about once every 3 or 4 runs.
Here is the relevant piece of code:
JavaRDD .... mapValues(iterable -> {
    LocalDate[] dates = ...
    long[] dateDifferences = ...
    final double[] fooArray = new double[dates.length];
    final double[] barArray = new double[dates.length];
    for (Item item : iterable) {
        final LocalDate myTime = item.getMyTime();
        final int largerIndex = ...
        if (largerIndex == 0) {
            ...
        } else if (largerIndex >= dates.length - 1) {
            ...
        } else {
            final LocalDate largerDate = dates[largerIndex];
            final long daysBetween = ...
            if (daysBetween == 0) {
                ...
            } else {
                double factor = ...
                // * * * NULL POINTER IN NEXT LINE * * * //
                fooArray[largerIndex - 1] += item.getFoo() * factor;
                fooArray[largerIndex] += item.getFoo() * (1 - factor);
                barArray[largerIndex - 1] += item.getBar() * factor;
                barArray[largerIndex] += item.getBar() * (1 - factor);
            }
        }
    }
    return new NewItem(fooArray, barArray);
})
...
I started analysing the code and found that:
- fooArray is never null, since it is created with "new" a few lines above
- largerIndex is a primitive
- item is never null, as it is already used a few lines above
- getFoo() returns a double, with no unboxing
- factor is a primitive
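As a sanity check of the list above, here is a minimal sketch of the only ordinary ways a line of this shape can throw a NullPointerException: a null array reference, a null receiver, or unboxing a null wrapper. The Item class, the NpeSources class and the throwsNpe helper are hypothetical stand-ins written for this illustration, not the real code.

```java
public class NpeSources {
    // Hypothetical stand-in for the Item class used in the snippet above.
    public static class Item {
        private final double foo;
        public Item(double foo) { this.foo = foo; }
        public double getFoo() { return foo; }
    }

    /** Runs the action and reports whether it threw a NullPointerException. */
    public static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        double[] fooArray = new double[4];
        double[] nullArray = null;   // case 1: array reference is null
        Item item = new Item(2.0);
        Item nullItem = null;        // case 2: receiver is null
        Double boxedFactor = null;   // case 3: boxed value is null
        double factor = 0.25;
        int largerIndex = 2;

        System.out.println("null array:    "
                + throwsNpe(() -> nullArray[largerIndex - 1] += 1.0));                        // true
        System.out.println("null receiver: "
                + throwsNpe(() -> fooArray[largerIndex - 1] += nullItem.getFoo() * factor));  // true
        System.out.println("null unboxing: "
                + throwsNpe(() -> fooArray[largerIndex - 1] += item.getFoo() * boxedFactor)); // true
        // Same shape as the failing line: non-null array, non-null item, primitives only.
        System.out.println("all non-null:  "
                + throwsNpe(() -> fooArray[largerIndex - 1] += item.getFoo() * factor));      // false
    }
}
```

None of the three throwing cases applies to the failing line, which is why the exception looks impossible from the source alone.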
I can't run the same input locally and debug it, because this runs on a Spark cluster. So I added some debug println calls just before the throwing line (foo here is the fooArray from the snippet above):
System.out.println("largerIndex: " + largerIndex);
System.out.println("foo: " + Arrays.toString(foo));
System.out.println("foo[1]: " + foo[1]);
System.out.println("largerIndex-1: " + (largerIndex-1));
System.out.println("foo[largerIndex]: " + foo[largerIndex]);
System.out.println("foo[largerIndex - 1]: " + foo[largerIndex - 1]);
And this is the output:
largerIndex: 2
foo: [0.0, 0.0, 0.0, 0.0, ...]
foo[1]: 0.0
largerIndex-1: 1
foo[largerIndex]: 0.0
15/10/01 12:36:11 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 7.0 (TID 17162, host13): java.lang.NullPointerException
at my.class.lambda$mymethod$87560622$1(MyFile.java:150)
at my.other.class.$$Lambda$306/764841389.call(Unknown Source)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1027)
...
So it is the read of foo[largerIndex - 1] that throws the NullPointerException. Note that the following also throws it:
int idx = largerIndex - 1;
foo[idx] += ...;
But not the following:
foo[1] += ....;
I had a look at the bytecode in the class file and found nothing strange. The references to foo and largerIndex are correctly on the stack before iconst_1, isub, and daload.
I'm posting this to collect ideas before concluding that it is a JRE bug. Has anyone experienced the same class of problem using Spark, or lambda functions in general? Is it possible to run the JVM with some debug flag to help me understand this strange behavior? Or should I file the issue somewhere?
This looks very similar to the problem described here (a JIT problem): http://kingsfleet.blogspot.com.br/2014/11/but-thats-impossible-or-finding-out.html
Your observations, that it does not occur every time and that it is "impossible" when reading the code, are exactly what is described there. To confirm it, use the command-line option that excludes your method from being JIT-compiled (you need to specify the correct class and method name; the one below is only a placeholder):
-XX:CompileCommand=exclude,my/package/MyClass.myMethod
Or switch the JIT off completely using
-Xint
which may be too drastic.
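Since the code runs on a cluster, it may be worth checking that such an option actually reached the JVM executing the lambda. The following is a minimal sketch, assuming a HotSpot JVM; the JitFlagCheck class and its method names are made up for this example. On Spark, the options would presumably be passed to the executors through the spark.executor.extraJavaOptions setting, and describe() could be printed once from inside the closure.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class JitFlagCheck {
    /** Returns the JVM input arguments that restrict JIT compilation (-Xint or a CompileCommand). */
    public static List<String> jitRelatedArgs() {
        List<String> result = new ArrayList<>();
        for (String arg : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
            if (arg.equals("-Xint") || arg.startsWith("-XX:CompileCommand")) {
                result.add(arg);
            }
        }
        return result;
    }

    /** One-line summary suitable for a debug println inside the task. */
    public static String describe() {
        // On HotSpot, java.vm.info typically reads "mixed mode" or "interpreted mode".
        return "vm.info=" + System.getProperty("java.vm.info")
                + ", jitArgs=" + jitRelatedArgs();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the NullPointerException disappears while the exclusion is reported as active, that points towards the JIT compiler rather than the code itself.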