I have already spent two days trying to sort out this error. I even tried the workaround suggested in several Stack Overflow posts, "-Djava.util.Arrays.useLegacyMergeSort=true", but it doesn't work either.
Here are the details of my command and the error it returns:
Command:
hadoop jar CloudBrush.jar -Djava.awt.headless=true -Djava.util.Arrays.useLegacyMergeSort=true -reads /Ec10k -asm Ec10k_Brush -k 21 -readlen 36
Error:
Error: java.lang.IllegalArgumentException: Comparison method violates its general contract!
at java.util.TimSort.mergeHi(TimSort.java:895)
at java.util.TimSort.mergeAt(TimSort.java:512)
at java.util.TimSort.mergeCollapse(TimSort.java:437)
at java.util.TimSort.sort(TimSort.java:241)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.ArrayList.sort(ArrayList.java:1454)
at java.util.Collections.sort(Collections.java:175)
at Brush.VerifyOverlap$VerifyOverlapReducer.reduce(VerifyOverlap.java:252)
at Brush.VerifyOverlap$VerifyOverlapReducer.reduce(VerifyOverlap.java:1)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
at Brush.VerifyOverlap.run(VerifyOverlap.java:381)
at Brush.BrushAssembler.buildOverlap(BrushAssembler.java:326)
at Brush.BrushAssembler.run(BrushAssembler.java:838)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at Brush.BrushAssembler.main(BrushAssembler.java:913)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
This is the Comparator:
class OvelapSizeComparator implements Comparator {
public int compare(Object element1, Object element2) {
OverlapInfo obj1 = (OverlapInfo) element1;
OverlapInfo obj2 = (OverlapInfo) element2;
if ((int)(obj1.overlap_size - obj2.overlap_size) >= 0) {
return -1;
} else {
return 1;
}
}
}
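The real problem is that your OvelapSizeComparator [sic] is defective. If the two objects' overlap_size values are equal, it returns -1 when it should return 0, so compare(a, b) and compare(b, a) are both negative for equal elements, which breaks the Comparator contract. And if they're not equal, it returns a value whose sign is the reverse of the usual ascending convention.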
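To see the defect in isolation, here is a minimal sketch that runs the same comparison logic on two equal elements. The nested OverlapInfo class is a hypothetical stand-in for the real one, and its overlap_size field is assumed to be an int (the original's (int) cast suggests it may be a different type), so treat this as an illustration rather than the project's actual code.

```java
import java.util.Comparator;

class BrokenComparatorDemo {
    // Hypothetical stand-in for the question's OverlapInfo; the int field type is an assumption.
    static class OverlapInfo {
        final int overlap_size;
        OverlapInfo(int overlap_size) { this.overlap_size = overlap_size; }
    }

    // Same branching as the comparator in the question, written with generics.
    static class BrokenOverlapSizeComparator implements Comparator<OverlapInfo> {
        @Override
        public int compare(OverlapInfo obj1, OverlapInfo obj2) {
            if (obj1.overlap_size - obj2.overlap_size >= 0) {
                return -1; // also taken when the sizes are equal
            } else {
                return 1;
            }
        }
    }

    // True when compare(a, b) and compare(b, a) have opposite signs (or are both 0),
    // which is what the Comparator contract requires.
    static boolean isAntisymmetric(Comparator<OverlapInfo> c, OverlapInfo a, OverlapInfo b) {
        return Integer.signum(c.compare(a, b)) == -Integer.signum(c.compare(b, a));
    }

    public static void main(String[] args) {
        Comparator<OverlapInfo> c = new BrokenOverlapSizeComparator();
        OverlapInfo a = new OverlapInfo(10);
        OverlapInfo b = new OverlapInfo(10);
        System.out.println(c.compare(a, b));          // -1
        System.out.println(c.compare(b, a));          // -1
        System.out.println(isAntisymmetric(c, a, b)); // false
    }
}
```

Both calls claim that their first argument sorts first, which is the kind of inconsistency TimSort can trip over while merging.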
To fix it, replace this:
if ((int)(obj1.overlap_size - obj2.overlap_size) >= 0) {
return -1;
} else {
return 1;
}
...with this:
return obj1.overlap_size - obj2.overlap_size;
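Two caveats about the one-line subtraction: it sorts smallest-first, whereas the original branches put larger overlaps first, and a plain subtraction can overflow for extreme int values. If largest-first was the intent, a sketch along these lines keeps that order and returns 0 on ties. The nested OverlapInfo class is again a hypothetical stand-in with an assumed int field, and sortedSizes is a helper invented for the demonstration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class FixedComparatorSketch {
    // Hypothetical stand-in for the question's OverlapInfo; the int field type is an assumption.
    static class OverlapInfo {
        final int overlap_size;
        OverlapInfo(int overlap_size) { this.overlap_size = overlap_size; }
    }

    static class OverlapSizeComparator implements Comparator<OverlapInfo> {
        @Override
        public int compare(OverlapInfo obj1, OverlapInfo obj2) {
            // Arguments are swapped so larger overlaps sort first.
            // Integer.compare returns 0 for equal values and cannot overflow.
            return Integer.compare(obj2.overlap_size, obj1.overlap_size);
        }
    }

    // Demo helper (not part of the original code): sorts the given sizes with the comparator.
    static List<Integer> sortedSizes(int... sizes) {
        List<OverlapInfo> list = new ArrayList<>();
        for (int s : sizes) {
            list.add(new OverlapInfo(s));
        }
        Collections.sort(list, new OverlapSizeComparator());
        List<Integer> out = new ArrayList<>();
        for (OverlapInfo o : list) {
            out.add(o.overlap_size);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sortedSizes(5, 20, 5, 12)); // [20, 12, 5, 5]
    }
}
```

If overlap_size turns out to be a long or a double, Long.compare or Double.compare would be the analogous call.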
You can reproduce this error by running the main method of Test.java on Java 7 or later. In summary, the code tries to sort 40 Person objects using a compareTo method that is not transitive:
//Person.java
public class Person implements Comparable<Person> {
public String name;
public int age ;
public int salary;
@Override
public int compareTo(Person o) {
if(o instanceof Person){
int ret=0;
if(age == 25 && ((Person)o).age ==27) ret = 1;
else if(age == 27 && ((Person)o).age ==29) ret = 1;
else if(age == 25 && ((Person)o).age ==29) ret = -1;
else{
if( age < ((Person)o).age) ret = -1;
if(age > ((Person)o).age) ret = 1;
if(salary < ((Person)o).salary) ret = -1;
if(salary > ((Person)o).salary) ret = 1;
}
return ret;
}
return 0;
}
@Override
public String toString(){
return "name="+name+":age="+age+";";
}
}
//Test.java
import java.util.Arrays;
public class Test {
public static void main(String args[]) {
Test t = new Test();
t.sortPersons(args);
}
public void sortPersons(String args[]) {
Person p1 = new Person();
p1.age = 25;
p1.name = "ABC";
Person p2 = new Person();
p2.age = 29;
p2.name = "ABZ";
Person p3 = new Person();
p3.age = 27;
p3.name = "AZ";
Person p4 = new Person();
p4.age = 27;
p4.name = "AZ";
Person p5 = new Person();
//p5.age = 22;
//p5.name="ZZ";
Person[] p = new Person[40];
p[0] = p2;
p[1] = p3;
p[2] = p4;
p[3] = p1;
p[4] = p5;
for (int i = 1; i < 8; i++) {
p[i * 5] = p[0];
p[i * 5 + 1] = p[1];
p[i * 5 + 2] = p[2];
p[i * 5 + 3] = p[3];
p[i * 5 + 4] = p[4];
}
System.out.println("\nSortingInput\n");
Arrays.sort(p);
System.out.println("\nSorting complete\n");
}
}
I got the error fixed. I thought it was a problem with Hadoop, but I was wrong: it is a problem with the Java version, which we upgraded along with Hadoop. In Java 7 and Java 8, Arrays.sort() on object arrays uses TimSort. Arrays with fewer than 32 elements are handled by a binary insertion sort with no merge step, so the problem only shows up once there are 32 or more elements to sort. TimSort's merge step can detect a comparison method that breaks the transitive property.
If compare(x, y) > 0 and compare(y, z) > 0, then compare(x, z) must also be greater than 0. Otherwise, a "java.lang.IllegalArgumentException: Comparison method violates its general contract!" error can be thrown.
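The transitivity rule can be checked by brute force over a handful of sample values. The sketch below is only an illustration: hasTransitivityViolation and CYCLIC_AGE are names made up here, and CYCLIC_AGE mirrors just the age rules of the Person.compareTo example (salary is ignored).

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class TransitivityCheck {
    // Returns true if some triple (x, y, z) has compare(x, y) > 0 and compare(y, z) > 0
    // but compare(x, z) <= 0, i.e. the comparator is not transitive on these samples.
    static <T> boolean hasTransitivityViolation(List<T> samples, Comparator<? super T> c) {
        for (T x : samples) {
            for (T y : samples) {
                for (T z : samples) {
                    if (c.compare(x, y) > 0 && c.compare(y, z) > 0 && c.compare(x, z) <= 0) {
                        return true;
                    }
                }
            }
        }
        return false;
    }

    // Age-only version of the Person rules: 25 > 27 and 27 > 29, yet 25 < 29.
    static final Comparator<Integer> CYCLIC_AGE = (a, b) -> {
        if (a == 25 && b == 27) return 1;
        if (a == 27 && b == 29) return 1;
        if (a == 25 && b == 29) return -1;
        return Integer.compare(a, b);
    };

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(25, 27, 29);
        System.out.println(hasTransitivityViolation(ages, CYCLIC_AGE));                       // true
        System.out.println(hasTransitivityViolation(ages, Comparator.<Integer>naturalOrder())); // false
    }
}
```

A check like this is cubic in the number of samples, so it suits a unit test with a few representative values rather than production data.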
Either change your compare method so that it is transitive, or fall back to the older sort by setting "java.util.Arrays.useLegacyMergeSort" to true in "Map Task Java Opts Base" or "Reduce Task Java Opts Base"; the setting then applies to all JVMs started for map or reduce tasks.
For Hadoop 2.6.0-cdh5.4.2, you can add this setting to the map and reduce tasks by adding
-D mapreduce.map.java.opts="-Djava.util.Arrays.useLegacyMergeSort=true"
-D mapreduce.reduce.java.opts="-Djava.util.Arrays.useLegacyMergeSort=true"
or via code
job.getConfiguration().set("mapreduce.map.java.opts","-Djava.util.Arrays.useLegacyMergeSort=true");
job.getConfiguration().set("mapreduce.reduce.java.opts","-Djava.util.Arrays.useLegacyMergeSort=true");