I need to compute diffs between Java strings, and I would like to be able to rebuild a later version of a string from the original string plus the stored diffs. Has anyone done this in Java? Which library do you use? Something like this:
String a1; // This can be a long text
String a2; // e.g. the above text with spelling corrections
String a3; // e.g. the above text with spelling corrections and an additional sentence
Diff diff = new Diff(); // hypothetical API, this is what I am looking for
String differences_a1_a2 = diff.getDifferences(a1, a2);
String differences_a2_a3 = diff.getDifferences(a2, a3);
String[] diffs = new String[]{a1, differences_a1_a2, differences_a2_a3};
String new_a3 = diff.build(diffs);
a3.equals(new_a3); // this should be true
This library seems to do the trick: google-diff-match-patch. It can create a patch string from the differences and lets you reapply that patch.
Edit: another option might be java-diff-utils: https://code.google.com/p/java-diff-utils/
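Both libraries are built around the same round trip: compute a patch from two versions, store it, and apply it to the older version later. As a dependency-free illustration of that round trip, here is a minimal sketch. It is not the API or patch format of either library: the class `SimpleDelta` and its methods are made up for this example, and it only records the common prefix, the common suffix and the replaced middle, so it is far cruder than a real diff.

```java
import java.util.List;

/** Hypothetical helper, not part of any library: stores only the changed middle of a string. */
final class SimpleDelta {
    private final int prefixLen;      // number of chars shared at the start
    private final int suffixLen;      // number of chars shared at the end
    private final String replacement; // text that replaces the middle of the original

    private SimpleDelta(int prefixLen, int suffixLen, String replacement) {
        this.prefixLen = prefixLen;
        this.suffixLen = suffixLen;
        this.replacement = replacement;
    }

    /** Computes the delta that turns {@code from} into {@code to}. */
    static SimpleDelta between(String from, String to) {
        int max = Math.min(from.length(), to.length());
        int p = 0;
        while (p < max && from.charAt(p) == to.charAt(p)) {
            p++;
        }
        int s = 0; // the suffix must not overlap the prefix, hence max - p
        while (s < max - p
                && from.charAt(from.length() - 1 - s) == to.charAt(to.length() - 1 - s)) {
            s++;
        }
        return new SimpleDelta(p, s, to.substring(p, to.length() - s));
    }

    /** Applies this delta to the string it was computed from. */
    String applyTo(String from) {
        return from.substring(0, prefixLen)
                + replacement
                + from.substring(from.length() - suffixLen);
    }

    /** Rebuilds the latest version from the original and a chain of deltas. */
    static String rebuild(String original, List<SimpleDelta> deltas) {
        String current = original;
        for (SimpleDelta delta : deltas) {
            current = delta.applyTo(current);
        }
        return current;
    }
}
```

With a1, a2 and a3 as in the question, `SimpleDelta.rebuild(a1, Arrays.asList(SimpleDelta.between(a1, a2), SimpleDelta.between(a2, a3)))` returns a string equal to a3. A real diff library finds many small changes instead of one replaced block, which matters when edits are scattered through a long text.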
Apache Commons Lang has a string difference method in StringUtils:
import org.apache.commons.lang.StringUtils;
StringUtils.difference("foobar", "foo");
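Note that this is not a full diff: per the Commons Lang Javadoc, `StringUtils.difference` returns the remainder of the second string, starting from where it first differs from the first. A plain-Java sketch of that documented behaviour follows. The class `DifferenceSketch` is hypothetical, it is not the library's source, and it assumes non-null arguments.

```java
/** Hypothetical stand-in illustrating the documented behaviour of StringUtils.difference. */
final class DifferenceSketch {

    /** Index of the first differing character, or -1 if the strings are equal. */
    static int indexOfDifference(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        for (int i = 0; i < limit; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        // No mismatch in the shared part: equal, or one is a prefix of the other
        return a.length() == b.length() ? -1 : limit;
    }

    /** Remainder of {@code b} from the first position where it differs from {@code a}. */
    static String difference(String a, String b) {
        int at = indexOfDifference(a, b);
        return at == -1 ? "" : b.substring(at);
    }
}
```

With these definitions, `difference("foobar", "foo")` yields an empty string, while `difference("foo", "foobar")` yields "bar". The result is not enough to rebuild one string from the other in general.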
As Torsten says, you can use StringUtils, which also computes the Levenshtein distance:
import org.apache.commons.lang.StringUtils;
System.err.println(StringUtils.getLevenshteinDistance("foobar", "bar"));
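The distance is a single number (the minimum count of single-character insertions, deletions and substitutions), so it tells you how different two strings are but not how to rebuild one from the other. For reference, a dependency-free sketch of the standard dynamic-programming computation; `LevenshteinSketch` is a hypothetical name, not the Commons Lang source.

```java
/** Hypothetical class: Levenshtein distance using two rows of the DP table. */
final class LevenshteinSketch {

    static int distance(String s, String t) {
        int[] prev = new int[t.length() + 1]; // distances for the previous prefix of s
        int[] curr = new int[t.length() + 1]; // distances for the current prefix of s
        for (int j = 0; j <= t.length(); j++) {
            prev[j] = j; // turning "" into t[0..j) takes j insertions
        }
        for (int i = 1; i <= s.length(); i++) {
            curr[0] = i; // turning s[0..i) into "" takes i deletions
            for (int j = 1; j <= t.length(); j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insert, delete
                        prev[j - 1] + cost);                    // keep or substitute
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[t.length()];
    }
}
```

For the strings in the call above, `distance("foobar", "bar")` is 3 (delete "f", "o", "o").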
The java-diff-utils library might be useful.
If you need to deal with differences between large amounts of data and want the differences efficiently compressed, you could try a Java implementation of xdelta, which in turn implements RFC 3284 (VCDIFF) for binary diffs (it should work with strings too).
Use the Levenshtein distance and extract the edit log from the matrix the algorithm builds up. The Wikipedia article links to a couple of implementations; I'm sure there's a Java implementation among them.
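Extracting the edit log means filling the full distance matrix and then walking back from the bottom-right corner to the top-left, recording which move produced each cell. A minimal sketch of that idea; the class name `EditScriptSketch` and the textual operation format ("KEEP x", "DELETE x", "INSERT x", "REPLACE x -> y") are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical class: recovers one optimal edit script from the Levenshtein matrix. */
final class EditScriptSketch {

    static List<String> editScript(String s, String t) {
        int m = s.length();
        int n = t.length();
        int[][] d = new int[m + 1][n + 1]; // d[i][j] = distance between s[0..i) and t[0..j)
        for (int i = 0; i <= m; i++) d[i][0] = i;
        for (int j = 0; j <= n; j++) d[0][j] = j;
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + cost);
            }
        }

        // Walk back from d[m][n] to d[0][0], collecting operations in reverse order
        List<String> ops = new ArrayList<>();
        int i = m;
        int j = n;
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0 && s.charAt(i - 1) == t.charAt(j - 1)
                    && d[i][j] == d[i - 1][j - 1]) {
                ops.add("KEEP " + s.charAt(i - 1));
                i--;
                j--;
            } else if (i > 0 && j > 0 && d[i][j] == d[i - 1][j - 1] + 1) {
                ops.add("REPLACE " + s.charAt(i - 1) + " -> " + t.charAt(j - 1));
                i--;
                j--;
            } else if (i > 0 && d[i][j] == d[i - 1][j] + 1) {
                ops.add("DELETE " + s.charAt(i - 1));
                i--;
            } else {
                ops.add("INSERT " + t.charAt(j - 1));
                j--;
            }
        }
        Collections.reverse(ops);
        return ops;
    }
}
```

For example, `editScript("foobar", "bar")` returns DELETE f, DELETE o, DELETE o, KEEP b, KEEP a, KEEP r. The matrix needs memory proportional to the product of the two lengths, so for long texts it is common to diff by line or by word rather than by character.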
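Levenshtein distance is closely related to the Longest Common Subsequence problem (if only insertions and deletions are allowed, the edit distance follows directly from the LCS length), so you might also want to have a look at that.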
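An LCS-based diff keeps the characters of the longest common subsequence and reports everything else as a deletion or an insertion. A small sketch under assumed conventions: the class `LcsDiffSketch` is hypothetical, and the output format (each character prefixed with ' ' for kept, '-' for deleted, '+' for inserted) was chosen only for this example.

```java
/** Hypothetical class: character-level diff driven by a Longest Common Subsequence table. */
final class LcsDiffSketch {

    static String diff(String a, String b) {
        int m = a.length();
        int n = b.length();
        // lcs[i][j] = length of the LCS of the suffixes a[i..) and b[j..)
        int[][] lcs = new int[m + 1][n + 1];
        for (int i = m - 1; i >= 0; i--) {
            for (int j = n - 1; j >= 0; j--) {
                lcs[i][j] = a.charAt(i) == b.charAt(j)
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        // Walk both strings from the start, following the table
        StringBuilder out = new StringBuilder();
        int i = 0;
        int j = 0;
        while (i < m && j < n) {
            if (a.charAt(i) == b.charAt(j)) {
                out.append(' ').append(a.charAt(i)); // part of the common subsequence
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.append('-').append(a.charAt(i)); // only in a
                i++;
            } else {
                out.append('+').append(b.charAt(j)); // only in b
                j++;
            }
        }
        while (i < m) out.append('-').append(a.charAt(i++));
        while (j < n) out.append('+').append(b.charAt(j++));
        return out.toString();
    }
}
```

Reading the kept and inserted characters of the result gives back the second string, and reading the kept and deleted ones gives back the first, which is the property needed to rebuild versions from stored diffs.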
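public class Stringdiff {

    public static void main(String[] args) {
        // "sumsum" continues past the end of "sum", so this prints "sum"
        System.out.println(strcheck("sum", "sumsum"));
    }

    /**
     * Returns the tail of the longer string, starting at the first position
     * where the two strings differ, or "Empty" if they are identical.
     */
    public static String strcheck(String str1, String str2) {
        // Guard against null input
        if (str1 == null || str2 == null) {
            return "Invalid";
        }
        int num = diffcheck1(str1, str2);
        if (num == -1) {
            return "Empty";
        }
        // For equal lengths, the tail of the second string is returned
        return str1.length() > str2.length() ? str1.substring(num) : str2.substring(num);
    }

    /** Returns the index of the first differing character, or -1 if the strings are equal. */
    public static int diffcheck1(String str1, String str2) {
        int limit = Math.min(str1.length(), str2.length());
        for (int i = 0; i < limit; i++) {
            if (str1.charAt(i) != str2.charAt(i)) {
                return i;
            }
        }
        // One string is a prefix of the other: they differ where the shorter one ends
        return str1.length() == str2.length() ? -1 : limit;
    }
}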