In my application I compare two .docx files and generate an HTML comparison file. With files of up to about 150 to 170 lines there is no problem, but when I compare larger files (200 lines or more) I get this error:
java.lang.OutOfMemoryError: Java heap space
Can anyone please help with this?
You are running out of memory because you aren't using the Docx4jDriver class, which makes the diff problem more tractable by doing a paragraph level diff first.
Use it like so:
// 1. Get the body of each document
Body newerBody = ((Document) newerPackage.getMainDocumentPart().getJaxbElement()).getBody();
Body olderBody = ((Document) olderPackage.getMainDocumentPart().getJaxbElement()).getBody();

// 2. Do the differencing
java.io.StringWriter sw = new java.io.StringWriter();
Docx4jDriver.diff(
        XmlUtils.marshaltoW3CDomDocument(newerBody).getDocumentElement(),
        XmlUtils.marshaltoW3CDomDocument(olderBody).getDocumentElement(),
        sw);

// 3. Get the result and unmarshal it back into a Body
String contentStr = sw.toString();
System.out.println("Result: \n\n " + contentStr);
Body newBody = (Body) org.docx4j.XmlUtils.unmarshalString(contentStr);
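To illustrate why a paragraph-level pass makes the problem more tractable, here is a minimal sketch in plain Java. It is not how Docx4jDriver is implemented; the class and method names (ParagraphFirstDiff, lcsTable, diff, diffWords) are hypothetical and the diff is a simple longest-common-subsequence one. The point is only that the comparison table grows with the product of the two input lengths, so aligning whole paragraphs first and then diffing words only inside the paragraphs that changed keeps every table small.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative two-pass diff: paragraphs first, then words inside changed paragraphs. */
class ParagraphFirstDiff {

    /** LCS length table; its memory use grows with a.size() * b.size(). */
    static int[][] lcsTable(List<String> a, List<String> b) {
        int[][] t = new int[a.size() + 1][b.size() + 1];
        for (int i = a.size() - 1; i >= 0; i--) {
            for (int j = b.size() - 1; j >= 0; j--) {
                t[i][j] = a.get(i).equals(b.get(j))
                        ? t[i + 1][j + 1] + 1
                        : Math.max(t[i + 1][j], t[i][j + 1]);
            }
        }
        return t;
    }

    /** Edit script: "  x" means kept, "- x" removed from older, "+ x" added in newer. */
    static List<String> diff(List<String> older, List<String> newer) {
        int[][] t = lcsTable(older, newer);
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < older.size() && j < newer.size()) {
            if (older.get(i).equals(newer.get(j))) {
                out.add("  " + older.get(i));
                i++;
                j++;
            } else if (t[i + 1][j] >= t[i][j + 1]) {
                out.add("- " + older.get(i++));
            } else {
                out.add("+ " + newer.get(j++));
            }
        }
        while (i < older.size()) out.add("- " + older.get(i++));
        while (j < newer.size()) out.add("+ " + newer.get(j++));
        return out;
    }

    /** Second pass: word-level diff of a single pair of paragraphs. */
    static List<String> diffWords(String older, String newer) {
        return diff(Arrays.asList(older.split("\\s+")), Arrays.asList(newer.split("\\s+")));
    }

    public static void main(String[] args) {
        List<String> older = List.of("Intro text", "Old middle paragraph", "Closing text");
        List<String> newer = List.of("Intro text", "New middle paragraph", "Closing text");

        // Pass 1: compare whole paragraphs.
        List<String> paragraphs = diff(older, newer);
        paragraphs.forEach(System.out::println);

        // Pass 2: look inside a removed paragraph that is directly followed by an added one.
        for (int k = 0; k + 1 < paragraphs.size(); k++) {
            String removed = paragraphs.get(k);
            String added = paragraphs.get(k + 1);
            if (removed.startsWith("- ") && added.startsWith("+ ")) {
                System.out.println("Word-level changes:");
                diffWords(removed.substring(2), added.substring(2)).forEach(System.out::println);
            }
        }
    }
}
```

Unchanged paragraphs never reach the word-level pass, which is where most of the saving comes from on long documents with few edits.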
You can make the heap bigger by passing -Xms and -Xmx as VM arguments.
For more detail, see Heap Size Tuning or Heap size.
Try increasing the Java heap size using the command-line arguments -Xmx<maximum heap size> and -Xms<minimum heap size>.
Also, check in your code that the heap size has actually increased. Runtime.maxMemory() reports the limit set by -Xmx, whereas totalMemory() only reports what the JVM has reserved so far:
long maxHeap = Runtime.getRuntime().maxMemory();
System.out.println("Max heap size = " + maxHeap + " bytes");
Do this before calling Differencer.diff on line 117.
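If it helps, the check can be expanded into a small standalone class that prints all three Runtime figures in megabytes. This is only a sketch; the class name HeapReport and its helper methods are made up for illustration.

```java
/** Prints the JVM's heap figures so the effect of -Xms / -Xmx can be verified. */
class HeapReport {

    /** Converts a byte count to whole megabytes. */
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Builds a one-line summary of the current heap figures. */
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        return "max=" + toMegabytes(rt.maxMemory()) + " MB, "      // upper limit (-Xmx)
             + "total=" + toMegabytes(rt.totalMemory()) + " MB, "  // currently reserved by the JVM
             + "free=" + toMegabytes(rt.freeMemory()) + " MB";     // unused part of the reserved heap
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it with, for example, java -Xms256m -Xmx1024m HeapReport should show a max value close to the -Xmx setting; if it does not, the arguments are probably not reaching the JVM that runs your application.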
Try profiling your application rather than relying on assumptions or educated guesses. You can use VisualVM or JConsole, which ship with the JDK.
You can also take a heap dump of your application with jmap and then open it in jhat or Eclipse MAT (my preference) to see what is consuming the memory and to look for any unusual behavior.
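As a rough first check before reaching for a profiler, heap usage can also be sampled from inside the application with the standard java.lang.management API. The sketch below is only an approximation (a garbage collection during the task can make the difference too small or even negative), and the class name HeapUsageProbe and its methods are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

/** Samples used heap before and after a task as a crude memory indicator. */
class HeapUsageProbe {

    /** Returns the number of heap bytes currently in use. */
    static long usedHeapBytes() {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Runs the task and returns the change in used heap, in bytes. */
    static long measure(Runnable task) {
        long before = usedHeapBytes();
        task.run();
        return usedHeapBytes() - before;
    }

    public static void main(String[] args) {
        // Keep references so the allocations stay reachable while we measure.
        List<int[]> retained = new ArrayList<>();
        long delta = measure(() -> {
            for (int i = 0; i < 16; i++) {
                retained.add(new int[256 * 1024]); // roughly 1 MB per array
            }
        });
        System.out.println("Approximate heap growth: " + delta + " bytes"
                + " (" + retained.size() + " arrays retained)");
    }
}
```

In the comparison code, the Runnable passed to measure would wrap the diff call; a heap dump analysed in a tool such as Eclipse MAT is still the way to find out which objects are responsible.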