OutOfMemoryError while doing docx comparison using

2019-01-28 23:21发布

问题:

in my application i am comparing two docx files and creating one html comparison file, when i tried with below 150 or 170 lines of file then there is no issue, while i try to compare the big files like 200 lines or more than that then that time it showing the

java.lang.OutOfMemoryError: Java heap space error,

can any one please help on this?

回答1:

You are running out of memory because you aren't using the Docx4jDriver class, which makes the diff problem more tractable by doing a paragraph level diff first.

Use it like so:

        Body newerBody = ((Document)newerPackage.getMainDocumentPart().getJaxbElement()).getBody();
        Body olderBody = ((Document)olderPackage.getMainDocumentPart().getJaxbElement()).getBody();

        // 2. Do the differencing
        java.io.StringWriter sw = new java.io.StringWriter();
        Docx4jDriver.diff( XmlUtils.marshaltoW3CDomDocument(newerBody).getDocumentElement(),
                        XmlUtils.marshaltoW3CDomDocument(olderBody).getDocumentElement(),
                           sw);

        // 3. Get the result
        String contentStr = sw.toString();
        System.out.println("Result: \n\n " + contentStr);
        Body newBody = (Body) org.docx4j.XmlUtils
                        .unmarshalString(contentStr);


回答2:

you can make the heap space bigger with -Xmx and -Xmx as VM Arguments

Here more about Heap Size Tuning or here Heap size



回答3:

Try increasing the Java heap size using the command line arguments -Xmx<maximum heap size> and -Xms<minimum heap size>.

Also in your code, test that you actually have increased the heap size with the following:

long heapSize = Runtime.getRuntime().totalMemory();
System.out.println("Heap Size = " + heapSize);

Do this before calling Differencer.diff on line 117.



回答4:

Try profiling your application rather than making assumptions or intelligent guess. You can use visualvm or console that ships with the Jdk.

Also, you can take a heap dump of your application using jmap and then use either jhat or eclipse mat (I prefer this, google it out) to see what's consuming the memory and look out for any unusual behavior.



标签: java docx4j