Merge Two XML Files in Java

Posted 2019-01-04 02:26

I have two XML files of similar structure which I wish to merge into one file. Currently I am using EL4J XML Merge, which I came across in this tutorial. However, it does not merge as I expect. The main problem is that it is not merging the <results> elements from both files into one, that is, a single <results> element that contains results 1, 2, 3 and 4. Instead it just discards either 1 and 2 or 3 and 4, depending on which file is merged first.

I would be grateful if anyone with experience of XML Merge could tell me what I might be doing wrong. Alternatively, does anyone know of a good XML API for Java that is capable of merging the files as I require?

Many thanks in advance for your help.

Edit:

I could really do with some good suggestions on this, so I have added a bounty. I've tried jdigital's suggestion but am still having issues with XML Merge.

Below is a sample of the type of structure of XML files that I am trying to merge.

<run xmloutputversion="1.02">
    <info type="a" />
    <debugging level="0" />
    <host starttime="1237144741" endtime="1237144751">
        <status state="up" reason="somereason"/>
        <something avalue="test" test="alpha" />
        <target>
            <system name="computer" />
        </target>
        <results>
            <result id="1">
                <state value="test" />
                <service value="gamma" />
            </result>
            <result id="2">
                <state value="test4" />
                <service value="gamma4" />
            </result>
        </results>
        <times something="0" />
    </host>
    <runstats>
        <finished time="1237144751" timestr="Sun Mar 15 19:19:11 2009"/>
        <result total="0" />
    </runstats>
</run>

<run xmloutputversion="1.02">
    <info type="b" />
    <debugging level="0" />
    <host starttime="1237144741" endtime="1237144751">
        <status state="down" reason="somereason"/>
        <something avalue="test" test="alpha" />
        <target>
            <system name="computer" />
        </target>
        <results>
            <result id="3">
                <state value="testagain" />
                <service value="gamma2" />
            </result>
            <result id="4">
                <state value="testagain4" />
                <service value="gamma4" />
            </result>
        </results>
        <times something="0" />
    </host>
    <runstats>
        <finished time="1237144751" timestr="Sun Mar 15 19:19:11 2009"/>
        <result total="0" />
    </runstats>
</run>

Expected output

<run xmloutputversion="1.02">
    <info type="a" />
    <debugging level="0" />
    <host starttime="1237144741" endtime="1237144751">
        <status state="down" reason="somereason"/>
        <status state="up" reason="somereason"/>
        <something avalue="test" test="alpha" />
        <target>
            <system name="computer" />
        </target>
        <results>
            <result id="1">
                <state value="test" />
                <service value="gamma" />
            </result>
            <result id="2">
                <state value="test4" />
                <service value="gamma4" />
            </result>
            <result id="3">
                <state value="testagain" />
                <service value="gamma2" />
            </result>
            <result id="4">
                <state value="testagain4" />
                <service value="gamma4" />
            </result>
        </results>
        <times something="0" />
    </host>
    <runstats>
        <finished time="1237144751" timestr="Sun Mar 15 19:19:11 2009"/>
        <result total="0" />
    </runstats>
</run>

12 Answers
爷的心禁止访问
#2 · 2019-01-04 03:02

So, you're only interested in merging the 'results' elements? Everything else is ignored? The fact that input0 has an <info type="a"/> and input1 has an <info type="b"/> and the expected result has an <info type="a"/> seems to suggest this.

If you're not worried about scaling and you want to solve this problem quickly then I would suggest writing a problem-specific bit of code that uses a simple library like JDOM to consider the inputs and write the output result.

Attempting to write a generic tool that was 'smart' enough to handle all of the possible merge cases would be pretty time consuming - you'd have to expose a configuration capability to define merge rules. If you know exactly what your data is going to look like and you know exactly how the merge needs to be executed then I would imagine your algorithm would walk each XML input and write to a single XML output.
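
As a rough illustration of such problem-specific code, here is a minimal sketch that copies every <result> from the second file's <results> into the first file's <results>. It uses the JDK's built-in DOM API rather than JDOM so that it runs without extra dependencies. The class name ResultsMerger and the method merge are made up for this example, and it assumes each input contains exactly one <results> element. Everything else in the output is taken from the first file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ResultsMerger {

    /** Appends every <result> of the second document to the <results> of the first. */
    public static String merge(String firstXml, String secondXml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document first = builder.parse(new InputSource(new StringReader(firstXml)));
        Document second = builder.parse(new InputSource(new StringReader(secondXml)));

        // Assumption: each input contains exactly one <results> element.
        Element target = (Element) first.getElementsByTagName("results").item(0);
        Element source = (Element) second.getElementsByTagName("results").item(0);

        // Only direct <result> children are copied, so the unrelated
        // <result total="0"/> under <runstats> is never touched.
        NodeList children = source.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE && "result".equals(child.getNodeName())) {
                // importNode makes a deep copy owned by the first document.
                target.appendChild(first.importNode(child, true));
            }
        }

        // Serialize the modified first document back to a string.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(first), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String a = "<run><host><results><result id=\"1\"/><result id=\"2\"/></results></host></run>";
        String b = "<run><host><results><result id=\"3\"/><result id=\"4\"/></results></host></run>";
        System.out.println(merge(a, b));
    }
}
```

Merging the two <status> elements, as the expected output in the question shows, would need a similar extra step under <host>. This sketch leaves that out.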

▲ chillily
#3 · 2019-01-04 03:03

You might be able to write a Java app that deserializes the XML documents into objects and then "merges" the individual objects programmatically into a collection. You can then serialize the collection object back out to an XML file with everything "merged."

The JAXB API has some tools that can convert an XML document/schema into Java classes. The "xjc" tool might be able to do this, although I can't remember if you can create classes directly from the XML doc, or if you have to generate a schema first. There are tools out there that can generate a schema from an XML doc.

Hope this helps... not sure if this is what you were looking for.

何必那么认真
#4 · 2019-01-04 03:04

In addition to using StAX (which does make sense), it would probably be easier with StaxMate (http://staxmate.codehaus.org/Tutorial). Just create two SMInputCursors, plus child cursors if need be, and then do a typical merge-sort-style pass over the two cursors. This is similar to traversing DOM documents in a recursive-descent manner.

Explosion°爆炸
#5 · 2019-01-04 03:04

Sometimes you just need to concatenate XML files with a similar structure into one, like this:

File xml1:

<root>
    <level1>
        ...
    </level1>
    <!--many records-->
    <level1>
        ...
    </level1>
</root>

File xml2:

<root>
    <level1>
        ...
    </level1>
    <!--many records-->
    <level1>
        ...
    </level1>
</root>

In this case, the following procedure, which uses the JDOM 2 library, can help:

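// Requires JDOM 2 (org.jdom2) on the classpath.
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.input.SAXBuilder;
import org.jdom2.output.Format;
import org.jdom2.output.XMLOutputter;

// Appends the record elements of fSource to the root of fDest, then rewrites fDest.
void concatXML(Path fSource, Path fDest) {
    SAXBuilder jdomBuilder = new SAXBuilder();
    try {
        Document jdomSource = jdomBuilder.build(fSource.toFile());
        Document jdomDest = jdomBuilder.build(fDest.toFile());
        Element sourceRoot = jdomSource.getRootElement();
        Element destRoot = jdomDest.getRootElement();

        // Take the record name from the first child element instead of
        // relying on a fixed content index, which breaks without leading whitespace.
        String recordName = sourceRoot.getChildren().get(0).getName();

        // Copy the live child list before detaching, so it is not modified while iterating.
        List<Element> elems = new ArrayList<>(sourceRoot.getChildren(recordName));
        for (Element elem : elems) {
            elem.detach();
        }
        destRoot.addContent(elems);

        // Set the format before any output so both outputs are pretty-printed.
        XMLOutputter xmlOutput = new XMLOutputter(Format.getPrettyFormat());
        xmlOutput.output(jdomDest, System.out);

        // try-with-resources flushes and closes the writer; without it, buffered output can be lost.
        try (Writer writer = Files.newBufferedWriter(fDest, StandardCharsets.UTF_8)) {
            xmlOutput.output(jdomDest, writer);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}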
我欲成王,谁敢阻挡
#6 · 2019-01-04 03:07

It might help if you were explicit about the result that you're interested in achieving. Is this what you're asking for?

Doc A:

<root>
  <a/>
  <b>
    <c/>
  </b>
</root>

Doc B:

<root>
  <d/>
</root>

Merged Result:

<root>
  <a/>
  <b>
    <c/>
  </b>
  <d/>
</root>

Are you worried about scaling for large documents?

The easiest way to implement this in Java is to use a streaming XML parser (google for 'java StAX'). If you use the javax.xml.stream library you'll find that the XMLEventWriter has a convenient method, XMLEventWriter#add(XMLEvent). All you have to do is loop over the top-level elements in each document and add them to your writer using this method to generate your merged result. The only funky part is implementing the reader logic so that 'add' is called only for the top-level nodes and their content, not for each input's own root element.

I recently implemented this method if you need hints.
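
For illustration, here is a minimal sketch of that streaming approach using only javax.xml.stream from the JDK. It is not the implementation mentioned above. The class name StaxConcat and the method concat are made up for this example. It assumes the inputs use no namespaces, and it writes a fresh root element, so any attributes on the inputs' own root elements are dropped.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class StaxConcat {

    /** Writes one root element holding the root content of every input document, in order. */
    public static String concat(String rootName, String... documents) throws XMLStreamException {
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        XMLEventFactory events = XMLEventFactory.newInstance();
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);

        // Assumption: no namespaces, so prefix and namespace URI are empty.
        writer.add(events.createStartElement("", "", rootName));

        for (String document : documents) {
            XMLEventReader reader = inputFactory.createXMLEventReader(new StringReader(document));
            int depth = 0; // number of currently open elements in this input
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    depth++;
                    if (depth == 1) {
                        continue; // skip the input's own root start tag
                    }
                } else if (event.isEndElement()) {
                    depth--;
                    if (depth == 0) {
                        continue; // skip the input's own root end tag
                    }
                }
                // Forward only events that occur inside the input's root element.
                if (depth >= 1) {
                    writer.add(event);
                }
            }
            reader.close();
        }

        writer.add(events.createEndElement("", "", rootName));
        writer.flush();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String docA = "<root><a/><b><c/></b></root>";
        String docB = "<root><d/></root>";
        System.out.println(concat("root", docA, docB));
    }
}
```

Because events are forwarded one at a time, memory use stays roughly constant regardless of document size, which is the main reason to prefer this over a DOM-based merge for large inputs.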

疯言疯语
#7 · 2019-01-04 03:12

This is how the configuration should look using XML Merge:

action.default=MERGE

xpath.info=/run/info
action.info=PRESERVE

xpath.result=/run/host/results/result
action.result=MERGE
matcher.result=ID

You have to set the ID matcher for the //result node and the PRESERVE action for the //info node. Also beware that the keys in the .properties file that XML Merge uses are case-sensitive: you have to use "xpath", not "XPath".

Don't forget to pass the -config parameter, like this:

java -cp lib\xmlmerge-full.jar; ch.elca.el4j.services.xmlmerge.tool.XmlMergeTool -config xmlmerge.properties example1.xml example2.xml 