XML file containing multiple root elements

Posted 2020-03-27 08:13

Question:

I have a file which contains multiple sets of root elements. How can I extract the root element one by one?

This is my XML:

<Person>
    <empid></empid>
    <name></name>
</Person>
<Person>
    <empid></empid>
    <name></name>
</Person>
<Person>
    <empid></empid>
    <name></name>
</Person>

How can I extract one set of Person at a time?

Answer 1:

Use java.io.SequenceInputStream to wrap the file contents in a synthetic root element, so that the XML parser sees a single well-formed document:

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class MultiRootXML {
    public static void main(String[] args) throws Exception {
        // Sandwich the file between a synthetic <root> and </root> so the
        // parser sees one well-formed document. This assumes persons.xml has
        // no XML declaration or DOCTYPE of its own.
        List<InputStream> streams = Arrays.asList(
                new ByteArrayInputStream("<root>".getBytes(StandardCharsets.UTF_8)),
                new FileInputStream("persons.xml"),
                new ByteArrayInputStream("</root>".getBytes(StandardCharsets.UTF_8))
        );
        // Closing the sequence also closes the remaining underlying streams.
        try (InputStream is = new SequenceInputStream(Collections.enumeration(streams))) {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(is);
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip the whitespace text nodes that sit between elements.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    System.out.println("person: " + child);
                }
            }
        }
    }
}


Answer 2:

You cannot parse your file using an XML parser because your file is not well-formed XML: an XML document must have exactly one root element.

You have to treat it as text, repair it to be well-formed, and then you can parse it with an XML parser.
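One way to do the repair described above is to read the file as a string, wrap it in a single enclosing element, and hand the result to a DOM parser. The following is a minimal sketch, not a definitive implementation: the class name RepairMultiRootXml, the method parsePersons, and the wrapper element name root are made up for illustration, and it assumes the file is UTF-8 and contains no XML declaration or DOCTYPE.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class RepairMultiRootXml {
    // Wraps the fragment text in a synthetic <root> element (hypothetical name)
    // and returns every Person element found in the repaired document.
    public static List<Element> parsePersons(String fragments) throws Exception {
        String wellFormed = "<root>" + fragments + "</root>";
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(wellFormed)));
        NodeList nodes = doc.getDocumentElement().getElementsByTagName("Person");
        List<Element> persons = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            persons.add((Element) nodes.item(i));
        }
        return persons;
    }

    public static void main(String[] args) throws Exception {
        // Treat the file as plain text first; the UTF-8 encoding is an assumption.
        String text = new String(
                Files.readAllBytes(Paths.get("persons.xml")), StandardCharsets.UTF_8);
        for (Element person : parsePersons(text)) {
            String empid = person.getElementsByTagName("empid").item(0).getTextContent();
            String name = person.getElementsByTagName("name").item(0).getTextContent();
            System.out.println("empid=" + empid + ", name=" + name);
        }
    }
}
```

This loads the whole file into memory, which is fine for small inputs; for large files the stream-based wrapping in Answer 1 avoids the intermediate string.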



Answer 3:

If your XML is valid, use a SAX or DOM parser. Please consult the XML Developer's Kit Programmer's Guide for more details.
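As a rough illustration of the SAX route, the sketch below handles one Person at a time as the parser reaches each closing tag. It uses the JDK's built-in javax.xml.parsers SAX API rather than any particular vendor kit, and it only works once the input has a single root element. The class name SaxPersonReader, the method readPersons, and the "empid:name" output format are hypothetical.

```java
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

import javax.xml.parsers.SAXParserFactory;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class SaxPersonReader {
    // Parses a well-formed document and returns one "empid:name" string
    // per Person element, built as each </Person> end tag is reached.
    public static List<String> readPersons(String xml) throws Exception {
        List<String> persons = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private String empid;
            private String name;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                text.setLength(0); // start collecting text for the new element
                if ("Person".equals(qName)) {
                    empid = null;
                    name = null;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                switch (qName) {
                    case "empid": empid = text.toString().trim(); break;
                    case "name": name = text.toString().trim(); break;
                    case "Person": persons.add(empid + ":" + name); break; // one Person is complete
                    default: break;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return persons;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><Person><empid>1</empid><name>Ann</name></Person>"
                + "<Person><empid>2</empid><name>Bob</name></Person></root>";
        for (String person : readPersons(xml)) {
            System.out.println(person);
        }
    }
}
```

Because SAX is event-driven, the per-Person work could be done directly in the "Person" branch of endElement instead of collecting results in a list, which keeps memory use flat for large files.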