Get XML only immediate children elements by name

Posted 2019-01-17 19:37

My question is: How can I get elements directly under a specific parent element when there are other elements with the same name as a "grandchild" of the parent element.

I'm using the Java DOM library to parse XML elements and I'm running into trouble. Here's a small portion of the XML I'm using:

<notifications>
  <notification>
    <groups>
      <group name="zip-group.zip" zip="true">
        <file location="C:\valid\directory\" />
        <file location="C:\another\valid\file.doc" />
        <file location="C:\valid\file\here.txt" />
      </group>
    </groups>
    <file location="C:\valid\file.txt" />
    <file location="C:\valid\file.xml" />
    <file location="C:\valid\file.doc" />
  </notification>
</notifications>

As you can see, there are two places you can place the <file> element. Either in groups or outside groups. I really want it structured this way because it's more user-friendly.

Now, whenever I call notificationElement.getElementsByTagName("file"); it gives me all the <file> elements, including those under the <group> element. I handle each of these kinds of files differently, so this functionality is not desirable.

I've thought of two solutions:

  1. Get the parent element of the file element and deal with it accordingly (depending on whether it's <notification> or <group>).
  2. Rename the second <file> element to avoid confusion.

Neither of those solutions is as desirable as just leaving things the way they are and getting only the <file> elements which are direct children of <notification> elements.

I'm open to IMPO comments and answers about the "best" way to do this, but I'm really interested in DOM solutions because that's what the rest of this project is using. Thanks.

8 answers
做个烂人
#2 · 2019-01-17 19:46

There is a nice LINQ solution (VB.NET, so it applies to .NET's XmlNode rather than the Java DOM):

For Each child As XmlElement In From cn As XmlNode In xe.ChildNodes Where cn.Name = "file"
    ...
Next
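
For readers staying on the Java DOM, a rough analogue of that query can be sketched with streams over getChildNodes(). This is only an illustration: the class name, the helper names and the sample XML with its location values are made up for the example.

```java
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildStreamSketch {

    // Hypothetical helper: direct child elements of `parent` named `tagName`.
    static List<Element> directChildren(Element parent, String tagName) {
        NodeList children = parent.getChildNodes(); // immediate children only
        return IntStream.range(0, children.getLength())
                .mapToObj(children::item)
                .filter(n -> n.getNodeType() == Node.ELEMENT_NODE)
                .filter(n -> tagName.equals(n.getNodeName()))
                .map(n -> (Element) n)
                .collect(Collectors.toList());
    }

    // Parses `xml` and returns the location attribute of each <file>
    // that is a direct child of the root element.
    static List<String> locations(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return directChildren(root, "file").stream()
                    .map(el -> el.getAttribute("location"))
                    .collect(Collectors.toList());
        } catch (Exception ex) {
            // wrapped only to keep the sketch short
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Made-up sample: one grouped file, two direct files.
        System.out.println(locations(
                "<notification>"
                + "<groups><group><file location='in-group.txt'/></group></groups>"
                + "<file location='direct-1.txt'/>"
                + "<file location='direct-2.txt'/>"
                + "</notification>"));
        // prints [direct-1.txt, direct-2.txt]
    }
}
```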
爱情/是我丢掉的垃圾
#3 · 2019-01-17 19:48

I had the same problem in one of my projects and wrote a little function that returns a List<Element> containing only the immediate children. For each node returned by getElementsByTagName, it checks whether its parentNode is actually the node whose children we are looking for:

public static List<Element> getDirectChildsByTag(Element el, String sTagName) {
    NodeList allChilds = el.getElementsByTagName(sTagName);
    List<Element> res = new ArrayList<>();

    for (int i = 0; i < allChilds.getLength(); i++) {
        if (allChilds.item(i).getParentNode().equals(el))
            res.add((Element) allChilds.item(i));
    }

    return res;
}

The accepted answer by kentcdodds will return wrong results (grandchildren, for example) if there is a child node that is itself called "notification", e.g. if the "group" element were named "notification" instead. I was facing that setup in my project, which is why I came up with my function.
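
To make that pitfall concrete, here is a small sketch comparing two parent checks on a document where a <notification> is nested inside another <notification>. The name-based check is only an assumed form of such a test (the accepted answer's code is not shown in this thread), and the class name, method names and sample XML are invented for the illustration. The identity check uses Node.isSameNode, where the function above uses equals.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ParentCheckSketch {

    // Assumed name-based check (not taken from this thread): keeps any
    // descendant whose parent merely has the same tag name as `el`.
    static List<String> byParentName(Element el, String tag) {
        List<String> res = new ArrayList<>();
        NodeList all = el.getElementsByTagName(tag);
        for (int i = 0; i < all.getLength(); i++) {
            Node n = all.item(i);
            if (n.getParentNode().getNodeName().equals(el.getNodeName())) {
                res.add(((Element) n).getAttribute("location"));
            }
        }
        return res;
    }

    // Identity-based check: keeps only descendants whose parent IS `el`.
    static List<String> byParentIdentity(Element el, String tag) {
        List<String> res = new ArrayList<>();
        NodeList all = el.getElementsByTagName(tag);
        for (int i = 0; i < all.getLength(); i++) {
            Node n = all.item(i);
            if (n.getParentNode().isSameNode(el)) {
                res.add(((Element) n).getAttribute("location"));
            }
        }
        return res;
    }

    // Parses a string and returns its root element (checked exceptions wrapped).
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Made-up sample: the inner element shares the outer element's name.
        Element outer = parseRoot(
                "<notification>"
                + "<notification><file location='nested'/></notification>"
                + "<file location='direct'/>"
                + "</notification>");
        System.out.println(byParentName(outer, "file"));     // prints [nested, direct]
        System.out.println(byParentIdentity(outer, "file")); // prints [direct]
    }
}
```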

Viruses.
#4 · 2019-01-17 19:50

You can use XPath for this, with two paths that select the two kinds of <file> element so you can process them differently.

To get the <file> nodes that are direct children of <notification>, use //notification/file; for the ones in <group>, use //groups/group/file.

This is a simple sample:

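import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SO10689900 {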
    public static void main(String[] args) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader("<notifications>\n" + 
                "  <notification>\n" + 
                "    <groups>\n" + 
                "      <group name=\"zip-group.zip\" zip=\"true\">\n" + 
                "        <file location=\"C:\\valid\\directory\\\" />\n" + 
                "        <file location=\"C:\\this\\file\\doesn't\\exist.grr\" />\n" + 
                "        <file location=\"C:\\valid\\file\\here.txt\" />\n" + 
                "      </group>\n" + 
                "    </groups>\n" + 
                "    <file location=\"C:\\valid\\file.txt\" />\n" + 
                "    <file location=\"C:\\valid\\file.xml\" />\n" + 
                "    <file location=\"C:\\valid\\file.doc\" />\n" + 
                "  </notification>\n" + 
                "</notifications>")));
        XPath xpath = XPathFactory.newInstance().newXPath();
        XPathExpression expr1 = xpath.compile("//notification/file");
        NodeList nodes = (NodeList)expr1.evaluate(doc, XPathConstants.NODESET);
        System.out.println("Files in //notification");
        printFiles(nodes);

        XPathExpression expr2 = xpath.compile("//groups/group/file");
        NodeList nodes2 = (NodeList)expr2.evaluate(doc, XPathConstants.NODESET);
        System.out.println("Files in //groups/group");
        printFiles(nodes2);
    }

    public static void printFiles(NodeList nodes) {
        for (int i = 0; i < nodes.getLength(); ++i) {
            Node file = nodes.item(i);
            System.out.println(file.getAttributes().getNamedItem("location"));
        }
    }
}

It should output:

Files in //notification
location="C:\valid\file.txt"
location="C:\valid\file.xml"
location="C:\valid\file.doc"
Files in //groups/group
location="C:\valid\directory\"
location="C:\this\file\doesn't\exist.grr"
location="C:\valid\file\here.txt"
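
If you already hold the <notification> element, as in the question, a relative expression evaluated against that element may be simpler than an absolute path: the expression file is short for child::file, so it selects only direct children of the context node. The following is a sketch under that assumption; the class name, method name and sample values are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RelativeXPathSketch {

    // Returns the location of every <file> directly under the FIRST
    // <notification> element found in `xml`.
    static List<String> directFileLocations(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node notification = doc.getElementsByTagName("notification").item(0);

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Relative path: evaluated with `notification` as the context node.
            NodeList files = (NodeList) xpath.evaluate(
                    "file", notification, XPathConstants.NODESET);

            List<String> locations = new ArrayList<>();
            for (int i = 0; i < files.getLength(); i++) {
                locations.add(((Element) files.item(i)).getAttribute("location"));
            }
            return locations;
        } catch (Exception ex) {
            // wrapped only to keep the sketch short
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Made-up sample values, shaped like the XML in the question.
        System.out.println(directFileLocations(
                "<notifications><notification>"
                + "<groups><group><file location='in-group.txt'/></group></groups>"
                + "<file location='direct-1.txt'/>"
                + "<file location='direct-2.txt'/>"
                + "</notification></notifications>"));
        // prints [direct-1.txt, direct-2.txt]
    }
}
```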
我命由我不由天
#5 · 2019-01-17 19:50

If you stick with the DOM API:

NodeList nodeList = doc.getElementsByTagName("notification")
    .item(0).getChildNodes();

// walk the immediate children (first generation) only
for (int i = 0; i < nodeList.getLength(); i++) {
    switch (nodeList.item(i).getNodeType()) {
        case Node.ELEMENT_NODE:
            Element element = (Element) nodeList.item(i);
            System.out.println("element name: " + element.getNodeName());
            // check the element name
            if (element.getNodeName().equalsIgnoreCase("file")) {
                // do something with your "file" element (first-generation child)
                System.out.println("element name: "
                    + element.getNodeName() + " attribute: "
                    + element.getAttribute("location"));
            }
            break;
    }
}

Our first task is to get a <notification> element (in this case the first one, item(0)) and all of its children:

NodeList nodeList = doc.getElementsByTagName("notification")
    .item(0).getChildNodes();

(Later you can loop over every <notification> element instead of taking only the first.)

For every child of <notification>:

for (int i = 0; i < nodeList.getLength(); i++)

you first get its type in order to see whether it is an element:

switch (nodeList.item(i).getNodeType()) {
    case Node.ELEMENT_NODE:
        //.......
        break;  
}

If it is, you have an element that is a direct child of <notification>, not a grandchild,

and you can check whether it is a "file" element:

if (element.getNodeName().equalsIgnoreCase("file")) {
    // do something with your "file" element (first-generation child)
    System.out.println("element name: "
        + element.getNodeName() + " attribute: "
        + element.getAttribute("location"));
}

and the output is:

element name: groups
element name: file
element name: file attribute: C:\valid\file.txt
element name: file
element name: file attribute: C:\valid\file.xml
element name: file
element name: file attribute: C:\valid\file.doc
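
The steps above can also be packaged as a small reusable method and applied to every <notification> rather than only item(0). This is a sketch rather than a definitive implementation: the class and method names and the sample XML are made up, and it compares names with equals (XML names are case-sensitive) where the code above uses equalsIgnoreCase.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AllNotificationsSketch {

    // Hypothetical helper: element children of `parent` with the given name.
    static List<Element> childElementsByName(Element parent, String name) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes(); // first generation only
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && name.equals(child.getNodeName())) {
                result.add((Element) child);
            }
        }
        return result;
    }

    // Collects the location of each <file> that is a direct child of
    // ANY <notification> element in the document.
    static List<String> directFileLocations(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList notifications = doc.getElementsByTagName("notification");
            List<String> locations = new ArrayList<>();
            for (int i = 0; i < notifications.getLength(); i++) {
                Element notification = (Element) notifications.item(i);
                for (Element file : childElementsByName(notification, "file")) {
                    locations.add(file.getAttribute("location"));
                }
            }
            return locations;
        } catch (Exception ex) {
            // wrapped only to keep the sketch short
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Made-up sample with two <notification> elements.
        System.out.println(directFileLocations(
                "<notifications>"
                + "<notification>"
                + "<groups><group><file location='grouped'/></group></groups>"
                + "<file location='a'/>"
                + "</notification>"
                + "<notification><file location='b'/></notification>"
                + "</notifications>"));
        // prints [a, b]
    }
}
```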
何必那么认真
#6 · 2019-01-17 19:50

I encountered a related problem where I needed to process just the immediate child nodes even though the treatment of all "file" nodes is similar. For my solution, I compare the Element's parent node with the node that is being processed in order to determine whether the Element is an immediate child.

NodeList fileNodes = parentNode.getElementsByTagName("file");
for (int i = 0; i < fileNodes.getLength(); i++) {
    if (parentNode.equals(fileNodes.item(i).getParentNode())) {
        if (fileNodes.item(i).getNodeType() == Node.ELEMENT_NODE) {
            // process the child node...
        }
    }
}
混吃等死
#7 · 2019-01-17 19:56

I realise you found something of a solution to this in May, @kentcdodds, but I just had a fairly similar problem and I think I've found a solution (perhaps for my use case, if not for yours).

A very simplified example of my XML format is shown below:

<?xml version="1.0" encoding="utf-8"?>
<rels>
    <relationship num="1">
        <relationship num="2">
            <relationship num="2.1"/>
            <relationship num="2.2"/>
        </relationship>
    </relationship>
    <relationship num="1.1"/>
    <relationship num="1.2"/>

</rels>

As you can hopefully see from this snippet, the format I want can have N levels of nesting for [relationship] nodes. The problem I had was that a name-based lookup such as getElementsByTagName() returns matching nodes from all levels of the hierarchy, without any hint as to node depth. (Node.getChildNodes() itself returns only immediate children, but of every node type.)

Looking at the API for a while, I noticed there are actually two other methods that might be of some use: Node.getFirstChild() and Node.getNextSibling().

Together, these two methods seemed to offer everything required to get all of the immediate child elements of a Node. The following JSP code should give a fairly basic idea of how to implement this. Sorry for the JSP; I'm rolling this into a bean now but didn't have time to create a fully working version from picked-apart code.

<%@page import="javax.xml.parsers.DocumentBuilderFactory,
                javax.xml.parsers.DocumentBuilder,
                org.w3c.dom.Document,
                org.w3c.dom.NodeList,
                org.w3c.dom.Node,
                org.w3c.dom.Element,
                java.io.File" %><% 
try {

    File fXmlFile = new File(application.getRealPath("/") + "/utils/forms-testbench/dom-test/test.xml");
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(fXmlFile);
    doc.getDocumentElement().normalize();

    Element docEl = doc.getDocumentElement();
    // Start at the first child (so it is not skipped) and stop when there are no more siblings.
    Node childNode = docEl.getFirstChild();
    while (childNode != null) {
        if (childNode.getNodeType() == Node.ELEMENT_NODE) {
            Element childElement = (Element) childNode;
            out.println("NODE num:-" + childElement.getAttribute("num") + "<br/>\n");
        }
        childNode = childNode.getNextSibling();
    }

} catch (Exception e) {
    out.println("ERROR:- " + e.toString() + "<br/>\n");
}

%>

This code would give the following output, showing only direct child elements of the initial root node.

NODE num:-1
NODE num:-1.1
NODE num:-1.2
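
Outside a JSP, the same getFirstChild()/getNextSibling() walk can be written as a plain for loop. The following is a minimal sketch using a condensed copy of the XML above; the class and method names are invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SiblingWalkSketch {

    // Returns the num attribute of each element that is a direct child
    // of the document's root element.
    static List<String> directChildNums(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            List<String> nums = new ArrayList<>();
            // Walk the sibling chain; text and comment nodes are skipped.
            for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    nums.add(((Element) n).getAttribute("num"));
                }
            }
            return nums;
        } catch (Exception ex) {
            // wrapped only to keep the sketch short
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(directChildNums(
                "<rels>"
                + "<relationship num='1'>"
                + "<relationship num='2'>"
                + "<relationship num='2.1'/><relationship num='2.2'/>"
                + "</relationship>"
                + "</relationship>"
                + "<relationship num='1.1'/>"
                + "<relationship num='1.2'/>"
                + "</rels>"));
        // prints [1, 1.1, 1.2]
    }
}
```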

Hope this helps someone anyway. Cheers for the initial post.
