Relative XPath node selection with C# XmlDocument

Posted 2019-08-10 12:19

Imagine the following XML document:

<root>
    <person_data>
        <person>
            <name>John</name>
            <age>35</age>
        </person>
        <person>
            <name>Jim</name>
            <age>50</age>
        </person>
    </person_data>
    <locations>
        <location>
            <name>John</name>
            <country>USA</country>
        </location>
        <location>
            <name>Jim</name>
            <country>Japan</country>
        </location>
    </locations>
</root>

I then select the person node for Jim:

XmlNode personNode = doc.SelectSingleNode("//person[name = 'Jim']");

And now from this node with a single XPath select I would like to retrieve Jim's location node. Something like:

XmlNode locationNode = personNode.SelectSingleNode("//location[name = {reference to personNode}/name]");

Since I am selecting relative to personNode, it would be handy if I could reference it inside the select. Is this possible? Is there a way to make that connection?

Sure, I could add a few extra lines of code, put the name into a variable and use that in the XPath string, but that is not what I am asking.

5 answers
The star\"
Reply #2 · 2019-08-10 12:56

I am not sure why you want to reference the location from personNode. Since the name already exists in the location node, you can use it directly to get the location node corresponding to 'Jim'.

XPath would be: //location[name = 'Jim']
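
As a minimal sketch of that direct lookup, assuming the sample document from the question is inlined as a string (the class, constant and method names here are illustrative, not from the original post):

```csharp
using System;
using System.Xml;

public static class DirectLookupDemo
{
    // Compact copy of the sample document from the question.
    public const string SampleXml =
        "<root>" +
        "<person_data>" +
        "<person><name>John</name><age>35</age></person>" +
        "<person><name>Jim</name><age>50</age></person>" +
        "</person_data>" +
        "<locations>" +
        "<location><name>John</name><country>USA</country></location>" +
        "<location><name>Jim</name><country>Japan</country></location>" +
        "</locations>" +
        "</root>";

    // Selects the location by the key value itself, without going through the person node.
    public static string FindCountryOfJim()
    {
        var doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        XmlNode locationNode = doc.SelectSingleNode("//location[name = 'Jim']");
        // The string indexer returns the first child element with that name, or null.
        return locationNode?["country"]?.InnerText;
    }

    public static void Main()
    {
        Console.WriteLine(FindCountryOfJim()); // prints "Japan"
    }
}
```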
别忘想泡老子
Reply #3 · 2019-08-10 13:01

You are not selecting the location node based on the person node; rather, you are selecting it based on the value of one of its child nodes. That value is just a string, and in this case it can be used to formulate a predicate that selects the location node by the value within ("Jim").

姐就是有狂的资本
Reply #4 · 2019-08-10 13:07
XmlNode locationNode = personNode.SelectSingleNode(".."); 

Should do it.
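
For reference, here is a small check of what ".." resolves to, assuming the sample document from the question is inlined as a string (class and method names are illustrative). In XPath, ".." selects the parent of the context node, so from the Jim person node it should yield the person_data element. Reaching a location from there still needs a path back down through locations, plus a test on the name:

```csharp
using System;
using System.Xml;

public static class ParentAxisDemo
{
    // Compact copy of the sample document from the question.
    const string SampleXml =
        "<root>" +
        "<person_data>" +
        "<person><name>John</name><age>35</age></person>" +
        "<person><name>Jim</name><age>50</age></person>" +
        "</person_data>" +
        "<locations>" +
        "<location><name>John</name><country>USA</country></location>" +
        "<location><name>Jim</name><country>Japan</country></location>" +
        "</locations>" +
        "</root>";

    static XmlNode SelectJim()
    {
        var doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        return doc.SelectSingleNode("//person[name = 'Jim']");
    }

    // Name of the node that ".." selects when evaluated from the person node.
    public static string ParentName()
    {
        return SelectJim().SelectSingleNode("..").Name;
    }

    // Relative path: up to person_data, up to root, then down into locations.
    // The name is still a literal here; the path alone does not link the two nodes.
    public static string CountryViaRelativePath()
    {
        XmlNode location = SelectJim()
            .SelectSingleNode("../../locations/location[name = 'Jim']");
        return location["country"].InnerText;
    }

    public static void Main()
    {
        Console.WriteLine(ParentName());             // prints "person_data"
        Console.WriteLine(CountryViaRelativePath()); // prints "Japan"
    }
}
```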

何必那么认真
Reply #5 · 2019-08-10 13:10

I appreciate that your real XML document is more complex than your example, but one thing does strike me about it. It resembles a relational data store containing the following:

  • A person table with two columns - name and age.
  • A location table with two columns - name and country.
  • A 1-1 relationship between the two tables joining on the two name columns.

With that in mind, the XPath becomes obvious. You just select on the primary key value of the table whose data you want.

//location[name = 'Jim']

I know that aJ already proposed that solution, and it was rejected, but if you generalise the idea to the real XML schema, you get this:

//real_2nd_table_name[real_2nd_pk_column_name_1 = real_1st_pk_column_value_1 and real_2nd_pk_column_name_2 = real_1st_pk_column_value_2 and real_2nd_pk_column_name_3 = real_1st_pk_column_value_3 ...]

In other words:

  1. You already know the PK values used to find the row in the first table.
  2. You know how the two tables are related.
  3. Therefore you should be able to work out how to express a PK query on the second table using the same values that you would have used to select the row in the first table.
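
The steps above could be sketched roughly as follows. BuildKeyXPath and Quote are hypothetical helpers introduced for illustration only, and the sketch assumes the key values are plain strings compared for equality:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyLookup
{
    // Hypothetical helper: builds //element[col1 = 'v1' and col2 = 'v2' ...]
    // from the same key values that identified the row in the first table.
    public static string BuildKeyXPath(
        string elementName,
        IEnumerable<KeyValuePair<string, string>> keys)
    {
        IEnumerable<string> conditions =
            keys.Select(k => k.Key + " = " + Quote(k.Value));
        string predicate = string.Join(" and ", conditions);
        return "//" + elementName + "[" + predicate + "]";
    }

    // XPath 1.0 string literals have no escape sequence, so pick whichever
    // quote character does not occur in the value.
    static string Quote(string value)
    {
        if (!value.Contains("'")) return "'" + value + "'";
        if (!value.Contains("\"")) return "\"" + value + "\"";
        throw new ArgumentException(
            "Value contains both quote characters; it would need concat().");
    }

    public static void Main()
    {
        var keys = new[] { new KeyValuePair<string, string>("name", "Jim") };
        Console.WriteLine(BuildKeyXPath("location", keys));
        // prints: //location[name = 'Jim']
    }
}
```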
爱情/是我丢掉的垃圾
Reply #6 · 2019-08-10 13:15

This is not very efficient, but it should work. The larger the file gets, the slower this will be.

string xpath = "//location[name = //person[name='Jim']/name]";
XmlNode locationNode = doc.SelectSingleNode(xpath);

Here is why this is inefficient:

  • The "//" shorthand causes a document-wide scan of all nodes.
  • The "[]" predicate runs in a loop, once for each <location> matched by "//location".
  • The second "//" causes a document-wide scan again, this time once for each <location>.

This means you get quadratic O(n²) worst-case performance, which is bad. If there are n <person>s and n <location>s in your document, the inner document-wide scan can run once per <location>, so on the order of n × n nodes get visited. All out of one innocent-looking XPath expression.

I'd recommend against that approach. A two-step selection (first, find the person, then the location) will perform better.
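
A minimal sketch of the two-step selection, assuming the sample document from the question and explicit paths instead of "//". FindLocationOf is an illustrative name, and the sketch assumes the name contains no apostrophe, since it is spliced into an XPath string literal:

```csharp
using System;
using System.Xml;

public static class TwoStepLookup
{
    // Compact copy of the sample document from the question.
    public const string SampleXml =
        "<root>" +
        "<person_data>" +
        "<person><name>John</name><age>35</age></person>" +
        "<person><name>Jim</name><age>50</age></person>" +
        "</person_data>" +
        "<locations>" +
        "<location><name>John</name><country>USA</country></location>" +
        "<location><name>Jim</name><country>Japan</country></location>" +
        "</locations>" +
        "</root>";

    // Step 2: read the key from the person node, then run one targeted query.
    public static XmlNode FindLocationOf(XmlDocument doc, XmlNode personNode)
    {
        string name = personNode["name"].InnerText;
        if (name.Contains("'"))
            throw new ArgumentException("Names with apostrophes are not handled in this sketch.");
        return doc.SelectSingleNode(
            "/root/locations/location[name = '" + name + "']");
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(SampleXml);

        // Step 1: find the person.
        XmlNode personNode =
            doc.SelectSingleNode("/root/person_data/person[name = 'Jim']");

        XmlNode locationNode = FindLocationOf(doc, personNode);
        Console.WriteLine(locationNode["country"].InnerText); // prints "Japan"
    }
}
```

Each query walks its own branch of the document once, so the nested scan from the single-expression version does not occur.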
