Imagine the following XML document:
<root>
<person_data>
<person>
<name>John</name>
<age>35</age>
</person>
<person>
<name>Jim</name>
<age>50</age>
</person>
</person_data>
<locations>
<location>
<name>John</name>
<country>USA</country>
</location>
<location>
<name>Jim</name>
<country>Japan</country>
</location>
</locations>
</root>
I then select the person node for Jim:
XmlNode personNode = doc.SelectSingleNode("//person[name = 'Jim']");
And now from this node with a single XPath select I would like to retrieve Jim's location node. Something like:
XmlNode locationNode = personNode.SelectSingleNode("//location[name = {reference to personNode}/name]");
Since I am selecting based on the personNode, it would be handy if I could reference it in the select. Is this possible? Is the connection there?
Sure, I could put in a few extra lines of code, put the name into a variable and use this in the XPath string, but that is not what I am asking.
This is not very efficient, but it should work. The larger the file gets, the slower it will be.
string xpath = "//location[name = //person[name='Jim']/name]";
XmlNode locationNode = doc.SelectSingleNode(xpath);
Here is why this is inefficient:
- The "//" shorthand causes a document-wide scan of all nodes.
- The "[]" predicate runs in a loop, once for each <location> matched by "//location".
- The second "//" causes a document-wide scan again, this time once for each <location>.
This means you get quadratic O(n²) worst-case performance, which is bad. If there are n <person>s and n <location>s in your document, the inner document-wide scan runs once per <location>, so on the order of n x n nodes get visited. All out of one innocent-looking XPath expression.
I'd recommend against that approach. A two-step selection (first, find the person, then the location) will perform better.
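A minimal sketch of that two-step selection, written as a small standalone program. The absolute paths (/root/person_data/person and /root/locations/location) are taken from the sample document in the question. The variable personName and the assumption that names never contain a single quote are mine, not part of the original question.

```csharp
using System;
using System.Xml;

// Sample document from the question, condensed onto fewer lines.
var doc = new XmlDocument();
doc.LoadXml(@"<root>
  <person_data>
    <person><name>John</name><age>35</age></person>
    <person><name>Jim</name><age>50</age></person>
  </person_data>
  <locations>
    <location><name>John</name><country>USA</country></location>
    <location><name>Jim</name><country>Japan</country></location>
  </locations>
</root>");

// Step 1: find the person once, using an explicit path instead of "//".
XmlNode personNode = doc.SelectSingleNode("/root/person_data/person[name = 'Jim']");

// Step 2: read the name from that node and use it in a second, direct query.
// Assumption: the name contains no single quote; otherwise it would need
// escaping before being spliced into the XPath string.
string personName = personNode.SelectSingleNode("name").InnerText;
XmlNode locationNode = doc.SelectSingleNode(
    "/root/locations/location[name = '" + personName + "']");

Console.WriteLine(locationNode.SelectSingleNode("country").InnerText); // prints "Japan"
```

Each query walks one known branch of the document once, so the work grows with the number of <person> plus the number of <location> elements rather than with their product.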
You are not selecting the location node based on the person node, rather you are selecting it based on the value of the node. The value is just a string and in this case, it can be used to formulate a predicate condition that selects the location node based on the value within ("Jim").
I am not sure why you want to refer to location from personNode. Since the name already exists in the location node, you can use it directly to get the location node corresponding to 'Jim'.
XPath would be: //location[name = 'Jim']
XmlNode locationNode = personNode.SelectSingleNode("..");
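This steps from the person node up to its parent. Note that in the sample document the parent is <person_data>, so to actually reach Jim's location node the path has to continue, for example ../../locations/location[name = 'Jim'].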
I appreciate that your real XML document is more complex than your example, but one thing does strike me about it. It resembles a relational data store containing the following:
- A person table with two columns - name and age.
- A location table with two columns - name and country.
- A 1-1 relationship between the two tables joining on the two name columns.
With that in mind, the XPath becomes obvious. You just select on the primary key value of the table whose data you want.
//location[name = 'Jim']
I know that aJ already proposed that solution, and it was rejected, but if you generalise the idea to the real XML schema, you get this:
//real_2nd_table_name[real_2nd_pk_column_name_1 = real_1st_pk_column_value_1 and real_2nd_pk_column_name_2 = real_1st_pk_column_value_2 and real_2nd_pk_column_name_3 = real_1st_pk_column_value_3 ...]
In other words:
- You already know the PK values used to find the row in the first table.
- You know how the two tables are related.
- Therefore you should be able to work out how to express a PK query on the second table using the same values that you would have used to select the row in the first table.
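As a rough illustration of the generalised form, here is a sketch that builds such a key query from a list of column/value pairs. BuildKeyQuery is a hypothetical helper of mine, not a .NET API, and treating country as a second key column is purely for demonstration. It also assumes the values contain no single quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;

// Sample document from the question, condensed onto fewer lines.
var doc = new XmlDocument();
doc.LoadXml(@"<root>
  <person_data>
    <person><name>John</name><age>35</age></person>
    <person><name>Jim</name><age>50</age></person>
  </person_data>
  <locations>
    <location><name>John</name><country>USA</country></location>
    <location><name>Jim</name><country>Japan</country></location>
  </locations>
</root>");

// Key values already known from looking up the row in the first "table".
// The second column is only here to show the multi-column case.
var key = new[] { (Column: "name", Value: "Jim"), (Column: "country", Value: "Japan") };

string xpath = BuildKeyQuery("location", key);
Console.WriteLine(xpath); // prints "//location[name = 'Jim' and country = 'Japan']"

XmlNode locationNode = doc.SelectSingleNode(xpath);
Console.WriteLine(locationNode.SelectSingleNode("country").InnerText); // prints "Japan"

// Hypothetical helper: emits one "column = 'value'" test per key column,
// joined with "and", inside a single predicate on the given element name.
// Assumption: values contain no single quotes.
static string BuildKeyQuery(string element, IEnumerable<(string Column, string Value)> key)
{
    var tests = key.Select(k => k.Column + " = '" + k.Value + "'");
    return "//" + element + "[" + string.Join(" and ", tests) + "]";
}
```

The "//" prefix is kept to match the generic expression above; if the position of the second table in the real document is known, an explicit path avoids the document-wide scan.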