Need a complex Xpath using sibling children, ances

Posted 2019-07-13 10:27

Question:

I need to find attribute values based on other values pulled from parent's/grand-parent's sibling's children. I think it's going to take 2 different expressions.

So given the following XML (which is derived from a log file that can be thousands of lines long):

<p:log xmlns:p="urn:NamespaceInfo">
 <p:entries>
   <p:entry timestamp="2012-12-31T09:39:25">
     <p:attributes>
       <p:attrib name="Position" value="1B2" />
       <p:attrib name="Something" value="Something_else" />
     </p:attributes>
     <p:msg>
     </p:msg>
   </p:entry>
   <p:entry timestamp="2012-12-31T09:39:25">
     <p:attributes>
       <p:attrib name="Form" value="FormA" />
     </p:attributes>
     <p:msg>
     </p:msg>
   </p:entry>
   <p:entry timestamp="2012-12-31T09:39:25">
     <p:msg>Successful....</p:msg>
   </p:entry>
   <p:entry timestamp="2012-12-31T12:12:12">
     <p:attributes>
       <p:attrib name="Position" value="1B3" />
       <p:attrib name="Something" value="Something_else" />
     </p:attributes>
     <p:msg>
     </p:msg>
   </p:entry>
   <p:entry timestamp="2012-12-31T09:39:25">
     <p:attributes>
       <p:attrib name="Form" value="FormB" />
     </p:attributes>
     <p:msg>
     </p:msg>
   </p:entry>
   <p:entry timestamp="2012-12-31T09:39:25">
     <p:msg>Processing....</p:msg>
   </p:entry>
   <p:entry timestamp="2012-12-31T09:39:25">
     <p:msg>Error1</p:msg>
   </p:entry>
   <p:entry timestamp="2012-12-31T09:39:25">
    <p:msg>Error1</p:msg>
   </p:entry>
 </p:entries>
     ...
</p:log>
  • (<p:attributes> parent tags can have multiple <p:attrib> child tags)
  • (<p:entry> tags can only have one <p:msg> tag)

First, I need to grab the value of the value attribute whose corresponding name attribute is Position, but only if the grand-parent's sibling p:entry has a child p:msg with the text Error1. It also needs to stay within that section. For instance, I don't want the first occurrence of the Position/value pair, because a new Position/value pair appears before the Error1, even though technically the p:msg with Error1 is in a sibling of both grand-parents.

Next, I need the value of the timestamp attribute on the grand-parent of the p:attrib whose Position value I just grabbed. So, find the position, then find the timestamp attribute value of the grand-parent p:entry tag.

So for this example, I should be able to retrieve the following values only:

1B3

2012-12-31T12:12:12 (the date/time stamps given are arbitrary values. This one is different so you know which one I was referencing).

Kind of confusing, I know. I will also need to make sure I grab just one instance, because I am using XQuery to get the data out of a database and each expression has to result in a single value.

I can get to the first timestamp associated with the p:msg with Error1 with the following: //p:entry[descendant::p:msg='Error1'][1]/@timestamp

but can't seem to get back up the tree to get the other values.

I can get the timestamps of the p:entry elements that have p:attrib grand-children named Position (taking the first one here) with: (//p:entry[descendant::p:attrib[@name='Position']]/@timestamp)[1]

but I can't seem to limit it to just the one that has the 'Error1' following it. I can't base my selection on position. I have to base it first on content.

BONUS QUESTION

How could I do this again on the next instance down the log file? (not just the second Error1 message, the next time down the log file where the Error1 msg shows up for the next 'parent/sibling' match). This may be obvious once I get the answer to the questions above.

Answer 1:

UPDATED:

OK I think I got this. Here's the answer to the first one:

//p:msg[text()="Error1"]/../preceding-sibling::p:entry[./*/p:attrib[@name="Position"]][1]/*/p:attrib[@name="Position"]/@value

This works backward from the p:msg element, which makes it easier to select the nearest (that's the [1] in there; on the reverse preceding-sibling axis, position 1 is the closest sibling) of the preceding p:entry elements that have a grandchild p:attrib whose name is Position.

Getting the timestamp is just a tad simpler:

//p:msg[text()="Error1"]/../preceding-sibling::p:entry[./*/p:attrib[@name="Position"]][1]/@timestamp

Try that out and see what you think.
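
The "nearest preceding Position entry" logic in the two expressions above can also be sketched outside XPath, which may help when checking them against a real log. What follows is a minimal Python sketch using only the standard library. Since xml.etree.ElementTree's XPath subset has no preceding-sibling axis, it walks the entries in document order instead of evaluating the expressions directly. The function name positions_before_error and the trimmed LOG sample are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"p": "urn:NamespaceInfo"}

# Trimmed, hypothetical copy of the log from the question.
LOG = """<p:log xmlns:p="urn:NamespaceInfo">
 <p:entries>
   <p:entry timestamp="2012-12-31T09:39:25">
     <p:attributes>
       <p:attrib name="Position" value="1B2" />
     </p:attributes>
     <p:msg></p:msg>
   </p:entry>
   <p:entry timestamp="2012-12-31T09:39:25">
     <p:msg>Successful....</p:msg>
   </p:entry>
   <p:entry timestamp="2012-12-31T12:12:12">
     <p:attributes>
       <p:attrib name="Position" value="1B3" />
     </p:attributes>
     <p:msg></p:msg>
   </p:entry>
   <p:entry timestamp="2012-12-31T09:39:25">
     <p:msg>Error1</p:msg>
   </p:entry>
   <p:entry timestamp="2012-12-31T09:39:25">
     <p:msg>Error1</p:msg>
   </p:entry>
 </p:entries>
</p:log>"""


def positions_before_error(xml_text, error_text="Error1"):
    """Return one (position, timestamp) pair per section that contains the error."""
    root = ET.fromstring(xml_text)
    results = []
    current = None    # nearest preceding entry that carries a Position attrib
    reported = False  # True once the current section has been recorded
    for entry in root.findall("p:entries/p:entry", NS):
        # Check the message first, so an entry never counts as its own predecessor.
        msg = entry.find("p:msg", NS)
        if (msg is not None and msg.text == error_text
                and current is not None and not reported):
            results.append(current)
            reported = True
        # A new Position entry starts a new section.
        pos = entry.find("p:attributes/p:attrib[@name='Position']", NS)
        if pos is not None:
            current = (pos.get("value"), entry.get("timestamp"))
            reported = False
    return results


print(positions_before_error(LOG))  # [('1B3', '2012-12-31T12:12:12')]
```

Because it returns one pair per section, the same loop would also cover the bonus question: a later Position entry followed by another Error1 would show up as a second tuple.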

ORIGINAL ANSWER:

Normally I don't post half-finished answers, but my guess is that you won't get anything else since this question is so complicated, so here's the XPath for what you describe in the first paragraph:

//p:entry[following-sibling::p:entry/p:msg/text()="Error1"]/*/p:attrib[@name="Position"]/@value

This will get "the value of the value attribute that has a corresponding name attribute of Position, but only if the grand-parent's sibling p:entry has a child p:msg with the text of Error1."

However I don't know what you mean when you say "it needs to stay within that section". Can you clarify? This will return both 1B2 and 1B3.
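
To see why that expression returns both values, it may help to restate its predicate in plain code: an entry qualifies if any later sibling carries the Error1 message, no matter how many Position entries sit in between. This is a rough Python sketch of that condition using only the standard library, not an evaluation of the XPath itself. The function name and the trimmed LOG sample are hypothetical, and it assumes the p:attrib elements always sit under p:attributes.

```python
import xml.etree.ElementTree as ET

NS = {"p": "urn:NamespaceInfo"}

# Trimmed, hypothetical copy of the log from the question.
LOG = """<p:log xmlns:p="urn:NamespaceInfo">
 <p:entries>
   <p:entry timestamp="2012-12-31T09:39:25">
     <p:attributes>
       <p:attrib name="Position" value="1B2" />
     </p:attributes>
   </p:entry>
   <p:entry timestamp="2012-12-31T09:39:25">
     <p:msg>Successful....</p:msg>
   </p:entry>
   <p:entry timestamp="2012-12-31T12:12:12">
     <p:attributes>
       <p:attrib name="Position" value="1B3" />
     </p:attributes>
   </p:entry>
   <p:entry timestamp="2012-12-31T09:39:25">
     <p:msg>Error1</p:msg>
   </p:entry>
 </p:entries>
</p:log>"""


def positions_with_any_later_error(xml_text, error_text="Error1"):
    """Position values of every entry that has the error in ANY later sibling."""
    entries = ET.fromstring(xml_text).findall("p:entries/p:entry", NS)
    # Index of the last entry whose p:msg is exactly the error text, or -1.
    last_error = max(
        (i for i, e in enumerate(entries)
         if e.findtext("p:msg", namespaces=NS) == error_text),
        default=-1,
    )
    values = []
    for i, entry in enumerate(entries):
        # "Has a following sibling with the error" means "sits before the last error".
        if i < last_error:
            for attrib in entry.findall("p:attributes/p:attrib[@name='Position']", NS):
                values.append(attrib.get("value"))
    return values


print(positions_with_any_later_error(LOG))  # ['1B2', '1B3']
```

Nothing in this condition looks at what lies between the Position entry and the error, which is why the earlier 1B2 entry is matched as well.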

For the second part of your question, you can get the timestamp for the entries above with this:

//p:entry[following-sibling::p:entry/p:msg/text()="Error1" and ./*/p:attrib[@name="Position"]]/@timestamp

Again though, this won't do the "section" thing you mentioned. That's a bit more tricky, and unfortunately beyond my (current) knowledge of XPath.