Find an XML node's full XPath

Posted 2020-03-27 06:16

Suppose the following line of PowerShell code:

$node = Select-Xml -Path $filePath -XPath "//*[@$someAttribute]"

How can I get the node's XPath? I figured I could just traverse upward using its ParentNode property, but is there a better way to do it?

3 answers
你好瞎i
#2 · 2020-03-27 07:04

To extend Ansgar's answer slightly, the function can be made to return a partial XPath that stops short of the XML root. All it needs is the child node and some identifying property of the "super-parent" (an ancestor at an unknown level); my code uses the ancestor's element name.

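# Returns the path from the ancestor named $parent_type (exclusive) down to $node_xml.
# If no such ancestor exists, the path runs all the way up to the document root.
function Get-XPath($node_xml, [string] $parent_type) {
    # Stop at the document node or at the named ancestor, whichever comes first.
    if ($node_xml.GetType().Name -ne 'XmlDocument' -and $node_xml.Name -ne $parent_type) {
        "{0}/{1}" -f (Get-XPath $node_xml.ParentNode $parent_type), $node_xml.Name
    }
}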
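A minimal usage sketch of the function above. The sample document, the choice of `server` as the ancestor to stop at, and the variable names are all made up for illustration; the function is repeated only so that the snippet runs on its own.

```powershell
# Function from the answer above, repeated so that this snippet is self-contained.
function Get-XPath($node_xml, [string] $parent_type) {
    if ($node_xml.GetType().Name -ne 'XmlDocument' -and $node_xml.Name -ne $parent_type) {
        "{0}/{1}" -f (Get-XPath $node_xml.ParentNode $parent_type), $node_xml.Name
    }
}

# Hypothetical sample document.
$xml = '<config><servers><server><settings><port>80</port></settings></server></servers></config>'

$portNode = (Select-Xml -XPath '//port' -Content $xml).Node

# Path relative to the nearest <server> ancestor (the ancestor itself is not included).
$relativePath = Get-XPath $portNode 'server'
"relative path: $relativePath"   # relative path: /settings/port

# Prefixing '.' turns the result into a query relative to that ancestor.
$serverNode = (Select-Xml -XPath '//server' -Content $xml).Node
$portText = $serverNode.SelectSingleNode(".$relativePath").InnerText
"port: $portText"                # port: 80
```

Because the returned path starts with `/`, it has to be prefixed with `.` (or with a path to the ancestor) before it is used in a query.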
chillily
#3 · 2020-03-27 07:12

I don't think there's anything built into PowerShell to do what you want. Recursing upwards isn't too difficult, though. A function like this should work:

function Get-XPath($n) {
  if ( $n.GetType().Name -ne 'XmlDocument' ) {
    "{0}/{1}" -f (Get-XPath $n.ParentNode), $n.Name
  }
}
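As a rough illustration of what this function returns, here is a sketch run against a made-up document (the XML and variable names are assumptions, not part of the original answer). It also shows that the path carries no positional index, so like-named siblings are not told apart.

```powershell
# Function from the answer above, repeated so that this snippet is self-contained.
function Get-XPath($n) {
  if ( $n.GetType().Name -ne 'XmlDocument' ) {
    "{0}/{1}" -f (Get-XPath $n.ParentNode), $n.Name
  }
}

# Hypothetical sample document with two like-named siblings.
$xml = '<root><items><item id="1"/><item id="2"/></items></root>'

$node = (Select-Xml -XPath '//item[@id="2"]' -Content $xml).Node
$path = Get-XPath $node
"path: $path"            # path: /root/items/item

# The path has no positional index, so it matches both <item> elements.
$matchCount = @(Select-Xml -XPath $path -Content $xml).Count
"matches: $matchCount"   # matches: 2
```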
Lonely孤独者°
#4 · 2020-03-27 07:14

Ansgar Wiechers' helpful answer provides an elegant recursive function.

The following function builds on it while trying to remove some of its limitations:

  • It properly reflects a node's index among siblings of the same name so as to reliably target it; for instance, if the given node is an element named foo and there are two other, sibling foo elements that come before it, the returned path ends in .../foo[3]

  • It supports not just element nodes, but also attributes and text/CDATA nodes.

  • It avoids potential name collisions with the properties that PowerShell adds to provide direct, name-based access to the XML DOM, by using the get_*() methods to access type-native properties - see this answer for background information.
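The name collision mentioned in the last point can be seen in a small sketch. The element below is a made-up sample whose attribute happens to be called "name"; the variable names are illustrative only.

```powershell
# Hypothetical element whose attribute happens to be called "name".
$el = ([xml] '<bar name="b2">bar2</bar>').DocumentElement

# PowerShell's adapted view surfaces the attribute as a property, which shadows
# the type-native XmlNode.Name property of the same name.
$adaptedName = $el.Name        # the attribute value: b2
$nativeName  = $el.get_Name()  # the element name:    bar

"adapted: $adaptedName / native: $nativeName"
```

A function that builds a path from `.Name` would therefore emit `b2` instead of `bar` for this element, which is why the accessor method is the safer choice.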

# Given a [System.Xml.XmlNode] instance, returns the path to it
# inside its document in XPath form.
# Supports element, attribute, and text/CDATA nodes.
function Get-NodeXPath {
  param (
      [ValidateNotNull()]
      [System.Xml.XmlNode] $node
  )

  if ($node -is [System.Xml.XmlDocument]) { return '' } # Root reached
  $isAttrib = $node -is [System.Xml.XmlAttribute]

  # IMPORTANT: Use get_*() accessors for all type-native property access,
  #            to prevent name collision with PowerShell's adapted-DOM ETS properties.

  # Get the node's name.
  $name = if ($isAttrib) {
      '@' + $node.get_Name()
    } elseif ($node -is [System.Xml.XmlText] -or $node -is [System.Xml.XmlCDataSection]) {
      'text()'
    } else { # element
      $node.get_Name()
    }

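  # Count any preceding siblings with the same name.
  # Note: To avoid having to provide a namespace manager, we do NOT use
  #       an XPath query to get the previous siblings.
  #       Text and CDATA siblings are matched by type, because their
  #       get_Name() values ('#text', '#cdata-section') never equal 'text()'.
  $prevSibsCount = 0; $prevSib = $node.get_PreviousSibling()
  while ($prevSib) {
    $prevName = if ($prevSib -is [System.Xml.XmlText] -or $prevSib -is [System.Xml.XmlCDataSection]) { 'text()' } else { $prevSib.get_Name() }
    if ($prevName -ceq $name) { ++$prevSibsCount }
    $prevSib = $prevSib.get_PreviousSibling()
  }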

  # Determine the (1-based) index among like-named siblings, if applicable.
  $ndx = if ($prevSibsCount) { '[{0}]' -f (1 + $prevSibsCount) }

  # Determine the owner / parent element.
  $ownerOrParentElem = if ($isAttrib) { $node.get_OwnerElement() } else { $node.get_ParentNode() }

  # Recurse upward and concatenate with "/"
  "{0}/{1}" -f (Get-NodeXPath $ownerOrParentElem), ($name + $ndx)
}

Here's an example of its use:

$xml = @'
  <foo>
    <bar name='b1'>bar1</bar>
    <other>...</other>
    <bar name='b2'>bar2</bar>
  </foo>
'@

# Get a reference to the 2nd <bar> element:
$node = (Select-Xml -XPath '//bar[@name="b2"]' -Content $xml).Node

# Output the retrieved element's XML text.
"original node: $($node.OuterXml)"

# Obtain the path to that element as an XPath path.
$nodePath = Get-NodeXPath $node

# Output the path.
"path: $nodePath"

# Test the resulting path to see if it finds the original node:
$node = (Select-Xml -XPath $nodePath -Content $xml).Node

"re-queried node: $($node.OuterXml)"

The above yields:

original node: <bar name="b2">bar2</bar>
path: /foo/bar[2]
re-queried node: <bar name="b2">bar2</bar>