Exclude certain child nodes when data structure is

Posted 2019-06-23 06:37


EDIT - I've figured out the solution to my problem and posted a Q&A here.

I'm looking to process XML conforming to the Library of Congress EAD standard (found here). Unfortunately, the standard is very loose regarding the structure of the XML.

For example, the <bioghist> tag can exist within the <archdesc> tag, within a <descgrp> tag, nested within another <bioghist> tag, or in any combination of these, or it can be left out entirely. I've found it very difficult to select just the <bioghist> tag I'm looking for without also selecting others.

Below are a few different possible EAD XML documents my XSLT might have to process:

First example


Second example


Third example


As you can see, an EAD XML file might have a <bioghist> tag almost anywhere. The actual output I'm supposed to produce is too complicated to post here. A simplified example of the output for the above three EAD examples might look like this:

Output for First example


Output for Second example


Output for Third example


If I want to pull the "first" bioghist value and put it in the <primary_record>, I can't simply use <xsl:apply-templates select="/ead/eadheader/archdesc/bioghist"/>, as that tag might not be a direct child of the <archdesc> tag. It might be wrapped in a <descgrp>, in another <bioghist>, or in a combination thereof. And I can't use select="//bioghist", because that will pull all the <bioghist> tags. I can't even use select="//bioghist[1]", because there might not be a <bioghist> tag at that level at all, in which case I'd be pulling the value below the <c01>, which is "Second" and should be processed later.
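One detail behind the last point: in XPath 1.0, //bioghist[1] is short for /descendant-or-self::node()/child::bioghist[1], so it selects every <bioghist> that is the first <bioghist> child of its own parent, not the first one in the document. Wrapping the path in parentheses applies [1] to the whole node-set instead. A minimal sketch of the "primary" selection, assuming the components are the numbered elements c01 to c12, that the EAD file uses no namespace, and that an empty <primary_record> is acceptable when nothing qualifies:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="/">
    <primary_record>
      <!-- Inner predicate: skip any bioghist that sits inside a c01..c12 component
           (assumption: component names start with 'c0' or 'c1').
           Outer (...)[1]: the first remaining node in document order.
           If nothing qualifies, the node-set is empty and the element stays empty. -->
      <xsl:value-of select="normalize-space(
          (//archdesc//bioghist[not(ancestor::*[starts-with(name(), 'c0')
                                             or starts-with(name(), 'c1')])])[1])"/>
    </primary_record>
  </xsl:template>
</xsl:stylesheet>
```

Because the path uses archdesc//bioghist rather than archdesc/bioghist, any number of <descgrp> or <bioghist> wrappers between the two is tolerated.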

This is already a long post, but one other wrinkle is that there can be an unlimited number of <cxx> nodes, nested up to twelve levels deep. I'm currently processing them recursively. I've tried saving the node I'm currently processing (<c01>, for example) in a variable called 'RN', then running <xsl:apply-templates select=".//bioghist[name(..)=name($RN) or name(../..)=name($RN)]"/>. This works for some forms of EAD, where the <bioghist> tag isn't nested too deeply, but it will fail on an EAD file created by someone who loves wrapping tags in other tags (which is totally fine according to the EAD standard).

What I'd love is some way of saying:

  • Get any <bioghist> tag anywhere below the current node but
  • don't dig deeper if you hit a <c??> tag
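The two rules above can be sketched in XSLT 1.0 by comparing component depth: a <bioghist> belongs to the current node when it has exactly as many component ancestors as the current node has (counting the current node itself, if it is a component). This is only a sketch under assumptions: components are the numbered elements c01 to c12, the document has no namespace, and the <records>, <record>, and source names are hypothetical, since the real output format is not shown here.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- <records> and <record> are hypothetical output names. -->
  <xsl:template match="/">
    <records>
      <xsl:apply-templates select="//archdesc" mode="record"/>
    </records>
  </xsl:template>

  <!-- Runs for archdesc and, recursively, for each c01..c12 component. -->
  <xsl:template match="*" mode="record">
    <!-- How many components enclose the current node, counting the node itself. -->
    <xsl:variable name="depth" select="count(ancestor-or-self::*
        [starts-with(name(), 'c0') or starts-with(name(), 'c1')])"/>
    <record source="{name()}">
      <!-- bioghist descendants with no further component between them and the current node. -->
      <xsl:for-each select=".//bioghist[count(ancestor::*
          [starts-with(name(), 'c0') or starts-with(name(), 'c1')]) = $depth]">
        <bioghist><xsl:value-of select="normalize-space(.)"/></bioghist>
      </xsl:for-each>
      <!-- Recurse into the components exactly one component level further down. -->
      <xsl:apply-templates mode="record" select=".//*
          [starts-with(name(), 'c0') or starts-with(name(), 'c1')]
          [count(ancestor::*[starts-with(name(), 'c0') or starts-with(name(), 'c1')]) = $depth]"/>
    </record>
  </xsl:template>
</xsl:stylesheet>
```

The wrappers (<descgrp>, nested <bioghist>, and so on) never enter the test, so their depth does not matter. Note that a <bioghist> nested inside another <bioghist> is selected separately from its parent; appending [not(parent::bioghist)] to the for-each expression would keep only the outermost one.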

I hope that I've made the situation clear. Please let me know if I've left anything ambiguous. Any assistance you can provide would be greatly appreciated. Thanks.


As the requirements are rather vague, any answer only reflects the guesses its author has made.

Here is mine:

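<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:my="my:my" exclude-result-prefixes="my">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>


 <!-- Output element names, read from a my:names element embedded in this stylesheet;
      document('') refers to the stylesheet document itself. -->
 <xsl:variable name="vNames" select="document('')/*/my:names/*"/>

 <xsl:template match="/">
  <xsl:apply-templates select="…"/>
 </xsl:template>

 <xsl:template match="bioghist">
  <!-- Position of this bioghist within the node list selected by apply-templates. -->
  <xsl:variable name="vPos" select="position()"/>

  <!-- Name the output element after the entry at the same position in $vNames. -->
  <xsl:element name="{$vNames[position() = $vPos]}">
   <xsl:value-of select="."/>
  </xsl:element>
 </xsl:template>

 <!-- Suppress all text that is not copied explicitly. -->
 <xsl:template match="text()"/>
</xsl:stylesheet>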

When this transformation is applied on the provided XML document:


the wanted result is produced:



I worked out a solution on my own and posted it at this Q&A because the solution is quite specific to a certain XML standard and seemed out of the scope of this question. If people feel it would be best to post it here as well, I can update this answer with a copy.