I've been having a hell of a week trying to write XSLT code that can process XML documents that conform to the (very permissive) EAD standards.
The useful information in an EAD document is hard to locate precisely. Different EAD documents can place the same bit of information in entirely different parts of the data tree. In addition, within a single EAD document, the same tag can be used numerous times in different locations for different information. For an example of this, please see this SO post. This makes it hard to design a single XSLT file that properly handles these different files.
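To make this concrete, here is a made-up, heavily trimmed fragment. The titles are invented and required parts such as the header are left out, so it is not a complete, valid EAD file. It only illustrates how `<unittitle>` can appear once for the collection itself and again for each child record under `<dsc>`:

```xml
<ead>
  <archdesc level="collection">
    <did>
      <!-- Title of the whole collection -->
      <unittitle>Example Family Papers</unittitle>
    </did>
    <dsc>
      <c01 level="series">
        <did>
          <!-- Same element name, but this is the title of a child record -->
          <unittitle>Correspondence</unittitle>
        </did>
      </c01>
    </dsc>
  </archdesc>
</ead>
```

A naive `//unittitle` would return both titles, which is exactly the duplicate-name problem described above.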
In general terms, the problem can be described as:
- How do I select a specific EAD node which is in an unknown location,
- Without accidentally selecting unwanted nodes that have the same `name()`?
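One way to express that distinction, as a minimal sketch rather than a finished stylesheet: search the whole tree with `//`, and use an `ancestor::` predicate to exclude matches that sit inside the child-record area. It assumes a non-namespaced (DTD-style) EAD file and uses `<unittitle>` purely as an example element. A namespaced EAD file would need a prefix bound to its namespace in every name test.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Any <unittitle> in the document, however deeply wrapped,
         as long as it is NOT somewhere inside <dsc> -->
    <xsl:for-each select="//unittitle[not(ancestor::dsc)]">
      <xsl:text>collection title: </xsl:text>
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>

    <!-- The same element name, but only where it belongs to a child record -->
    <xsl:for-each select="//dsc//unittitle">
      <xsl:text>child title: </xsl:text>
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The same predicate can also be used in a template `match` pattern, e.g. `match="unittitle[not(ancestor::dsc)]"`, if you prefer template rules over `for-each`.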
I've finally put together the XSLT I needed and thought it would be best to drop a generic version of the code here so others can benefit from it or improve upon it.
I'd love to tag this question with an "EAD" tag, but I don't have enough rep. If anyone with the appropriate amount of rep thinks it would be useful, please do so.
First a quick description of the solution, followed by the code.
- […] `<cXX>`). If not, we don't have to worry about duplicate EAD tags. The tags can still be buried under arbitrary wrappers. To find them, see step 3.
- […] `<dsc>` tag until other tags are processed. To find the other tags, see step 3, then step 4 to process child records.
- […] `apply-templates` on any element node farther down the tree.

Here's the (generic version of the) XSLT code I came up with: