I'm trying to split a tree of elements based on the location of a descendant element. (In particular, I'm trying to parse Adobe's IDML.) I'd like to be able to convert a tree that looks like:
<ParagraphStyleRange style="foo">
<CharacterStyleRange style="bar">
<Content>foo</Content>
<Br />
<Content>bar</Content>
</CharacterStyleRange>
<CharacterStyleRange style="bop">
<Content>baz</Content>
<Br />
<Hyperlink>
<Content>boo</Content>
<Br />
<Content>meep</Content>
</Hyperlink>
</CharacterStyleRange>
</ParagraphStyleRange>
into split trees:
<ParagraphStyleRange style="foo">
<CharacterStyleRange style="bar">
<Content>foo</Content>
</CharacterStyleRange>
</ParagraphStyleRange>
<ParagraphStyleRange style="foo">
<CharacterStyleRange style="bar">
<Content>bar</Content>
</CharacterStyleRange>
<CharacterStyleRange style="bop">
<Content>baz</Content>
</CharacterStyleRange>
</ParagraphStyleRange>
<ParagraphStyleRange style="foo">
<CharacterStyleRange style="bop">
<Hyperlink>
<Content>boo</Content>
</Hyperlink>
</CharacterStyleRange>
</ParagraphStyleRange>
<ParagraphStyleRange style="foo">
<CharacterStyleRange style="bop">
<Hyperlink>
<Content>meep</Content>
</Hyperlink>
</CharacterStyleRange>
</ParagraphStyleRange>
which I can then parse using normal XSL. (EDIT: I originally showed the <Br/> tags in their original place, but it doesn't really matter whether they are there or not, since the information they contained is now represented by the split elements. I think it's probably easier to solve this problem without worrying about keeping them in.)
I tried using xsl:for-each-group as suggested in the XSLT 2.0 spec (e.g. <xsl:for-each-group select="CharacterStyleRange/*" group-ending-with="Br">), but I can't figure out how to apply that at every level of the tree: <Br /> tags can appear at any level (e.g. inside a <Hyperlink> element inside a <CharacterStyleRange> element). It also limits me to templates that apply only at the chosen depth.
EDIT: My example code shows only one place where the tree needs to be split, but there can be any number of split points (always the same element, though.)
EDIT 2: I've added a more detailed example to show some of the complications.
This XSLT 1.0 (and of course, also XSLT 2.0) transformation:
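A minimal sketch of one such stylesheet follows; it is one possible way to do the split, not necessarily the only or the best one. It assumes that ParagraphStyleRange elements are not nested, that Br is always an empty element, and that whitespace-only text can be stripped. The template name emit-segment, the mode slice, and the parameters root, seg and target are illustrative names, not part of IDML or of any library. The idea is to number the segments by how many Br elements precede a leaf, then copy the paragraph once per segment, keeping only the branches that lead to a leaf in that segment.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
  <!-- Assumption: whitespace-only text nodes carry no meaning. -->
  <xsl:strip-space elements="*"/>

  <!-- Identity rule for everything outside a paragraph. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Each paragraph is emitted once per Br-delimited segment. -->
  <xsl:template match="ParagraphStyleRange">
    <xsl:call-template name="emit-segment">
      <xsl:with-param name="root" select="."/>
      <xsl:with-param name="seg" select="0"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Recursive loop over segment numbers 0 .. count(Br). -->
  <xsl:template name="emit-segment">
    <xsl:param name="root"/>
    <xsl:param name="seg"/>
    <xsl:if test="$seg &lt;= count($root//Br)">
      <xsl:apply-templates select="$root" mode="slice">
        <!-- Offset by the Br elements that precede this paragraph. -->
        <xsl:with-param name="target"
            select="$seg + count($root/preceding::Br)"/>
      </xsl:apply-templates>
      <xsl:call-template name="emit-segment">
        <xsl:with-param name="root" select="$root"/>
        <xsl:with-param name="seg" select="$seg + 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- Copy an element only if it is, or contains, a non-Br leaf
       element whose number of preceding Br elements equals the
       target. Br itself never matches, so it is dropped. -->
  <xsl:template match="*" mode="slice">
    <xsl:param name="target"/>
    <xsl:if test="descendant-or-self::*[not(*)][not(self::Br)]
                  [count(preceding::Br) = $target]">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates select="node()" mode="slice">
          <xsl:with-param name="target" select="$target"/>
        </xsl:apply-templates>
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Because the test looks at descendants at any depth, a Br inside a Hyperlink splits the enclosing CharacterStyleRange and ParagraphStyleRange as well. A segment with no leaf elements (for example, between two adjacent Br elements) produces no output in this sketch. The repeated count(preceding::Br) makes it roughly quadratic in document size, which may matter for large stories.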
when applied on the provided XML document:
produces the wanted, correct result: