EDIT: For those who come to this in the future, this was a poorly written question. It was not what I was after. This Question may also be of use to you.
So, I have been trying to brush up on my XSLT the past few days. I am very unfamiliar with it, having spent most of my past using XQuery to transform my XML. I am stuck on a rather simple problem, but looking around I have not found a clear solution. Simply, I want to split an element into two depending on its children.
For example, if my XML looks like the following:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<p>
Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip
tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger
bacon filet mignon pork chop tail.
<note.ref id="0001"><super>1</super></note.ref>
<note id="0001">
<p>
You may need to consult a latin butcher. Good Luck.
</p>
</note>
Pork loin ribeye bacon pastrami drumstick sirloin, shoulder pig jowl. Salami brisket rump ham, tail
hamburger strip steak pig ham hock short ribs jerky shank beef spare ribs. Capicola short ribs swine
beef meatball jowl pork belly. Doner leberkas short ribs, flank chuck pancetta bresaola bacon ham
hock pork hamburger fatback.
</p>
</root>
After I run my XSL, I am left with something like the following:
<html>
<body>
<p>
Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip
tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger
bacon filet mignon pork chop tail.
<span class="noteRef" id="0001"><sup>1</sup></span>
<div id="note-0001">
<p>
You may need to consult a latin butcher. Good Luck.
</p>
</div>
Pork loin ribeye bacon pastrami drumstick sirloin, shoulder pig jowl. Salami brisket rump ham, tail
hamburger strip steak pig ham hock short ribs jerky shank beef spare ribs. Capicola short ribs swine
beef meatball jowl pork belly. Doner leberkas short ribs, flank chuck pancetta bresaola bacon ham
hock pork hamburger fatback.
</p>
</body>
</html>
The problem with this is, obviously, that an HTML <p> cannot have a <div> as a child, let alone another <p> as a grandchild. This is just invalid. A browser, such as Chromium, may treat the first paragraph as ending when it hits the <div>, appropriately wrapping the note in its own <p>, but leaving the text after the note orphaned, so that any CSS applied to the <p> will fail to be applied to that text.
How would I split one <p> element into two depending on the element's descendants?
Desired output
<html>
<body>
<p>
Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip
tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger
bacon filet mignon pork chop tail.
<span class="noteRef" id="0001"><sup>1</sup></span>
</p>
<div id="note-0001">
<p>
You may need to consult a latin butcher. Good Luck.
</p>
</div>
<p>
Pork loin ribeye bacon pastrami drumstick sirloin, shoulder pig jowl. Salami brisket rump ham, tail
hamburger strip steak pig ham hock short ribs jerky shank beef spare ribs. Capicola short ribs swine
beef meatball jowl pork belly. Doner leberkas short ribs, flank chuck pancetta bresaola bacon ham
hock pork hamburger fatback.
</p>
</body>
</html>
I have abstracted my question slightly, so the following XSL showing what I have tried could be slightly off.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
exclude-result-prefixes="xs xd" version="2.0">
<xsl:template match="/">
<html>
<body>
<xsl:apply-templates/>
</body>
</html>
</xsl:template>
<xsl:template match="p">
<p>
<xsl:apply-templates/>
</p>
</xsl:template>
<xsl:template match="note.ref">
<span class="noteRef" id="{@id}">
<xsl:apply-templates/>
</span>
</xsl:template>
<xsl:template match="super">
<sup>
<xsl:apply-templates/>
</sup>
</xsl:template>
<xsl:template match="note">
<div id="note-{@id}">
<xsl:apply-templates/>
</div>
</xsl:template>
</xsl:stylesheet>
This may be too simplified, but you could try matching the text() in a p that contains a note and wrapping it (along with any note.ref following the text())...
XML Input
XSLT 2.0 (would work as 1.0 also)
Output
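The idea described above, wrapping each text() of a p that contains a note, could be sketched roughly as follows. This is only an illustration written against the sample input from the question, not a definitive stylesheet: the xsl:strip-space declaration and the restriction to text() and note children are assumptions made to keep the example small.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Assumption: whitespace-only text nodes are dropped so that they
       do not turn into empty <p> elements. -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- A paragraph without a note child is emitted as a plain <p>. -->
  <xsl:template match="p">
    <p>
      <xsl:apply-templates/>
    </p>
  </xsl:template>

  <!-- A paragraph with a note child emits no wrapper of its own;
       its text runs and notes are processed in document order. -->
  <xsl:template match="p[note]">
    <xsl:apply-templates select="text() | note"/>
  </xsl:template>

  <!-- Each text run gets its own <p>, together with a note.ref
       that immediately follows it. -->
  <xsl:template match="p[note]/text()">
    <p>
      <xsl:value-of select="."/>
      <xsl:apply-templates select="following-sibling::node()[1][self::note.ref]"/>
    </p>
  </xsl:template>

  <xsl:template match="note.ref">
    <span class="noteRef" id="{@id}">
      <xsl:apply-templates/>
    </span>
  </xsl:template>

  <xsl:template match="super">
    <sup>
      <xsl:apply-templates/>
    </sup>
  </xsl:template>

  <xsl:template match="note">
    <div id="note-{@id}">
      <xsl:apply-templates/>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input, this should give a p holding the first text run and the note reference, then the div for the note, then a second p for the remaining text. A note.ref that does not directly follow a text node would be dropped by this sketch, as would any other inline children of such a p, so real input would likely need more templates.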
Assuming an XSLT 2.0 processor, I think using for-each-group can help:
The patterns p[not((.//p, .//div))] and p[.//p, .//div] and the group-adjacent expression boolean((self::text(), self::note.ref)) might need to be extended to cover other types of nodes you expect in the input and that need the same processing.
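A minimal sketch of how the for-each-group approach might look, using the two patterns and the group-adjacent expression named above. The remaining templates are taken from the question; how the pieces are put together here is an assumption, not necessarily the original listing.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- A paragraph with no nested block content stays a single <p>. -->
  <xsl:template match="p[not((.//p, .//div))]">
    <p>
      <xsl:apply-templates/>
    </p>
  </xsl:template>

  <!-- A paragraph with nested block content is split: adjacent text
       and note.ref nodes form one group (key true) and are wrapped
       in a <p>; everything else (key false) is emitted unwrapped. -->
  <xsl:template match="p[.//p, .//div]">
    <xsl:for-each-group select="node()"
        group-adjacent="boolean((self::text(), self::note.ref))">
      <xsl:choose>
        <xsl:when test="current-grouping-key()">
          <p>
            <xsl:apply-templates select="current-group()"/>
          </p>
        </xsl:when>
        <xsl:otherwise>
          <xsl:apply-templates select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>

  <xsl:template match="note.ref">
    <span class="noteRef" id="{@id}">
      <xsl:apply-templates/>
    </span>
  </xsl:template>

  <xsl:template match="super">
    <sup>
      <xsl:apply-templates/>
    </sup>
  </xsl:template>

  <xsl:template match="note">
    <div id="note-{@id}">
      <xsl:apply-templates/>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input, the children of the outer p fall into three groups: the leading text with its note.ref, the note, and the trailing text. That yields a p, a div and a second p. Whitespace-only text between a note.ref and a note joins the preceding group, so it produces no empty paragraph there.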