split an element into two or more elements dependi

Posted 2019-06-05 13:04

EDIT: For those who come to this in the future: this was a poorly written question, and it was not what I was after. This Question may also be of use to you.

So, I have been trying to brush up on my XSLT over the past few days. I am very unfamiliar with it, having spent most of my time using XQuery to transform my XML. I am stuck on a rather simple problem, but looking around I have not found a clear solution. Simply put, I want to split an element into two depending on its children.

For example, if my XML looks like the following:

<?xml version="1.0" encoding="UTF-8"?>    
<root>
      <p>
         Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip 
         tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
         bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
         bacon filet mignon pork chop tail.
         <note.ref id="0001"><super>1</super></note.ref>
         <note id="0001">
           <p>
             You may need to consult a latin butcher. Good Luck.
           </p>
         </note>   
       Pork loin ribeye bacon pastrami drumstick sirloin, shoulder pig jowl. Salami brisket rump ham, tail
      hamburger strip steak pig ham hock short ribs jerky shank beef spare ribs. Capicola short ribs swine   
      beef meatball jowl pork belly. Doner leberkas short ribs, flank chuck pancetta bresaola bacon ham 
      hock pork hamburger fatback.
    </p>
    </root>

After I run my XSL, I am left with something like the following:

<html>
<body>
   <p>
         Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip 
         tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
         bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
         bacon filet mignon pork chop tail.
         <span class="noteRef" id="0001"><sup>1</sup></span>
         <div id="note-0001"> 
           <p>
               You may need to consult a latin butcher. Good Luck.
           </p>
         </div>
           Pork loin ribeye bacon pastrami drumstick sirloin, shoulder pig jowl. Salami brisket rump ham, tail
           hamburger strip steak pig ham hock short ribs jerky shank beef spare ribs. Capicola short ribs swine   
           beef meatball jowl pork belly. Doner leberkas short ribs, flank chuck pancetta bresaola bacon ham 
           hock pork hamburger fatback.
   </p>
</body>
</html>

The problem is that an HTML <p> cannot have a <div> as a child, let alone another <p> as a grandchild; this is simply invalid. A browser such as Chromium may end the first paragraph when it reaches the <div> and render the note in its own <p>, but that leaves the text after the note orphaned, so any CSS applied to the <p> is not applied to it.

How would I split one <p> element into two depending on the element's descendants?

Desired output

  <html>
    <body>
       <p>
             Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip 
             tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
             bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
             bacon filet mignon pork chop tail.
             <span class="noteRef" id="0001"><sup>1</sup></span>
       </p>
             <div id="note-0001"> 
               <p>
                   You may need to consult a latin butcher. Good Luck.
               </p>
             </div>
<p>
               Pork loin ribeye bacon pastrami drumstick sirloin, shoulder pig jowl. Salami brisket rump ham, tail
               hamburger strip steak pig ham hock short ribs jerky shank beef spare ribs. Capicola short ribs swine   
               beef meatball jowl pork belly. Doner leberkas short ribs, flank chuck pancetta bresaola bacon ham 
               hock pork hamburger fatback.
       </p>
    </body>
    </html>

I have abstracted my question slightly, so the following XSL of what I have tried could be slightly off.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
    exclude-result-prefixes="xs xd" version="2.0">


<xsl:template match="/">

        <html>
          <body>
             <xsl:apply-templates/>
          </body>
        </html>
</xsl:template>

    <xsl:template match="p">
        <p>
            <xsl:apply-templates/>
        </p>
    </xsl:template>


    <xsl:template match="note.ref">
        <span class="noteRef" id="{@id}">
            <xsl:apply-templates/>
        </span>
    </xsl:template>

    <xsl:template match="super">
        <sup>
            <xsl:apply-templates/>
        </sup>
    </xsl:template>

    <xsl:template match="note">
          <div id="note-{@id}">
            <xsl:apply-templates/>
        </div>
    </xsl:template>

</xsl:stylesheet>

2 answers
神经病院院长
Reply #2 · 2019-06-05 13:35

This may be too simplified, but you could try matching the text() in a p that contains a note and wrapping it (along with any note.ref following the text())...

XML Input

<root>
    <p>
        Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs doner tri-tip 
        tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick short loin pastrami t-
        bone. Sirloin turducken short ribs t-bone andouille strip steak pork loin corned beef hamburger 
        bacon filet mignon pork chop tail.
        <note.ref id="0001"><super>1</super></note.ref>
        <note id="0001">
            <p>
                You may need to consult a latin butcher. Good Luck.
            </p>
        </note>   
        Pork loin ribeye bacon pastrami drumstick sirloin, shoulder pig jowl. Salami brisket rump ham, tail
        hamburger strip steak pig ham hock short ribs jerky shank beef spare ribs. Capicola short ribs swine   
        beef meatball jowl pork belly. Doner leberkas short ribs, flank chuck pancetta bresaola bacon ham 
        hock pork hamburger fatback.
    </p>
</root>

XSLT 2.0 (would work as 1.0 also)

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@*|*|processing-instruction()|comment()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="/root">        
        <html>
            <body>
                <xsl:apply-templates/>
            </body>
        </html>
    </xsl:template>

    <xsl:template match="p[note]">
        <xsl:apply-templates/>
    </xsl:template>

    <xsl:template match="p[note]/text()">
        <p>
            <xsl:value-of select="normalize-space(.)"/>
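            <!-- keep a note.ref only when it directly follows this text node, so it is not repeated after other text nodes -->
            <xsl:apply-templates select="following-sibling::node()[1][self::note.ref]" mode="keep"/>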
        </p>
    </xsl:template> 

    <xsl:template match="note">
        <div id="note-{@id}">
            <xsl:apply-templates/>
        </div>
    </xsl:template>

    <xsl:template match="note.ref"/>
    <xsl:template match="note.ref" mode="keep">
        <span class="noteRef" id="{@id}">
            <xsl:apply-templates/>
        </span>
    </xsl:template>

    <xsl:template match="super">
        <sup>
            <xsl:apply-templates/>
        </sup>
    </xsl:template>

    <xsl:template match="text()">
        <xsl:value-of select="normalize-space(.)"/>
    </xsl:template>

</xsl:stylesheet>

Output

<html>
   <body>
      <p>Bacon ipsum dolor sit amet bacon chuck pastrami swine pork rump, shoulder beef ribs
         doner tri-tip tongue. Tri-tip ground round short ribs capicola meatloaf shank drumstick
         short loin pastrami t- bone. Sirloin turducken short ribs t-bone andouille strip steak
         pork loin corned beef hamburger bacon filet mignon pork chop tail.<span class="noteRef" id="0001"><sup>1</sup></span></p>
      <div id="note-0001">
         <p>You may need to consult a latin butcher. Good Luck.</p>
      </div>
      <p>Pork loin ribeye bacon pastrami drumstick sirloin, shoulder pig jowl. Salami brisket
         rump ham, tail hamburger strip steak pig ham hock short ribs jerky shank beef spare
         ribs. Capicola short ribs swine beef meatball jowl pork belly. Doner leberkas short
         ribs, flank chuck pancetta bresaola bacon ham hock pork hamburger fatback.
      </p>
   </body>
</html>
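
The stylesheet above wraps each text node in its own paragraph, which is enough for the sample input. If a paragraph mixes text with several inline elements between notes, a key-based variant for XSLT 1.0 might look like the following. This is only a sketch under assumptions: the key name run and the named template emit-run are made up for illustration, and note is assumed to be the only block-level child of p.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="yes"/>

  <!-- Hypothetical key: index each non-note child of a p by the generated
       id of that p plus the number of note siblings preceding the child. -->
  <xsl:key name="run" match="p/node()[not(self::note)]"
           use="concat(generate-id(..), '|', count(preceding-sibling::note))"/>

  <xsl:template match="/root">
    <html>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- Paragraphs without a note pass through as a single p. -->
  <xsl:template match="p">
    <p>
      <xsl:apply-templates/>
    </p>
  </xsl:template>

  <!-- Paragraphs with a note: emit the run before the first note, then
       each note followed by the run that comes after it. -->
  <xsl:template match="p[note]">
    <xsl:variable name="pid" select="generate-id()"/>
    <xsl:call-template name="emit-run">
      <xsl:with-param name="nodes" select="key('run', concat($pid, '|0'))"/>
    </xsl:call-template>
    <xsl:for-each select="note">
      <xsl:apply-templates select="."/>
      <xsl:call-template name="emit-run">
        <xsl:with-param name="nodes"
                        select="key('run', concat($pid, '|', position()))"/>
      </xsl:call-template>
    </xsl:for-each>
  </xsl:template>

  <!-- Hypothetical helper: wrap a run in p unless it is only whitespace. -->
  <xsl:template name="emit-run">
    <xsl:param name="nodes" select="/.."/>
    <xsl:if test="$nodes[self::* or normalize-space()]">
      <p>
        <xsl:apply-templates select="$nodes"/>
      </p>
    </xsl:if>
  </xsl:template>

  <xsl:template match="note">
    <div id="note-{@id}">
      <xsl:apply-templates/>
    </div>
  </xsl:template>

  <xsl:template match="note.ref">
    <span class="noteRef" id="{@id}">
      <xsl:apply-templates/>
    </span>
  </xsl:template>

  <xsl:template match="super">
    <sup>
      <xsl:apply-templates/>
    </sup>
  </xsl:template>

</xsl:stylesheet>
```

Because the key counts preceding note siblings, everything between two notes lands in the same run regardless of how many inline elements it contains. Unlike the answer's stylesheet, this sketch does not normalize whitespace inside the text.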
兄弟一词,经得起流年.
Reply #3 · 2019-06-05 13:44

Assuming an XSLT 2.0 processor, I think using for-each-group can help:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs"
  version="2.0">

<xsl:output method="html" indent="yes" version="5.0"/>

<xsl:template match="/">
  <html>
    <body>
      <xsl:apply-templates/>
    </body>
  </html>
</xsl:template>

<xsl:template match="p[not((.//p, .//div))]">
  <xsl:copy>
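    <!-- copy attributes as attributes; the built-in template would write their values out as text -->
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates/>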
  </xsl:copy>
</xsl:template>

<xsl:template match="p[.//p, .//div]">
  <xsl:for-each-group select="node()" group-adjacent="boolean((self::text(), self::note.ref))">
    <xsl:choose>
      <xsl:when test="current-grouping-key()">
        <p>
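          <!-- inside for-each-group the context item is the first node of the group, so the attributes of the p being split are reached through the parent axis -->
          <xsl:copy-of select="../@*"/>
          <xsl:apply-templates select="current-group()"/>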
        </p>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="current-group()"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each-group>
</xsl:template>

<xsl:template match="note.ref">
    <span class="noteRef" id="{@id}">
        <xsl:apply-templates/>
    </span>
</xsl:template>

<xsl:template match="super">
    <sup>
        <xsl:apply-templates/>
    </sup>
</xsl:template>

<xsl:template match="note">
      <div id="note-{@id}">
        <xsl:apply-templates/>
    </div>
</xsl:template>

</xsl:stylesheet>

The patterns p[not((.//p, .//div))] and p[.//p, .//div] and the group-adjacent expression boolean((self::text(), self::note.ref)) might need to be extended to cover other types of nodes you expect in the input and that need the same processing.
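
One way such an extension might look is to invert the test: rather than listing every inline node type, key the grouping on the block-level children and treat everything else as paragraph content. The following is a minimal sketch, assuming note is the only block-level child and that an id on the original p should not be repeated on the pieces; the variable name attrs is made up for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="yes"/>

  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- Paragraphs without a block-level child pass through as one p. -->
  <xsl:template match="p">
    <p>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </p>
  </xsl:template>

  <!-- Assumption: note is the only block-level child. For further block
       elements, extend both places, e.g. p[note | table] and
       exists(self::note | self::table). -->
  <xsl:template match="p[note]">
    <!-- id is left out so the split paragraphs do not share one id -->
    <xsl:variable name="attrs" select="@* except @id"/>
    <xsl:for-each-group select="node()" group-adjacent="exists(self::note)">
      <xsl:choose>
        <!-- a run of block-level children: emit outside any p -->
        <xsl:when test="current-grouping-key()">
          <xsl:apply-templates select="current-group()"/>
        </xsl:when>
        <!-- a run of whitespace-only text: emit nothing, avoiding empty p -->
        <xsl:when test="every $n in current-group()
                        satisfies $n/self::text()[not(normalize-space())]"/>
        <!-- any other run: wrap it in its own p -->
        <xsl:otherwise>
          <p>
            <xsl:copy-of select="$attrs"/>
            <xsl:apply-templates select="current-group()"/>
          </p>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>

  <xsl:template match="note.ref">
    <span class="noteRef" id="{@id}">
      <xsl:apply-templates/>
    </span>
  </xsl:template>

  <xsl:template match="super">
    <sup>
      <xsl:apply-templates/>
    </sup>
  </xsl:template>

  <xsl:template match="note">
    <div id="note-{@id}">
      <xsl:apply-templates/>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

With this arrangement a new inline element needs only its own template, while a new block-level element needs to be added to the match pattern and the grouping key.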
