XSLT 2.0 - Using Grouping To Nest Elements

Posted 2019-08-03 06:22

Question:

I'm working on a stylesheet that produces hierarchical output from an input file that has almost no hierarchy. Each section is extremely flat, so I've been using a grouping method that was suggested to me: it groups each set of siblings by the name of the first node, which turns the flat sections into a nice hierarchy. This method works great; I just need to modify it to skip over certain elements.

Sample input file (note: there are multiple Header elements per Section):

<Root>
    <VolumeName>Volume 1 </VolumeName>
    <Section>
        <SectionName> Section1 </SectionName>
        <Title> Title1 </Title>
        <Header> NameOfDocument1 </Header>
        <Header> Header1 </Header> 
        <Sub1> Sub1 first </Sub1> 
        <Sub1> Sub1 second </Sub1>
        <Sub2> Sub2 first, Sub1 second </Sub2> 
        <Sub1> Sub1 third </Sub1> 
        <Sub2> Sub2 first, Sub1 third </Sub2>
    </Section>

    <Section>
        <SectionName> Section2 </SectionName>
        <Title> Title2 </Title>
        <Header> Header2 </Header> 
        <Sub1> Sub1 first </Sub1> 
        <Sub1> Sub1 second </Sub1>
        <Sub2> Sub2 first, Sub1 second </Sub2> 
        <Sub1> Sub1 third </Sub1> 
        <Sub2> Sub2 first, Sub1 third </Sub2>
    </Section>
</Root>

The output for the sample input should look like:

<Volume1>
    <Section1 Number="Section1" Name="NameOfDocument1" Title="Title1">
        <Header1>
            <Step>
                Sub1 first
            </Step>
            <Step>
                Sub1 second
                <Step>
                    Sub2 first, Sub1 second
                </Step>
            </Step>
            <Step>
                Sub1 third
                <Step>
                    Sub2 first, Sub1 third
                </Step>
             </Step> 
        </Header1>
    </Section1>

    <Section2 Number="Section2" Name="concat('NameOfDocument','2')" Title="Title2">
            <Step>
                Sub1 first
            </Step>
            <Step>
                Sub1 second
                <Step>
                    Sub2 first, Sub1 second
                </Step>
            </Step>
            <Step>
                Sub1 third
                <Step>
                    Sub2 first, Sub1 third
                </Step>
             </Step> 
    </Section2>
</Volume1>

Thanks to Dimitre Novatchev, I now have some code that handles the flat parts inside the Section elements. I have a template that matches the Section element; in it I create an element and take information from SectionName, Title, and sometimes Header to determine the element's name and its attributes. I want the grouping to skip SectionName, Title, and sometimes Header, and I'm not sure how to modify Dimitre's code to do so. Any suggestions would be greatly appreciated! Thank you!

Dimitre's grouping code:

<xsl:stylesheet version="2.0"  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:template match="/*">
        <Root>
            <xsl:apply-templates select="*[1]">
                <xsl:with-param name="pScope" select="*"/>
                <xsl:with-param name="pElemName" select="name(*[1])"/>
            </xsl:apply-templates>
         </Root>
    </xsl:template>

   <xsl:template match="*">
       <xsl:param name="pScope"/>
       <xsl:param name="pElemName" select="'Step'"/>
       <xsl:for-each-group select="$pScope" 
            group-starting-with="*[name()= name($pScope[1])]">
           <xsl:element name="{$pElemName}">
               <xsl:value-of select="."/>
               <xsl:apply-templates select="current-group()[2]">
                   <xsl:with-param name="pScope" 
                       select="current-group()[position() > 1]"/>
               </xsl:apply-templates>
           </xsl:element>
       </xsl:for-each-group>
   </xsl:template>
</xsl:stylesheet>
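
For readers new to group-starting-with, here is a minimal sketch of what a single level of that grouping does. The List wrapper and the Groups/Group output elements are hypothetical names used only for illustration; they are not part of the input above.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Hypothetical input:
       <List>
         <Sub1>a</Sub1>
         <Sub1>b</Sub1>
         <Sub2>b.1</Sub2>
         <Sub1>c</Sub1>
       </List> -->
  <xsl:template match="/List">
    <Groups>
      <!-- A new group starts at every Sub1; the siblings that follow it
           (up to the next Sub1) belong to the same group -->
      <xsl:for-each-group select="*" group-starting-with="Sub1">
        <!-- Inside the loop, the context item is the first item of the group -->
        <Group head="{normalize-space(.)}" size="{count(current-group())}"/>
      </xsl:for-each-group>
    </Groups>
  </xsl:template>
</xsl:stylesheet>
```

On that hypothetical input this yields three groups headed by a, b and c, with sizes 1, 2 and 1. The recursive template above applies the same step again to current-group()[position() > 1], which is how the tail of each group ends up nested inside its Step element.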

Answer 1:

Here is an adaptation of my answer to the initial question that produces the result wanted now:

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="/*">
  <Root>
   <xsl:element name="{translate(VolumeName,' ','')}">
    <xsl:apply-templates/>
   </xsl:element>
  </Root>
 </xsl:template>

 <xsl:template match="*">
  <xsl:param name="pScope"/>
  <xsl:param name="pElemName" select="'Step'"/>
  <xsl:for-each-group select="$pScope"
        group-starting-with=
        "*[name()= name($pScope[1])]">
   <xsl:element name="{$pElemName}">
    <xsl:value-of select="."/>
    <xsl:apply-templates select="current-group()[2]">
     <xsl:with-param name="pScope"
          select="current-group()[position() > 1]"/>
    </xsl:apply-templates>
   </xsl:element>
  </xsl:for-each-group>
 </xsl:template>

 <xsl:template match="VolumeName"/>

 <xsl:template match="Section">
  <xsl:element name=
      "{normalize-space(SectionName)}">
   <xsl:attribute name="Number"
        select="normalize-space(SectionName)"/>
   <xsl:attribute name="Name" select=
   "concat('NameOfDocument',
           count(preceding-sibling::Section)+1)"/>
   <xsl:attribute name="Title"
       select="normalize-space(Title)"/>

   <xsl:variable name="vOutput">
    <xsl:apply-templates select="*[1]">
     <xsl:with-param name="pScope"
        select="Header[last()]/following-sibling::*"/>
     <xsl:with-param name="pElemName" select=
      "(normalize-space(Header[2]), 'Step')[last()]"/>
    </xsl:apply-templates>
   </xsl:variable>

   <xsl:choose>
    <xsl:when test="Header[2]">
     <xsl:element name="{normalize-space(Header[2])}">
      <xsl:sequence select="$vOutput"/>
     </xsl:element>
    </xsl:when>
    <xsl:otherwise>
      <xsl:sequence select="$vOutput"/>
    </xsl:otherwise>
   </xsl:choose>
  </xsl:element>
 </xsl:template>
 </xsl:stylesheet>

When this transformation is applied to the provided XML document:

<Root>
    <VolumeName>Volume 1 </VolumeName>
    <Section>
        <SectionName> Section1 </SectionName>
        <Title> Title1 </Title>
        <Header> NameOfDocument1 </Header>
        <Header> Header1 </Header>
        <Sub1> Sub1 first </Sub1>
        <Sub1> Sub1 second </Sub1>
        <Sub2> Sub2 first, Sub1 second </Sub2>
        <Sub1> Sub1 third </Sub1>
        <Sub2> Sub2 first, Sub1 third </Sub2>
    </Section>
    <Section>
        <SectionName> Section2 </SectionName>
        <Title> Title2 </Title>
        <Header> Header2 </Header>
        <Sub1> Sub1 first </Sub1>
        <Sub1> Sub1 second </Sub1>
        <Sub2> Sub2 first, Sub1 second </Sub2>
        <Sub1> Sub1 third </Sub1>
        <Sub2> Sub2 first, Sub1 third </Sub2>
    </Section>
</Root>

the wanted result is produced:

<Root>
    <Volume1>
        <Section1 Number="Section1" Name="NameOfDocument1" Title="Title1">
            <Header1>
                <Step> Sub1 first </Step>
                <Step> Sub1 second 
                    <Step> Sub2 first, Sub1 second </Step></Step>
                <Step> Sub1 third 
                    <Step> Sub2 first, Sub1 third </Step></Step>
            </Header1>
        </Section1>
        <Section2 Number="Section2" Name="NameOfDocument2" Title="Title2">
            <Step> Sub1 first </Step>
            <Step> Sub1 second 
                <Step> Sub2 first, Sub1 second </Step></Step>
            <Step> Sub1 third 
                <Step> Sub2 first, Sub1 third </Step></Step>
        </Section2>
    </Volume1>
</Root>
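
The skipping in the stylesheet above relies on the scope expression Header[last()]/following-sibling::*, which assumes the metadata elements always come before the content. As a hedged alternative sketch (not part of the original answer), the XPath 2.0 except operator can select the scope by name instead of by position. The variable names vMeta and vScope and the Scope output element are hypothetical, used only to show what ends up in the scope.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Keep the volume name out of the output of this sketch -->
  <xsl:template match="VolumeName"/>

  <xsl:template match="Section">
    <!-- Children that only supply the element name and attributes -->
    <xsl:variable name="vMeta" select="SectionName | Title | Header"/>
    <!-- All remaining children, in document order: the flat content to nest -->
    <xsl:variable name="vScope" select="* except $vMeta"/>
    <!-- Report what would be passed on as the pScope parameter -->
    <Scope count="{count($vScope)}" first="{name($vScope[1])}"/>
  </xsl:template>
</xsl:stylesheet>
```

For each Section in the sample document, the scope holds the five Sub1/Sub2 elements and starts at the first Sub1. In the full stylesheet, $vScope could be passed as pScope in place of the following-sibling expression; whether that is preferable depends on whether the metadata elements can appear out of order in real input.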


Answer 2:

I'll stick to my fine-grained traversal approach. This XSLT 1.0 stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:strip-space elements="*"/>
    <xsl:template match="*">
        <xsl:param name="pNames" select="'|'"/>
        <xsl:if test="not(contains($pNames,concat('|',name(),'|')))">
            <xsl:variable name="vNext" select="following-sibling::*[1]"/>
            <xsl:variable name="vName">
                <xsl:apply-templates select="." mode="name"/>
            </xsl:variable>
            <xsl:element name="{$vName}">
                <xsl:apply-templates select="node()[1]"/>
                <xsl:apply-templates select="$vNext">
                    <xsl:with-param name="pNames"
                                    select="concat($pNames,name(),'|')"/>
                </xsl:apply-templates>
            </xsl:element>
            <xsl:apply-templates select="$vNext" mode="search">
                <xsl:with-param name="pNames" select="$pNames"/>
                <xsl:with-param name="pSearch" select="name()"/>
            </xsl:apply-templates>
        </xsl:if>
    </xsl:template>
    <xsl:template match="*" mode="search">
        <xsl:param name="pNames"/>
        <xsl:param name="pSearch"/>
        <xsl:if test="not(contains($pNames,concat('|',name(),'|')))">
            <xsl:choose>
                <xsl:when test="name()=$pSearch">
                    <xsl:apply-templates select=".">
                        <xsl:with-param name="pNames" select="$pNames"/>
                    </xsl:apply-templates>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:apply-templates select="following-sibling::*[1]"
                                         mode="search">
                        <xsl:with-param name="pNames" select="$pNames"/>
                        <xsl:with-param name="pSearch" select="$pSearch"/>
                    </xsl:apply-templates>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:if>
    </xsl:template>
    <xsl:template match="SectionName|Title|Header[1]">
        <xsl:variable name="vName">
            <xsl:apply-templates select="." mode="name"/>
        </xsl:variable>
        <xsl:attribute name="{$vName}">
            <xsl:value-of select="."/>
        </xsl:attribute>
        <xsl:apply-templates select="following-sibling::*[1]"/>
    </xsl:template>
    <xsl:template match="SectionName" mode="name">Number</xsl:template>
    <xsl:template match="Title" mode="name">Title</xsl:template>
    <xsl:template match="Header[1]" mode="name">Name</xsl:template>
    <xsl:template match="VolumeName|Section|Header" mode="name">
        <xsl:value-of select="translate((.|SectionName)[last()],' ','')"/>
    </xsl:template>
    <xsl:template match="Sub1|Sub2" mode="name">Step</xsl:template>
    <xsl:template match="*" mode="name">
        <xsl:value-of select="name()"/>
    </xsl:template>
    <xsl:template match="VolumeName/text()|Header/text()"/>
</xsl:stylesheet>

Output:

<Root>
    <Volume1>
        <Section1 Number=" Section1 " Title=" Title1 " 
                  Name=" NameOfDocument1 ">
            <Header1>
                <Step> Sub1 first </Step>
                <Step> Sub1 second 
                    <Step> Sub2 first, Sub1 second </Step>
                </Step>
                <Step> Sub1 third 
                    <Step> Sub2 first, Sub1 third </Step>
                </Step>
            </Header1>
        </Section1>
        <Section2 Number=" Section2 " Title=" Title2 " Name=" Header2 ">
            <Step> Sub1 first </Step>
            <Step> Sub1 second 
                <Step> Sub2 first, Sub1 second </Step>
            </Step>
            <Step> Sub1 third 
                <Step> Sub2 first, Sub1 third </Step>
            </Step>
        </Section2>
    </Volume1>
</Root>

Note: Because the calculated names are more complex than a simple mapping, I've used a pattern-matching approach.
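
The pattern-matching idea can be seen in isolation in the following reduced sketch: a separate "name" mode computes the output name for each element, and the main template only asks that mode for the answer. This is a simplified illustration that leaves out the sibling traversal and the attribute handling of the full stylesheet above.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Rename every element to whatever the "name" mode returns for it -->
  <xsl:template match="*">
    <xsl:variable name="vName">
      <xsl:apply-templates select="." mode="name"/>
    </xsl:variable>
    <xsl:element name="{$vName}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <!-- Specific rule: both Sub1 and Sub2 become Step -->
  <xsl:template match="Sub1|Sub2" mode="name">Step</xsl:template>

  <!-- Fallback rule: keep the original name; match="*" has a lower default
       priority than a name test, so the specific rule wins when both apply -->
  <xsl:template match="*" mode="name">
    <xsl:value-of select="name()"/>
  </xsl:template>
</xsl:stylesheet>
```

Adding another naming rule is then a matter of adding one more template in the "name" mode, without touching the traversal logic.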

Edit: The stylesheet now strips whitespace-only text nodes (thanks to @Dimitre's comments), so it also produces the correct result in Saxon.
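
Note that xsl:strip-space removes only text nodes that consist entirely of whitespace; it does not trim padding inside a text node, which is why the attribute values in the output above still carry leading and trailing spaces. A minimal sketch of trimming them with normalize-space(), assuming the same sample input, might look like this (the literal Section output element is a simplification for illustration, not the computed name used above):

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Keep the volume name out of the output of this sketch -->
  <xsl:template match="VolumeName"/>

  <xsl:template match="Section">
    <!-- normalize-space() trims both ends and collapses inner whitespace runs -->
    <Section Number="{normalize-space(SectionName)}"
             Title="{normalize-space(Title)}"/>
  </xsl:template>
</xsl:stylesheet>
```

In the stylesheet above, the equivalent change would be to use normalize-space(.) instead of . in the xsl:value-of that fills each attribute.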