XSLT-1.0 How to pick multiple tags between two sim

Posted 2019-08-30 02:18

Question:

I am using XSLT to transform XML to XML. Could you please help me write XSLT code that converts the input below into the output below? For the first two tags, I need the content as rich text inside a CDATA section. Thanks in advance.

Input:

<ATTRIBUTE-VALUE>
    <THE-VALUE>
        <div xmlns="http://www.w3.org/1999/xhtml">
            <h1 dir="ltr" id="_1536217498885">Main Description</h1>
            <p>Line1 The main description text goes here.</p>
            <p>Line2 The main description text goes here.</p>
            <p>**<img alt="Embedded Image" class="embeddedImageLink" id="_1536739954166" src="_9c3778a0-d596-4eef-85fa-052a5e1b2166?accept=none&amp;private"/>**</p>
            <h1 dir="ltr" id="_1536217498886">Key Consideration</h1>
            <p>Line1 The key consideration text goes here.</p>
            <p>Line2 The key consideration text goes here.</p>
            <h1 dir="ltr" id="_1536217498887">Skills</h1>
            <p>Line1 The Skills text goes here.</p>
            <p>Line2 The Skills text goes here.</p>
            <p>Line3 The Skills text goes here.</p>
            <h1 dir="ltr" id="_1536217498888">Synonyms</h1>
            <p>The Synonyms text goes here.</p>
        </div>
    </THE-VALUE>
</ATTRIBUTE-VALUE>

Output:

<MainDescription>
    <![CDATA[
        <p>Line1 The main description text goes here.</p>
        <p>Line2 The main description text goes here.</p>
        <p>**<img alt="Embedded Image" class="embeddedImageLink" id="_1536739954166" src="_9c3778a0-d596-4eef-85fa-052a5e1b2166.jpg"/>**</p>
    ]]>
</MainDescription>
<KeyConsiderations>
    <![CDATA[
        <p>Line1 The key consideration text goes here.</p>
        <p>Line2 The key consideration text goes here.</p>
    ]]>
</KeyConsiderations>
<Skills>
    <p>Line1 The Skills text goes here.</p>
    <p>Line2 The Skills text goes here.</p>
    <p>Line3 The Skills text goes here.</p>
</Skills>
<Synonyms>
    <p>The Synonyms text goes here.</p>
</Synonyms>

I am able to get the element names from the h1 elements using the code below, but I have no idea how to get the <p> elements that follow each h1, so I marked that spot as ?????????. Please help me find what should replace ?????????.

<xsl:for-each select="my:THE-VALUE/xhtml:div/xhtml:h1">
    <xsl:variable name="ReqIFTextTags" select="translate(., ' ', '')"></xsl:variable>
    <xsl:element name="{$ReqIFTextTags}">
        <xsl:value-of select="?????????"></xsl:value-of>
    </xsl:element>
</xsl:for-each>

Answer 1:

In XSLT 1.0, you can use a key to wrap the siblings that follow each h1 element into a wrapper element named after that h1:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xhtml"
    version="1.0">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:key name="h1-group" match="xhtml:div/*[not(self::xhtml:h1)]" use="generate-id(preceding-sibling::xhtml:h1[1])"/>

  <xsl:template match="xhtml:div[xhtml:h1]">
      <xsl:apply-templates select="xhtml:h1"/>
  </xsl:template>

  <xsl:template match="xhtml:h1">
      <xsl:element name="{translate(., ' ', '')}">
          <xsl:apply-templates select="key('h1-group', generate-id())"/>
      </xsl:element>
  </xsl:template>

  <xsl:template match="xhtml:p">
      <p>
          <xsl:apply-templates/>
      </p>
  </xsl:template>

</xsl:stylesheet>

Online at https://xsltfiddle.liberty-development.net/bdxtqy.
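
To show what the key computes, here is a key-free sketch of the same grouping. It is one possible value for the ????????? in the question: for each h1, it selects the following siblings whose nearest preceding h1 is that same h1. The variable name current-h1 is an assumed name for illustration; everything else mirrors the stylesheet above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xhtml"
    version="1.0">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy everything not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Only the h1 elements drive the output of the div -->
  <xsl:template match="xhtml:div[xhtml:h1]">
    <xsl:apply-templates select="xhtml:h1"/>
  </xsl:template>

  <xsl:template match="xhtml:h1">
    <!-- Remember this h1 so the predicate below can compare against it
         (current-h1 is a hypothetical name) -->
    <xsl:variable name="current-h1" select="generate-id()"/>
    <xsl:element name="{translate(., ' ', '')}">
      <!-- Following non-h1 siblings whose nearest preceding h1 is this one -->
      <xsl:apply-templates
          select="following-sibling::*[not(self::xhtml:h1)]
                  [generate-id(preceding-sibling::xhtml:h1[1]) = $current-h1]"/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="xhtml:p">
    <p>
      <xsl:apply-templates/>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

On long documents this form is likely to be slower than the key, because every h1 rescans its following siblings, which is why the key version is usually preferred.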

Serializing the contents of those elements as markup is best done with an extension function (if your XSLT 1.0 processor provides one or makes it easy to set one up) or with a library written for that task, such as http://lenzconsulting.com/xml-to-string/xml-to-string.xsl. With that library imported, you can serialize the elements:

<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xhtml"
    version="1.0">

  <xsl:import href="http://lenzconsulting.com/xml-to-string/xml-to-string.xsl"/>

  <xsl:output method="xml" indent="yes"
    cdata-section-elements="MainDescription KeyConsideration"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
      <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:key name="h1-group" match="xhtml:div/*[not(self::xhtml:h1)]" use="generate-id(preceding-sibling::xhtml:h1[1])"/>

  <xsl:template match="xhtml:div[xhtml:h1]">
      <xsl:apply-templates select="xhtml:h1"/>
  </xsl:template>

  <xsl:template match="xhtml:h1">
      <xsl:element name="{translate(., ' ', '')}">
          <xsl:apply-templates select="key('h1-group', generate-id())" mode="xml-to-string"/>
      </xsl:element>
  </xsl:template>


</xsl:stylesheet>

Getting a CDATA section requires that you know the element names in advance and list them, e.g. <xsl:output cdata-section-elements="MainDescription KeyConsideration"/>, as I have done in the sample above, which is also online at https://xsltfiddle.liberty-development.net/bdxtqy/1.
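
The effect of cdata-section-elements can be seen in isolation with a minimal sketch. The Result wrapper element and the literal text are made up for illustration; the stylesheet ignores its input, so it can be run against any XML document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

  <!-- Text children of MainDescription are written as CDATA sections -->
  <xsl:output method="xml" cdata-section-elements="MainDescription"/>

  <xsl:template match="/">
    <!-- Result is a hypothetical wrapper element -->
    <Result>
      <!-- Listed in cdata-section-elements: text is serialized inside CDATA -->
      <MainDescription>&lt;p&gt;Some text&lt;/p&gt;</MainDescription>
      <!-- Not listed: the same text is serialized with escaped markup -->
      <Skills>&lt;p&gt;Some text&lt;/p&gt;</Skills>
    </Result>
  </xsl:template>

</xsl:stylesheet>
```

When the processor serializes the result itself, the text of MainDescription should come out wrapped in <![CDATA[ ... ]]>, while the text of Skills keeps its &lt; and &gt; escapes. The setting only affects serialization, so it has no effect if the result tree is handed on (for example to a DOM) instead of being written out by the processor.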

Your original elements are in the XHTML namespace, but your desired output has the serialized p elements in no namespace. So you first need to push the elements through a template that strips the namespace and then push the result through the xml-to-string mode. In XSLT 1.0, processing that intermediate result tree fragment additionally requires an extension function such as exsl:node-set:

<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="xhtml exsl"
    version="1.0">

  <xsl:import href="http://lenzconsulting.com/xml-to-string/xml-to-string.xsl"/>

  <xsl:output method="xml" indent="yes"
    cdata-section-elements="MainDescription KeyConsideration"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
      <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:key name="h1-group" match="xhtml:div/*[not(self::xhtml:h1)]" use="generate-id(preceding-sibling::xhtml:h1[1])"/>

  <xsl:template match="xhtml:div[xhtml:h1]">
      <xsl:apply-templates select="xhtml:h1"/>
  </xsl:template>

  <xsl:template match="xhtml:h1">
      <xsl:element name="{translate(., ' ', '')}">
          <xsl:variable name="rtf-with-xhtml-ns-stripped">
              <xsl:apply-templates select="key('h1-group', generate-id())"/>
          </xsl:variable>
          <xsl:apply-templates select="exsl:node-set($rtf-with-xhtml-ns-stripped)/node()" mode="xml-to-string"/>
      </xsl:element>
  </xsl:template>

  <xsl:template match="xhtml:p">
      <p>
          <xsl:apply-templates/>
      </p>
  </xsl:template>

</xsl:stylesheet>

Online at https://xsltfiddle.liberty-development.net/bdxtqy/2.
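
Note that the xhtml:p template only moves the p element itself out of the XHTML namespace; nested elements such as img are still copied by the identity template and keep their namespace. One possible generalization, offered as a sketch under that assumption, is a template that matches any XHTML element and recreates it under its local name:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xhtml"
    version="1.0">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copies attributes, text and non-XHTML elements -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any XHTML element: recreate it in no namespace, keeping its
       attributes and processing its children the same way -->
  <xsl:template match="xhtml:*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@* | node()"/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

Run on its own, this sketch copies the input with every XHTML element moved to no namespace. Inside the stylesheet above, the xhtml:* template would take the place of the xhtml:p template; the more specific templates for xhtml:h1 and xhtml:div[xhtml:h1] still apply because they have a higher default priority.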