Why does XSLT output all text by default?

Posted 2019-01-01 07:22

Question:

Hi, I performed a transformation that drops a tag if it is null.

I wanted to check whether my transformation works correctly, so instead of checking it manually, I wrote a second XSLT that just checks for the presence of that particular tag in the output XML; if it is null, the second XSLT should output the text "FOUND". (I don't actually need XML output; I am just using XSLT for searching.)

When I tried this XSL code:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/SiebelMessage//SuppressCalendar[.!='']">
      FOUND
  </xsl:template>
</xsl:stylesheet>

It outputs all the text data that is present in the XML file.

To avoid that, I had to write this code:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/SiebelMessage//SuppressCalendar[.!='']">
      FOUND
  </xsl:template>
  <xsl:template match="text()"/>
</xsl:stylesheet>

Why did the former code output the text? Why do I have to tell XSLT to ignore all other text? Is that the behavior of all XML parsers or only of my own (I am using the MSXML parser)?

Answer 1:

Why did the former code output the text? Why do I have to tell XSLT to ignore all other text? Is that the behavior of all XML parsers or only of my own?

You are discovering one of the most fundamental XSLT features as specified in the Specification: the built-in templates of XSLT.

From the Spec:

There is a built-in template rule to allow recursive processing to continue in the absence of a successful pattern match by an explicit template rule in the stylesheet. This template rule applies to both element nodes and the root node. The following shows the equivalent of the built-in template rule:

<xsl:template match="*|/">
  <xsl:apply-templates/>
</xsl:template>

There is also a built-in template rule for each mode, which allows recursive processing to continue in the same mode in the absence of a successful pattern match by an explicit template rule in the stylesheet. This template rule applies to both element nodes and the root node. The following shows the equivalent of the built-in template rule for mode m.

<xsl:template match="*|/" mode="m">
  <xsl:apply-templates mode="m"/>
</xsl:template>

There is also a built-in template rule for text and attribute nodes that copies text through:

<xsl:template match="text()|@*">
  <xsl:value-of select="."/>
</xsl:template>

The built-in template rule for processing instructions and comments is to do nothing.

<xsl:template match="processing-instruction()|comment()"/>

The built-in template rule for namespace nodes is also to do nothing. There is no pattern that can match a namespace node; so, the built-in template rule is the only template rule that is applied for namespace nodes.

The built-in template rules are treated as if they were imported implicitly before the stylesheet and so have lower import precedence than all other template rules. Thus, the author can override a built-in template rule by including an explicit template rule.

So, the reported behavior is the result of applying two of the built-in templates: the rule for element and root nodes, which keeps the recursion going, and the rule for text and attribute nodes, which copies the text to the output.

It is a good XSLT design pattern to override the built-in templates with your own that issue an error message whenever they are called, so that the programmer immediately knows the transformation is "leaking":
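For a pure search like the one in the question, one way to sidestep the built-in rules altogether is to match the root node and never call xsl:apply-templates, so no recursion starts and no text node is ever processed. A minimal sketch, assuming XSLT 1.0 and the element names from the question; the xsl:output declaration with method="text" is an addition here and not part of the original post:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumption: plain-text output is enough, since only "FOUND" is wanted -->
  <xsl:output method="text"/>

  <!-- This explicit rule replaces the built-in rule for the root node.
       It contains no xsl:apply-templates, so the built-in rules for
       elements and text nodes are never invoked. -->
  <xsl:template match="/">
    <xsl:if test="/SiebelMessage//SuppressCalendar[. != '']">FOUND</xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

With this shape the stylesheet prints FOUND at most once, however many non-empty SuppressCalendar elements exist, which may or may not be what a given check needs.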

For example, if there is this XML document:

<a>
  <b>
    <c>Don't want to see this</c>
  </b>
</a>

and it is processed with this transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="a|b">
   <xsl:copy>
      <xsl:attribute name="name">
        <xsl:value-of select="name()"/>
      </xsl:attribute>
      <xsl:apply-templates/>
   </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

the result is:

<a name="a">
   <b name="b">Don't want to see this</b>
</a>

and the programmer will be greatly confused about how the unwanted text appeared.

However, just adding this catch-all template helps avoid any such confusion and catch errors immediately:

 <xsl:template match="*">
  <xsl:message terminate="no">
   WARNING: Unmatched element: <xsl:value-of select="name()"/>
  </xsl:message>

  <xsl:apply-templates/>
 </xsl:template>

Now, besides the confusing output, the programmer gets a warning that explains the problem immediately:

 WARNING: Unmatched element: c
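The catch-all above only covers elements, so the text inside c still reaches the output through the built-in text rule. A possible extension, sketched here as an assumption rather than as part of the original answer, adds a second catch-all for text nodes that warns and emits nothing. The wording of the second message and the use of name(..) to report the parent element are illustrative choices:

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Explicit rule from the example transformation -->
 <xsl:template match="a|b">
   <xsl:copy>
      <xsl:attribute name="name">
        <xsl:value-of select="name()"/>
      </xsl:attribute>
      <xsl:apply-templates/>
   </xsl:copy>
 </xsl:template>

 <!-- Catch-all for elements: warn, then keep descending -->
 <xsl:template match="*">
  <xsl:message terminate="no">
   WARNING: Unmatched element: <xsl:value-of select="name()"/>
  </xsl:message>
  <xsl:apply-templates/>
 </xsl:template>

 <!-- Assumed addition: warn about unmatched text and output nothing -->
 <xsl:template match="text()">
  <xsl:message terminate="no">
   WARNING: Unmatched text inside: <xsl:value-of select="name(..)"/>
  </xsl:message>
 </xsl:template>
</xsl:stylesheet>
```

Because xsl:strip-space removes whitespace-only text nodes, the text warning should fire only for real content such as the text inside c. Both catch-alls have a lower default priority than the a|b rule, so they apply only to nodes that no explicit rule matches.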

Later Addition by Michael Kay for XSLT 3.0

In XSLT 3.0, rather than adding a catch-all template rule, you can specify the fallback behaviour on an xsl:mode declaration. For example, <xsl:mode on-no-match="shallow-skip"/> causes all nodes that are not matched (including text nodes) to be skipped, while <xsl:mode on-no-match="fail"/> treats a no-match as an error, and <xsl:mode warning-on-no-match="true"/> results in a warning.
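Applied to the stylesheet from the question, the xsl:mode approach might look like the sketch below. It assumes an XSLT 3.0 processor is available (MSXML implements only XSLT 1.0, so it would not run there), and the xsl:output declaration is an addition for illustration:

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumption: plain-text output, since only "FOUND" is wanted -->
  <xsl:output method="text"/>

  <!-- Unmatched nodes are skipped, but their children are still visited,
       so nested SuppressCalendar elements can still be matched -->
  <xsl:mode on-no-match="shallow-skip"/>

  <xsl:template match="/SiebelMessage//SuppressCalendar[. != '']">FOUND</xsl:template>
</xsl:stylesheet>
```

This removes the need for the empty text() template from the question, because unmatched text nodes are dropped by the mode's fallback behaviour instead of being copied.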



Answer 2:

There are several built-in template rules in XSLT, one of which is this:

<xsl:template match="text()|@*">
  <xsl:value-of select="."/>
</xsl:template>

It outputs text.



Tags: xslt