XSLT: Remove excess whitespace characters preservi

Posted 2020-04-23 06:51

So my problem is this. I have a transform document which is used in many places, and generically handles a lot of small formatting transforms. In one specific case, I need to remove whitespace from the result. The output looks something like:

'\n         <I>Something</I>Very Significant With A Superscript Note<SUP>1</SUP>\n      '

I've tried variations on:

<xsl:template match="no_whitespace">
    <xsl:variable name="result">
        <xsl:apply-templates/>
    </xsl:variable>
    <xsl:copy-of select="normalize-space($result)"/>
</xsl:template>

but the subnodes are stripped from the output. I have to be very careful not to set up any universal templates like 'text()' as it'll interfere with the general processing of the transform. It seems like I'm missing something obvious here.

EDIT: Tried writing an identity transform as suggested by Stefan-Hegny.

<xsl:template match="title_full">
    <xsl:apply-templates mode="stripwhitespace"/>
</xsl:template>

<xsl:template match="text()" mode="stripwhitespace">
    <xsl:value-of select="normalize-space(translate(., '\n', ''))"/>
</xsl:template>

<xsl:template match="/ | @* | *" mode="stripwhitespace">
    <xsl:apply-templates select="."/>
</xsl:template>

This solved my issue, which was to remove whitespace and newlines at the highest level of the tag, then allow transforms to proceed normally. Apologies for the murkily-constructed question, and thanks for your assistance.

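EDIT the second: The use of 'translate' doesn't work as I expected. It works character by character, so translate(., '\n', '') deletes every backslash and every letter n rather than the two-character sequence. I used a template that replaces substrings instead.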
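
To illustrate the character-by-character behaviour, here is a minimal sketch, assuming an XSLT 1.0 processor. The sample string is made up for the illustration, and the stylesheet can be run against any input document:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:template match="/">
        <!-- Inside an XPath string, '\n' is two characters: a backslash and the letter n. -->
        <!-- translate() deletes each of them wherever it occurs, not only as a pair. -->
        <xsl:value-of select="translate('plain\ntext', '\n', '')"/>
        <!-- outputs: plaitext -->
    </xsl:template>
</xsl:stylesheet>

The n in "plain" is lost along with the backslash sequence, which is why a substring-based replacement is needed when the text really contains a literal backslash followed by n.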

3 answers
Explosion°爆炸
#2 · 2020-04-23 07:09

It could just be indentation in the output. Do you have <xsl:output indent="yes"/> at the top of your XSL? Or maybe the processor is enforcing indentation. Using <xsl:output indent="no"/> should soak up all the \n and indentation.

疯言疯语
#3 · 2020-04-23 07:12

When you use normalize-space, only the string value of your fragment is used, and thus the sub-nodes are stripped. You'd have to put the normalize-space into templates for the sub-nodes as well (those that are applied by your <xsl:apply-templates/>).
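
A minimal sketch of that idea, assuming XSLT 1.0 and assuming the child elements should simply be copied through. The mode name "normalized" is hypothetical; the mode keeps the text() rule from touching the rest of the transform:

<xsl:template match="no_whitespace">
    <xsl:apply-templates mode="normalized"/>
</xsl:template>

<!-- Elements keep their markup; their children are processed in the same mode -->
<xsl:template match="*" mode="normalized">
    <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates mode="normalized"/>
    </xsl:copy>
</xsl:template>

<!-- Only text nodes are normalized, so no element is flattened to a string -->
<xsl:template match="text()" mode="normalized">
    <xsl:value-of select="normalize-space(.)"/>
</xsl:template>

Note that normalize-space trims each text node separately, so a single space sitting directly next to an inline element such as <I> or <SUP> is removed as well.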

倾城 Initia
#4 · 2020-04-23 07:28

I've got two options:

If the \n is not literal, then do this:

<xsl:template match="text()[ancestor-or-self::no_whitespace]">
    <xsl:value-of select="normalize-space(.)"/>
</xsl:template>

to clean up all white space at and below the no_whitespace tag.

If the \n is a literal in the string, then it gets a bit more complex to get rid of the \n. Use this:

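<xsl:template name="strip_newline">
    <xsl:param name="string"/>
    <xsl:choose>
        <xsl:when test="contains($string,'\n')">
            <!-- Emit the text before the literal \n, then recurse on the rest -->
            <xsl:value-of select="substring-before($string,'\n')"/>
            <xsl:call-template name="strip_newline">
                <xsl:with-param name="string" select="substring-after($string,'\n')"/>
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <!-- No \n left: emit the remainder, which would otherwise be lost -->
            <xsl:value-of select="$string"/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>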

<xsl:template match="text()[ancestor-or-self::no_whitespace]">
    <xsl:value-of select="normalize-space(.)"/>
</xsl:template>

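<!-- Explicit priority so this rule wins over the plain text() rule above, which has the same default priority -->
<xsl:template match="text()[ancestor-or-self::no_whitespace][contains(.,'\n')]" priority="1">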
    <xsl:variable name="cleantext">
        <xsl:call-template name="strip_newline">
            <xsl:with-param name="string" select="."/>
        </xsl:call-template>
    </xsl:variable>
    <xsl:value-of select="normalize-space($cleantext)"/>
</xsl:template>

In both cases, I'm assuming you already have an identity template in place elsewhere in your XSL:

<xsl:template match="node()|@*">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>