Removing line breaks and broken entities using XSL

Posted 2019-09-21 11:58

My XML is generated from a web form, and some users are inserting line breaks and other characters that end up in the data as literal \n text and broken entities, like &

I'm using some variables to convert and remove bad characters, but I don't know how to strip out these types of characters.

Here's the method I'm using to convert or strip out other bad characters. Let me know if you need to see the entire XSL. …

<xsl:variable name="smallcase" select="'abcdefghijklmnopqrstuvwxyz_aaea'" />
<xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ äãêÂ.,'" />
<xsl:variable name="linebreaks" select="'\n'" />
<xsl:variable name="nolinebreaks" select="' '" />

<xsl:value-of select="translate(Surname, $uppercase, $smallcase)"/>
<xsl:value-of select="translate(normalize-space(Office_photos), $linebreaks, $nolinebreaks)"/>

The text in the XML contains content like this:

<Office_photos>bn_1.jpg: Showing a little Red Sox Pride!&#13;\nLeft to right: 
 Tessa Michelle Summers, \nJulie Gross, Alexis Drzewiecki</Office_photos>

I'm trying to get rid of the \n characters inside the data.

1 answer
可以哭但决不认输i
#2 · 2019-09-21 12:31

As Lingamurthy CS explains in the comments, \n is not treated as a single character in XML. It is simply parsed as two characters, a backslash followed by the letter n, without any special handling.
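If the data also contains real line-break characters (the sample includes the character reference &#13;, a carriage return), those can be matched in XSLT by writing them as character references rather than as \n. A minimal sketch, assuming the input document's root element is Office_photos; the variable name reallinebreaks is made up for illustration:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text" />

   <!-- Real line-break characters written as character references:
        &#10; is a line feed (LF), &#13; is a carriage return (CR) -->
   <xsl:variable name="reallinebreaks" select="'&#10;&#13;'" />

   <xsl:template match="/">
      <!-- translate() works one character at a time: LF and CR each
           become a space. normalize-space() would also collapse them,
           since it treats LF and CR as whitespace. -->
      <xsl:value-of select="translate(Office_photos, $reallinebreaks, '  ')" />
   </xsl:template>
</xsl:stylesheet>
```

This leaves the literal two-character \n text untouched, because translate() cannot match a sequence of characters; that part still needs a string replacement.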

If this is literally what you want to change, though, then in XSLT 1.0 you will need to use a recursive template to replace the text (XSLT 2.0 has a replace() function; XSLT 1.0 doesn't).
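For comparison, here is a rough sketch of the XSLT 2.0 route, assuming an XSLT 2.0 processor is available and that Office_photos is the root element of the input. Because replace() takes a regular expression, the backslash has to be escaped in the pattern:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text" />

   <xsl:template match="/">
      <!-- The regex \\n matches a literal backslash followed by "n".
           A single \n in the pattern would match a real line feed instead. -->
      <xsl:value-of select="replace(Office_photos, '\\n', ' ')" />
   </xsl:template>
</xsl:stylesheet>
```

An XSLT 1.0 processor will not accept this, since replace() is not part of XPath 1.0.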

A quick search on Stack Overflow finds one such template in the question "XSLT string replace".

To call it, instead of doing this:

<xsl:value-of select="translate(normalize-space(Office_photos), $linebreaks, $nolinebreaks)"/>

You would just do this

  <xsl:call-template name="string-replace-all">
     <xsl:with-param name="text" select="Office_photos" />
     <xsl:with-param name="replace" select="$linebreaks" />
     <xsl:with-param name="by" select="$nolinebreaks" /> 
  </xsl:call-template>

Try this XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output omit-xml-declaration="yes" indent="yes" />

   <xsl:variable name="linebreaks" select="'\n'" />
   <xsl:variable name="nolinebreaks" select="' '" />

   <xsl:template match="/">
      <xsl:call-template name="string-replace-all">
         <xsl:with-param name="text" select="Office_photos" />
         <xsl:with-param name="replace" select="$linebreaks" />
         <xsl:with-param name="by" select="$nolinebreaks" /> 
      </xsl:call-template>
   </xsl:template>

   <xsl:template name="string-replace-all">
     <xsl:param name="text" />
     <xsl:param name="replace" />
     <xsl:param name="by" />
     <xsl:choose>
       <xsl:when test="contains($text, $replace)">
         <xsl:value-of select="substring-before($text,$replace)" />
         <xsl:value-of select="$by" />
         <xsl:call-template name="string-replace-all">
           <xsl:with-param name="text" select="substring-after($text,$replace)" />
           <xsl:with-param name="replace" select="$replace" />
           <xsl:with-param name="by" select="$by" />
         </xsl:call-template>
       </xsl:when>
       <xsl:otherwise>
         <xsl:value-of select="$text" />
       </xsl:otherwise>
     </xsl:choose>
   </xsl:template>
</xsl:stylesheet>

(Credit to Mark Elliot, who created the replace template.)
