XSLT 1.0 lacks the regular-expression functions of XSLT 2.0. Is there a non-regex way to replace multiple fields in a node of a source XML document, for example to convert:
<?xml version="1.0" encoding="utf-8"?>
<xliff xmlns:xliff="urn:oasis:names:tc:xliff:document:1.1" version="1.1">
<file>
<source>abc [[field1]] def [[field2]] ghi</source>
</file>
</xliff>
to:
<?xml version="1.0" encoding="utf-8"?>
<xliff xmlns:xliff="urn:oasis:names:tc:xliff:document:1.1" version="1.1">
<file>
<source>abc F def F ghi</source>
</file>
</xliff>
I. XSLT 1.0 Solution:
This transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="pTargetStart" select="'[['"/>
<xsl:param name="pTargetEnd" select="']]'"/>
<xsl:param name="pReplacement" select="'F'"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="source/text()" name="replace">
<xsl:param name="pText" select="."/>
<xsl:param name="pTargetStart" select="$pTargetStart"/>
<xsl:param name="pTargetEnd" select="$pTargetEnd"/>
<xsl:param name="pRep" select="$pReplacement"/>
<xsl:choose>
<xsl:when test=
"not(contains($pText, $pTargetStart)
and
contains($pText, $pTargetEnd)
)
or
not(contains(substring-after($pText, $pTargetStart),
$pTargetEnd
)
)
">
<xsl:value-of select="$pText"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="substring-before($pText, $pTargetStart)"/>
<xsl:value-of select="$pRep"/>
<xsl:variable name="vremText" select=
"substring-after(substring-after($pText, $pTargetStart),
$pTargetEnd
)"/>
<xsl:call-template name="replace">
<xsl:with-param name="pText" select="$vremText"/>
<xsl:with-param name="pTargetStart" select="$pTargetStart"/>
<xsl:with-param name="pTargetEnd" select="$pTargetEnd"/>
<xsl:with-param name="pRep" select="$pRep"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<xliff xmlns:xliff="urn:oasis:names:tc:xliff:document:1.1" version="1.1">
<file>
<source>abc [[field1]] def [[field2]] ghi</source>
</file>
</xliff>
produces the wanted, correct result:
<xliff xmlns:xliff="urn:oasis:names:tc:xliff:document:1.1" version="1.1">
<file>
<source>abc F def F ghi</source>
</file>
</xliff>
II. XSLT 2.0 Solution (just for comparison):
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="source/text()">
<xsl:sequence select="replace(., '\[\[(.*?)\]\]', 'F')"/>
</xsl:template>
</xsl:stylesheet>
EXSLT has some useful functions for this. If you only need to replace simple, fixed strings, try str:replace. An XSLT 1.0 template implementation of it is also provided, for processors that lack the function.
EDIT 1
I just realized Dimitre's version uses recursion and is quite similar; so my opening sentence seems silly now.
Here's a version that uses recursion:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:variable name="fld-beg" select="'[['"/>
<xsl:variable name="fld-end" select="']]'"/>
<xsl:variable name="replacement" select="'F'"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="source/text()">
<xsl:call-template name="replace">
<xsl:with-param name="str" select="."/>
</xsl:call-template>
</xsl:template>
<xsl:template name="replace">
<xsl:param name="str"/>
<xsl:choose>
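<!-- Look for the end delimiter only after the start delimiter, so a stray ']]' earlier in the text cannot cause endless recursion -->
<xsl:when test="contains(substring-after($str, $fld-beg), $fld-end)">
<xsl:call-template name="replace">
<xsl:with-param name="str" select="concat(
substring-before($str, $fld-beg),
$replacement,
substring-after(substring-after($str, $fld-beg), $fld-end))"/>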
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$str"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
match="source/text()" matches the text in the 'source' node as one string and passes it to the named template 'replace'. 'replace' looks for the beginning and ending delimiters ('[[' and ']]'); if both are found, it splits the text at the delimiters (discarding them and whatever lies between them), inserts the replacement string, and passes the result to itself to repeat the process.
I say "split", but given the lack of a real split() in XPath 1.0, we can get by teaming up substring-before() and substring-after().
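As a small illustration of that pairing, here is a minimal standalone sketch. The sample string in `$s` and the `##` delimiter are made up for the example and are not part of the stylesheet above. It prints what each function returns, including the case where the delimiter is absent: both functions then return an empty string, which is why the template tests with contains() before splitting.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Runs against any input document; the input itself is ignored -->
  <xsl:template match="/">
    <!-- Hypothetical sample string, not taken from the input -->
    <xsl:variable name="s" select="'abc [[field1]] def'"/>

    <!-- Everything before the first '[[' -->
    <xsl:value-of select="concat('before=[', substring-before($s, '[['), ']&#10;')"/>
    <!-- Everything after the first ']]' -->
    <xsl:value-of select="concat('after=[', substring-after($s, ']]'), ']&#10;')"/>
    <!-- Delimiter not present: the result is the empty string -->
    <xsl:value-of select="concat('missing=[', substring-before($s, '##'), ']&#10;')"/>
  </xsl:template>
</xsl:stylesheet>
```

This should print `before=[abc ]`, `after=[ def]` and `missing=[]`, one per line.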
Given the text in the source, 'abc [[field1]] def [[field2]] ghi', the recursion goes like this, showing how it's split, replaced, and passed:
- 'abc ' + 'F' + ' def [[field2]] ghi', passed again into 'replace'
- 'abc F def ' + 'F' + ' ghi', passed again into 'replace'
- since the delimiters are no longer present, 'abc F def F ghi' is passed back up to match="source/text()"
Here's how it looks with xsltproc:
$ xsltproc so.xsl so.xml
<?xml version="1.0"?>
<xliff xmlns:xliff="urn:oasis:names:tc:xliff:document:1.1" version="1.1">
<file>
<source>abc F def F ghi</source>
</file>
</xliff>
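If you want to watch the recursion happen, one option is to add an xsl:message to the named template. The following is only a sketch of that idea: the template name 'replace-traced' is made up for this example, and the test looks for the end delimiter only after the start delimiter (my own precaution against a stray ']]' earlier in the text). Where the messages end up depends on the processor; most write them to standard error.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:variable name="fld-beg" select="'[['"/>
  <xsl:variable name="fld-end" select="']]'"/>
  <xsl:variable name="replacement" select="'F'"/>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="source/text()">
    <xsl:call-template name="replace-traced">
      <xsl:with-param name="str" select="."/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical name; same recursion as 'replace', plus a trace message -->
  <xsl:template name="replace-traced">
    <xsl:param name="str"/>
    <!-- Report the string received by this call -->
    <xsl:message>replace-traced: '<xsl:value-of select="$str"/>'</xsl:message>
    <xsl:choose>
      <!-- True only if an end delimiter follows a start delimiter -->
      <xsl:when test="contains(substring-after($str, $fld-beg), $fld-end)">
        <xsl:call-template name="replace-traced">
          <xsl:with-param name="str" select="concat(
            substring-before($str, $fld-beg),
            $replacement,
            substring-after(substring-after($str, $fld-beg), $fld-end))"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- No complete field left: emit the finished string -->
        <xsl:value-of select="$str"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

For the sample document this should produce three messages, one per call: the original text, then 'abc F def [[field2]] ghi', then 'abc F def F ghi'. The XML output is unchanged.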
I hope this helps.
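If your processor is Java-based and supports Java extension functions, you can call Java from inside the stylesheet. This is processor-specific and not portable XSLT 1.0. An example for replaceAll: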
<xsl:template name="replace_all" xmlns:string="java.lang.String">
<xsl:param name="text"/>
<xsl:param name="pattern"/>
<xsl:param name="replace"/>
<xsl:variable name="text_string" select="string:new($text)"/>
<xsl:value-of select="string:replaceAll($text_string, $pattern, $replace)"/>
</xsl:template>
pattern is a regular expression in Java syntax. For further information, see the String javadoc.
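For the document in the question, the template could be called roughly as follows. This is a sketch under assumptions: it presumes a processor that accepts the `xmlns:string="java.lang.String"` binding used above, and the pattern `\[\[.*?\]\]` and the `string(.)` conversion are my own choices rather than anything the processor requires.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:string="java.lang.String"
    exclude-result-prefixes="string">

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="source/text()">
    <xsl:call-template name="replace_all">
      <!-- Pass a plain string rather than a node-set -->
      <xsl:with-param name="text" select="string(.)"/>
      <!-- Java regex: '[[', then as few characters as possible, then ']]' -->
      <xsl:with-param name="pattern" select="'\[\[.*?\]\]'"/>
      <xsl:with-param name="replace" select="'F'"/>
    </xsl:call-template>
  </xsl:template>

  <!-- The template from above, unchanged apart from the namespace being declared on the root -->
  <xsl:template name="replace_all">
    <xsl:param name="text"/>
    <xsl:param name="pattern"/>
    <xsl:param name="replace"/>
    <xsl:variable name="text_string" select="string:new($text)"/>
    <xsl:value-of select="string:replaceAll($text_string, $pattern, $replace)"/>
  </xsl:template>
</xsl:stylesheet>
```

One thing to keep in mind: in String.replaceAll the replacement string is not literal either. `$` and `\` have special meaning there, so a replacement containing them needs escaping.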