How to split a string in XML

Published 2020-02-02 02:50

Question:

I have this XSL stylesheet:

<xsl:stylesheet  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output indent="yes"/>

    <xsl:template match="dataroot">
        <xml><xsl:apply-templates/></xml>
    </xsl:template>

    <xsl:template match="M_17">
        <package id="{package_id}" cat="{cat}">
            <nazwa><xsl:value-of select="nazwa"/></nazwa>
            <xsl:if test="author"><author><xsl:value-of select="author"/></author></xsl:if>
            <xsl:if test="www"><www><xsl:value-of select="translate(www,'#','')"/></www></xsl:if>
            <xsl:if test="opis"><opis><xsl:value-of select="opis"/></opis></xsl:if>
            <xsl:if test="img"><img><xsl:value-of select="translate(img,'#','')"/></img></xsl:if>

            <xsl:if test="depends"><depends><xsl:value-of select="depends"/></depends></xsl:if>
            <xsl:if test="conflicts"><conflicts><xsl:value-of select="conflicts"/></conflicts></xsl:if>
            <xsl:if test="after"><after><xsl:value-of select="after"/></after></xsl:if>
            <xsl:if test="replaces"><replaces><xsl:value-of select="replaces"/></replaces></xsl:if>
        </package>
    </xsl:template>

</xsl:stylesheet>

But when an element such as depends contains two values, this stylesheet outputs them in a single element:

<depends>modloader com1node</depends>

whereas I want the XSL to produce:

<depends>modloader</depends>
<depends>com1node</depends>

This should happen for depends, conflicts, after and replaces.

How can I split these strings (when they occur in the source XML) into separate elements, one value per element, as in the example above?

Part of the source XML:

<?xml version="1.0" encoding="UTF-8"?>
<dataroot xmlns:od="urn:schemas-microsoft-com:officedata" generated="2014-05-11T15:51:32">
    <Mnc_172>
        <ID>1</ID>
        <package_id>minecraft</package_id>
        <cat>lib</cat>
        <www>#http://minecraft.net/#</www>
        <nazwa>Minecraft</nazwa>
        <author>Mojang</author>
        <opis>Game - build your own world!</opis>
        <img>#/mc.png#</img>
    </Mnc_172>
    <Mnc_172>
        <ID>2</ID>
        <package_id>modloader</package_id>
        <cat>lib</cat>
        <www>#http://minecraftforum.net/topic/75440-x/#</www>
        <nazwa>ModLoader</nazwa>
        <author>Risugami</author>
        <opis>ModLoader - library to load mods</opis>
        <img>#/gen.png#</img>
        <replaces>modL forging</replaces>
    </Mnc_172>
    ...
</dataroot>

Answer 1:

The XML does not match your XSLT: M_17 vs. Mnc_172. Anyway, in XSLT 1.0 you need to use a recursive template to tokenize the contents. So try changing:

<depends><xsl:value-of select="depends"/></depends>

to:

<xsl:call-template name="tokenize">
    <xsl:with-param name="text" select="depends"/>
    <xsl:with-param name="elemName" select="'depends'"/>
</xsl:call-template>

and add the following template to your stylesheet:

<xsl:template name="tokenize">
    <xsl:param name="text"/>
    <xsl:param name="elemName"/>
    <xsl:param name="sep" select="' '"/>
    <xsl:choose>
        <xsl:when test="contains($text, $sep)">
            <xsl:element name="{$elemName}">
                <xsl:value-of select="substring-before($text, $sep)"/>
            </xsl:element>
            <!-- recursive call -->
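            <xsl:call-template name="tokenize">
                <xsl:with-param name="text" select="substring-after($text, $sep)" />
                <xsl:with-param name="elemName" select="$elemName" />
                <!-- pass the separator along so a non-default $sep survives the recursion -->
                <xsl:with-param name="sep" select="$sep" />
            </xsl:call-template>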
        </xsl:when>
        <xsl:otherwise>
            <xsl:element name="{$elemName}">
                <xsl:value-of select="$text"/>
            </xsl:element>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>
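
The call above has to be repeated for each of the four elements. As a rough sketch only, the stylesheet below wires the same named template to all four through a single match template. The match pattern dataroot/* is an assumption made here so the example runs against the sample XML (it treats every child of dataroot as one package record), and the normalize-space() guard is an addition that keeps empty elements or doubled spaces from producing empty output elements. The other fields (nazwa, author and so on) are left out for brevity.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output indent="yes"/>

    <xsl:template match="dataroot">
        <xml><xsl:apply-templates select="*"/></xml>
    </xsl:template>

    <!-- Assumption: every child element of dataroot is one package record -->
    <xsl:template match="dataroot/*">
        <package id="{package_id}" cat="{cat}">
            <xsl:apply-templates select="depends | conflicts | after | replaces"/>
        </package>
    </xsl:template>

    <!-- One template for all four list-valued elements -->
    <xsl:template match="depends | conflicts | after | replaces">
        <!-- skip elements that are empty or contain only whitespace -->
        <xsl:if test="normalize-space(.)">
            <xsl:call-template name="tokenize">
                <!-- trim the ends and collapse runs of whitespace to one space -->
                <xsl:with-param name="text" select="normalize-space(.)"/>
                <xsl:with-param name="elemName" select="name()"/>
            </xsl:call-template>
        </xsl:if>
    </xsl:template>

    <!-- Recursive tokenizer: emits one element per separator-delimited value -->
    <xsl:template name="tokenize">
        <xsl:param name="text"/>
        <xsl:param name="elemName"/>
        <xsl:param name="sep" select="' '"/>
        <xsl:choose>
            <xsl:when test="contains($text, $sep)">
                <xsl:element name="{$elemName}">
                    <xsl:value-of select="substring-before($text, $sep)"/>
                </xsl:element>
                <xsl:call-template name="tokenize">
                    <xsl:with-param name="text" select="substring-after($text, $sep)"/>
                    <xsl:with-param name="elemName" select="$elemName"/>
                    <xsl:with-param name="sep" select="$sep"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:element name="{$elemName}">
                    <xsl:value-of select="$text"/>
                </xsl:element>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>
```

Run against the sample XML from the question, the second record (modloader) should come out with two replaces elements, one containing modL and one containing forging.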


Answer 2:

XSLT 2.0 has a built-in tokenize() function for splitting strings, but in XSLT 1.0 you have to be more creative. The way I would usually attack something like this is with a recursive template that does something with the text before the first space and then calls itself with the remaining text, stopping when it runs out.

<xsl:stylesheet  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output indent="yes"/>

    <xsl:template match="dataroot">
        <xml><xsl:apply-templates/></xml>
    </xsl:template>

    <xsl:template match="M_17">
        <package id="{package_id}" cat="{cat}">
            <nazwa><xsl:value-of select="nazwa"/></nazwa>
            <xsl:if test="author"><author><xsl:value-of select="author"/></author></xsl:if>
            <xsl:if test="www"><www><xsl:value-of select="translate(www,'#','')"/></www></xsl:if>
            <xsl:if test="opis"><opis><xsl:value-of select="opis"/></opis></xsl:if>
            <xsl:if test="img"><img><xsl:value-of select="translate(img,'#','')"/></img></xsl:if>

            <xsl:apply-templates select="depends | conflicts | after | replaces" />
        </package>
    </xsl:template>

    <xsl:template match="depends | conflicts | after | replaces">
        <xsl:param name="text" select="concat(normalize-space(), ' ')" />
        <xsl:if test="$text">
            <xsl:copy>
                <xsl:value-of select="substring-before($text, ' ')" />
            </xsl:copy>
            <xsl:apply-templates select=".">
                <xsl:with-param name="text" select="substring-after($text, ' ')" />
            </xsl:apply-templates>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>

The trick here is what we do with the text parameter. Initially I'm setting it to concat(normalize-space(), ' '), which means the whole text of the target element with

  • leading and trailing whitespace removed
  • internal whitespace normalized to a single space character and
  • one trailing space added

So $text is initially word1-space-word2-space-...-wordN-space

Now at each step we create a new element with the same name as the original one and with the first word of $text as its content. We then recurse, passing everything after the first space to the next step (i.e. word2-space-...-wordN-space). Eventually we reach the point where $text is just wordN-space, at which point we produce an element for wordN. The next call then receives an empty string, because substring-after($text, ' ') is empty, so the xsl:if test fails and the recursion stops.

Note that

<xsl:copy>
    <xsl:value-of select="substring-before($text, ' ')" />
</xsl:copy>

will copy the namespace declarations that are in scope on the input element (here, the xmlns:od declaration from dataroot). This is harmless, but you may find it looks a bit messy. To avoid this you could use

<xsl:element name="{local-name()}">
    <xsl:value-of select="substring-before($text, ' ')" />
</xsl:element>

instead.
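
For comparison, here is a minimal sketch of the XSLT 2.0 route mentioned at the start of this answer. It assumes an XSLT 2.0 processor (Saxon, for example) is available and that the template is dropped into the stylesheet above in place of the recursive one, with version changed to 2.0. The variable name elemName is just an illustrative choice.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

    <xsl:template match="depends | conflicts | after | replaces">
        <!-- capture the name first: inside for-each the context item
             is a string, not the original element -->
        <xsl:variable name="elemName" select="local-name()"/>
        <!-- tokenize() returns one string per space-separated word;
             an empty input yields an empty sequence, so nothing is emitted -->
        <xsl:for-each select="tokenize(normalize-space(.), ' ')">
            <xsl:element name="{$elemName}">
                <xsl:value-of select="."/>
            </xsl:element>
        </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>
```

No recursion or trailing-space trick is needed here, since tokenize() does the splitting and xsl:for-each simply iterates over the resulting words.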