I am using XSLT 1.0 to convert some XML into JSON. Unfortunately, some of the XML I'm working with contains HTML markup. Here's an example of the input:
<text>
Kevin Love and Steph Curry can talk about their first-
time starting gigs in the All-Star game Friday night when the Minnesota
Timberwolves visit Oracle Arena to face the Golden State Warriors.
</text>
<continue>
<P>
Love and Curry were two of four first-time All-Star starters when the league
made the announcement on Thursday.
</P>
<P>
Love got a late push to overtake Houston Rockets center Dwight Howard in the
final week of voting.
</P>
<P>
"I think it's a little sweeter this way because I really didn't expect it,"
Love said on a conference call. "I was already humbled by the response the
fans gave me to being very close to the top (frontcourt players). The outreach
by the Minnesota fans and beyond was truly amazing."
</P>
</continue>
The markup is not ideal, and I need to retain the <P> tags in my JSON output. To deal with quotes, I escape them. Here are the variables and the template I use for this:
<!-- Escape double quotes in <continue> and <text> before writing them as JSON strings -->
<xsl:variable name="escaped-continue">
  <xsl:call-template name="replace-string">
    <xsl:with-param name="text" select="continue"/>
    <xsl:with-param name="replace" select="'&quot;'"/>
    <xsl:with-param name="with" select="'\&quot;'"/>
  </xsl:call-template>
</xsl:variable>
<xsl:variable name="escaped-text">
  <xsl:call-template name="replace-string">
    <xsl:with-param name="text" select="text"/>
    <xsl:with-param name="replace" select="'&quot;'"/>
    <xsl:with-param name="with" select="'\&quot;'"/>
  </xsl:call-template>
</xsl:variable>
<!-- Recursively replace every occurrence of $replace in $text with $with -->
<xsl:template name="replace-string">
  <xsl:param name="text"/>
  <xsl:param name="replace"/>
  <xsl:param name="with"/>
  <xsl:choose>
    <xsl:when test="contains($text,$replace)">
      <xsl:value-of select="substring-before($text,$replace)"/>
      <xsl:value-of select="$with"/>
      <xsl:call-template name="replace-string">
        <xsl:with-param name="text"
                        select="substring-after($text,$replace)"/>
        <xsl:with-param name="replace" select="$replace"/>
        <xsl:with-param name="with" select="$with"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$text"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
I then simply use something like the following to output JSON:
{
"text": "<xsl:value-of select="normalize-space($escaped-text)"/>",
"continue": "<xsl:value-of select="normalize-space($escaped-continue)"/>"
}
The issue I have here is that the output looks like this:
{
"text": "Kevin Love and Steph Curry can talk about their first- time starting gigs in the All-Star game Friday night when the Minnesota Timberwolves visit Oracle Arena to face the Golden State Warriors.",
"continue": "Love and Curry were two of four first-time All-Star starters when the league made the announcement on Thursday. Love got a late push to overtake Houston Rockets center Dwight Howard in the final week of voting. \"I think it's a little sweeter this way because I really didn't expect it,\" Love said on a conference call. \"I was already humbled by the response the fans gave me to being very close to the top (frontcourt players). The outreach by the Minnesota fans and beyond was truly amazing.\""
}
As you can see, the double quotes are properly escaped. However, the <P> tags are missing: they appear to have been stripped, or parsed directly by the XSLT processor and then suppressed by normalize-space(). What's the best way to get the <P> tags back into my output?
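For reference, here is a minimal sketch of one direction that might work. It rests on the observation that selecting an element such as `continue` and treating it as a string yields only its concatenated text nodes, so the tags are already gone before `normalize-space()` runs. The sketch walks the child nodes in a separate mode and writes each `<P>` back out as literal text. The mode name `json` and the assumption that the document element directly contains `<text>` and `<continue>` are illustrative and not taken from my real stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumption: the document element is the direct parent of <text> and <continue> -->
  <xsl:template match="/*">
    <xsl:text>{ "text": "</xsl:text>
    <xsl:apply-templates select="text/node()" mode="json"/>
    <xsl:text>", "continue": "</xsl:text>
    <xsl:apply-templates select="continue/node()" mode="json"/>
    <xsl:text>" }</xsl:text>
  </xsl:template>

  <!-- Write each P element back out as literal tag text around its children -->
  <xsl:template match="P" mode="json">
    <xsl:text>&lt;P&gt;</xsl:text>
    <xsl:apply-templates mode="json"/>
    <xsl:text>&lt;/P&gt;</xsl:text>
  </xsl:template>

  <!-- Escape double quotes in each text node, then collapse its whitespace -->
  <xsl:template match="text()" mode="json">
    <xsl:variable name="escaped">
      <xsl:call-template name="replace-string">
        <xsl:with-param name="text" select="."/>
        <xsl:with-param name="replace" select="'&quot;'"/>
        <xsl:with-param name="with" select="'\&quot;'"/>
      </xsl:call-template>
    </xsl:variable>
    <xsl:value-of select="normalize-space($escaped)"/>
  </xsl:template>

  <!-- Same recursive replace template as in the question -->
  <xsl:template name="replace-string">
    <xsl:param name="text"/>
    <xsl:param name="replace"/>
    <xsl:param name="with"/>
    <xsl:choose>
      <xsl:when test="contains($text,$replace)">
        <xsl:value-of select="substring-before($text,$replace)"/>
        <xsl:value-of select="$with"/>
        <xsl:call-template name="replace-string">
          <xsl:with-param name="text" select="substring-after($text,$replace)"/>
          <xsl:with-param name="replace" select="$replace"/>
          <xsl:with-param name="with" select="$with"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Because `normalize-space()` is applied per text node here, the whitespace-only nodes between the `<P>` elements collapse to nothing, so the `continue` value should come out as a run of `<P>...</P>` blocks. Two caveats: inline markup inside a paragraph would lose the spaces next to it, and backslashes in the text are still not escaped, so this is a starting point rather than a complete JSON serializer.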