I receive a huge XML file containing a list of TV broadcasts, and I have to split it into smaller files that each contain all broadcasts for a single day. I managed to do that, but I have two problems: the XML declaration and the root node are written multiple times in each output file.
The structure of the XML is the following:
<?xml version="1.0" encoding="UTF-8"?>
<broadcasts>
<broadcast>
<id>4637445812</id>
<week>39</week>
<date>2009-09-22</date>
<time>21:45:00:00</time>
... (some more)
</broadcast>
... (long list of broadcast nodes)
</broadcasts>
My XSL looks like this:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:redirect="http://xml.apache.org/xalan/redirect"
extension-element-prefixes="redirect"
version="1.0">
<!-- mark the CDATA escaped tags -->
<xsl:output method="xml" cdata-section-elements="title text"
indent="yes" omit-xml-declaration="no" />
<xsl:template match="broadcasts">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="broadcast">
<!-- Build filename PRG_YYYYMMDD.xml -->
<!-- One declaration: XSLT 1.0 does not allow rebinding a variable within the same template -->
<xsl:variable name="filename" select="concat(substring(date,1,4),substring(date,6,2),substring(date,9,2),'.xml')"/>
<redirect:write select="concat('PRG_',$filename)" append="true">
<schedule>
<broadcast program="TEST">
<!-- format timestamp in specific way -->
<!-- DD/MM/YYYY HH:MM, built in a single concat() -->
<xsl:variable name="tmstmp" select="concat(substring(date,9,2),'/',substring(date,6,2),'/',substring(date,1,4),' ',substring(time,1,5))"/>
<timestamp><xsl:value-of select="$tmstmp"/></timestamp>
<xsl:copy-of select="title"/>
<text><xsl:value-of select="subtitle"/></text>
<!-- keep the four two-digit groups of vps and drop the separators -->
<xsl:variable name="newVps" select="concat(substring(vps,1,2),substring(vps,4,2),substring(vps,7,2),substring(vps,10,2))"/>
<vps><xsl:value-of select="$newVps"/></vps>
<nextday>false</nextday>
</broadcast>
</schedule>
</redirect:write>
</xsl:template>
</xsl:stylesheet>
My output XMLs are like this:
PRG_20090512.xml:
<?xml version="1.0" encoding="UTF-8"?>
<schedule>
<broadcast program="TEST">
<timestamp>01/03/2010 06:00</timestamp>
<title><![CDATA[TELEKOLLEG Geschichte ]]></title>
<text><![CDATA[Giganten in Fernost]]></text>
<vps>06000000</vps>
<nextday>false</nextday>
</broadcast>
</schedule>
<?xml version="1.0" encoding="UTF-8"?> <!-- don't want this -->
<schedule> <!-- don't want this -->
<broadcast program="TEST">
<timestamp>01/03/2010 06:30</timestamp>
<title><![CDATA[Die chemische Bindung]]></title>
<text/>
<vps>06300000</vps>
<nextday>false</nextday>
</broadcast>
</schedule>
<?xml version="1.0" encoding="UTF-8"?>
...and so on
I can set omit-xml-declaration="yes" in the output declaration, but then I get no XML declaration at all. I also tried to check whether the schedule tag is already present in the output, but I could not find a way to select nodes from the output.
This is what I tried:
<xsl:choose>
<xsl:when test="count(schedule) = 0"> <!-- schedule needed -->
<schedule>
<broadcast>
...
<xsl:otherwise> <!-- no schedule needed -->
<broadcast>
...
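A note on why the check above cannot work, plus a minimal sketch of one possible direction. XPath expressions in a stylesheet are evaluated against the input document, so inside the broadcast template count(schedule) looks for a schedule child of the input broadcast and is always 0. One alternative would be to group the broadcasts by date (Muenchian grouping) and call redirect:write only once per day, so that each file gets exactly one XML declaration and one schedule element. The key name by-date is hypothetical, and the sketch assumes that date is always YYYY-MM-DD and that a single input file contains all broadcasts for a day, so append is not needed. It has not been verified against Xalan here.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:redirect="http://xml.apache.org/xalan/redirect"
    extension-element-prefixes="redirect"
    version="1.0">

  <xsl:output method="xml" cdata-section-elements="title text"
      indent="yes" omit-xml-declaration="no"/>

  <!-- Hypothetical key name: indexes every broadcast by its date -->
  <xsl:key name="by-date" match="broadcast" use="date"/>

  <xsl:template match="broadcasts">
    <!-- Visit only the first broadcast of each distinct date -->
    <xsl:for-each select="broadcast[generate-id() = generate-id(key('by-date', date)[1])]">
      <!-- One write per day: PRG_YYYYMMDD.xml (assumes date is YYYY-MM-DD) -->
      <redirect:write select="concat('PRG_', translate(date, '-', ''), '.xml')">
        <schedule>
          <!-- Emit every broadcast that shares this date -->
          <xsl:apply-templates select="key('by-date', date)"/>
        </schedule>
      </redirect:write>
    </xsl:for-each>
  </xsl:template>

  <xsl:template match="broadcast">
    <broadcast program="TEST">
      <!-- DD/MM/YYYY HH:MM -->
      <timestamp>
        <xsl:value-of select="concat(substring(date,9,2),'/',substring(date,6,2),'/',substring(date,1,4),' ',substring(time,1,5))"/>
      </timestamp>
      <xsl:copy-of select="title"/>
      <text><xsl:value-of select="subtitle"/></text>
      <vps>
        <xsl:value-of select="concat(substring(vps,1,2),substring(vps,4,2),substring(vps,7,2),substring(vps,10,2))"/>
      </vps>
      <nextday>false</nextday>
    </broadcast>
  </xsl:template>
</xsl:stylesheet>
```

The design choice here is to move the file boundary out of the per-broadcast template: the broadcasts template decides when a file starts and ends, and the broadcast template only produces the inner element.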
Thanks for any help, as I don't know how to handle this. ;( YeTI