I have an XML file with a list of 92 tab-delimited text files:
<?xml version="1.0" encoding="UTF-8"?>
<dumpSet>
<dump filename="file_one.txt"/>
<dump filename="file_two.txt"/>
<dump filename="file_three.txt"/>
...
</dumpSet>
The first row in each file contains the field names for the subsequent rows. The following is just an example: the field names and their number will vary, and most files will have around 50 field names.
Title Translated Title Watch Video Interviewee Interviewer
Interview with Barack Obama Obama, Barack Walters, Barbara
Interview with Sarah Palin Palin, Sarah Couric, Katie Smith, John
...
Oxygen XML Editor has an Import function that can convert text files to XML, but--as far as I know--this cannot be done in a batch process with multiple files. So far, the batch processing part has not been a problem. I am using XSLT 2.0's unparsed-text() function to pull in the content from the files in the list. However, I am struggling to group the XML output correctly. Example of desired output:
<collection>
<record>
<title>Interview with Barack Obama</title>
<translatedtitle></translatedtitle>
<watchvideo></watchvideo>
<interviewee>Obama, Barack</interviewee>
<interviewer>Walters, Barbara</interviewer>
<videographer>Smith, John</videographer>
</record>
<record>
<title>Interview with Sarah Palin</title>
<translatedtitle></translatedtitle>
<watchvideo></watchvideo>
<interviewee>Palin, Sarah</interviewee>
<interviewer>Couric, Katie</interviewer>
<videographer>Smith, John</videographer>
</record>
...
</collection>
Right now, here is the kind of output I am getting:
<collection>
<record>
<title>title</title>
<value>Interview with Barack Obama</value>
<value>Interview with Sarah Palin</value>
<translatedtitle>translatedtitle</translatedtitle>
<value/>
<value/>
<watchvideo>watchvideo</watchvideo>
<value/>
<value/>
<interviewee>interviewee</interviewee>
<value>Obama, Barack</value>
<value>Palin, Sarah</value>
<interviewer>interviewer</interviewer>
<value>Walters, Barbara</value>
<value>Couric, Katie</value>
<videographer>videographer</videographer>
<value>Smith, John</value>
<value>Smith, John </value>
<value/>
<value/>
</record>
</collection>
That is, I'm not able to group the output by record. Here's the current code I'm working with, based on an example in Doug Tidwell's XSLT book:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="#all" version="2.0">
<xsl:param name="i" select="1"/>
<xsl:param name="increment" select="1"/>
<xsl:param name="operator" select="'&lt;='"/>
<xsl:param name="testVal" select="100"/>
<xsl:template match="/">
<collections>
<collection>
<xsl:for-each select="dumpSet/dump">
<!-- Pull in external tab-delimited files -->
<xsl:for-each select="unparsed-text(concat('../2013-04-26/',@filename),'UTF-8')">
<record>
<!-- Call recursive template to loop through elements. -->
<xsl:call-template name="for-loop">
<xsl:with-param name="i" select="$i"/>
<xsl:with-param name="increment" select="$increment"/>
<xsl:with-param name="operator" select="$operator"/>
<xsl:with-param name="testVal" select="$testVal"/>
</xsl:call-template>
</record>
</xsl:for-each>
</xsl:for-each>
</collection>
</collections>
</xsl:template>
<xsl:template name="for-loop">
<xsl:param name="i"/>
<xsl:param name="increment"/>
<xsl:param name="operator"/>
<xsl:param name="testVal"/>
<xsl:variable name="testPassed">
<xsl:choose>
<xsl:when test="$operator = '&lt;='">
<xsl:if test="$i &lt;= $testVal">
<xsl:text>true</xsl:text>
</xsl:if>
</xsl:when>
</xsl:choose>
</xsl:variable>
<xsl:if test="$testPassed = 'true'">
<!-- Separate the header from the tab-delimited file. -->
<xsl:for-each select="tokenize(.,'\r|\n')[1]">
<!-- Spit out the field names. -->
<xsl:for-each select="tokenize(.,'\t')[$i]">
<xsl:element name="{replace(lower-case(translate(.,'-.','')),' ','')}">
<xsl:value-of select="replace(lower-case(translate(.,'-.','')),' ','')"/>
</xsl:element>
</xsl:for-each>
</xsl:for-each>
<!-- For the following rows, loop through the field values. -->
<xsl:for-each select="tokenize(.,'\r|\n')[position()>1]">
<xsl:for-each select="tokenize(.,'\t')[$i]">
<value>
<xsl:value-of select="."/>
</value>
</xsl:for-each>
</xsl:for-each>
<!-- Call the template to increment. -->
<xsl:call-template name="for-loop">
<xsl:with-param name="i" select="$i + $increment"/>
<xsl:with-param name="increment" select="$increment"/>
<xsl:with-param name="operator" select="$operator"/>
<xsl:with-param name="testVal" select="$testVal"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
How should I change this to group the output by record?
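For reference, here is a rough, untested sketch of the direction I suspect is needed. The current code loops over column positions on the outside and rows on the inside, which would explain why the values end up grouped by field. The sketch inverts that: it tokenizes the header once, then emits one record per data row and looks up each value by the position of its field name. The variable names ($lines, $names, $values, $pos) are my own, and it assumes that every normalized header is a non-empty, valid XML element name and that fields contain no embedded tabs or line breaks.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all" version="2.0">

  <xsl:output indent="yes"/>

  <xsl:template match="/">
    <collections>
      <collection>
        <xsl:for-each select="dumpSet/dump">
          <!-- Read one tab-delimited file and split it into lines
               (handles both \r\n and \n line endings). -->
          <xsl:variable name="lines" as="xs:string*"
              select="tokenize(unparsed-text(concat('../2013-04-26/', @filename), 'UTF-8'), '\r?\n')"/>
          <!-- Field names from the first line, normalized the same way
               as in the original stylesheet. -->
          <xsl:variable name="names" as="xs:string*"
              select="for $h in tokenize($lines[1], '\t')
                      return replace(lower-case(translate($h, '-.', '')), ' ', '')"/>
          <!-- Outer loop: one record per non-blank data row. -->
          <xsl:for-each select="$lines[position() gt 1][normalize-space()]">
            <xsl:variable name="values" as="xs:string*" select="tokenize(., '\t')"/>
            <record>
              <!-- Inner loop: one element per field name; the value is the
                   token at the same position in the current row. A missing
                   trailing value yields an empty element. -->
              <xsl:for-each select="$names">
                <xsl:variable name="pos" select="position()"/>
                <xsl:element name="{.}">
                  <xsl:value-of select="$values[$pos]"/>
                </xsl:element>
              </xsl:for-each>
            </record>
          </xsl:for-each>
        </xsl:for-each>
      </collection>
    </collections>
  </xsl:template>
</xsl:stylesheet>
```

If this is on the right track, the recursive for-loop template and its four parameters would no longer be needed, since the number of fields comes from the header row itself rather than from a fixed upper bound of 100.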