I have a deeply nested XML file and want to transform it to a flat CSV. To do that, I have to go to the deepest path (here: availability) and reuse values from the parents (here: category, element).
Example:
<market>
<category>
<type>kids</type>
</category>
<items>
<element>
<name>police car</name>
<type>toy</type>
<color>blue</color>
<availability>
<stock cat="A" in="5"/>
<stock cat="B" in="2"/>
</availability>
</element>
<element>
...
</element>
</items>
</market>
Desired csv output:
kids,police car, toy, blue, A, 5
kids,police car, toy, blue, B, 2
Note how the kids value is copied to each resulting element line, and how each element is copied to each availability view.
I started as follows, but of course this does not give the desired result, because I don't know how to combine the values from the different levels:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" omit-xml-declaration="yes" indent="no"/>
<xsl:template match="market">
<xsl:for-each select="//category">
<xsl:value-of select="type"/>
</xsl:for-each>
<xsl:for-each select="//items//element">
<xsl:value-of select="name"/>
<xsl:value-of select="type"/>
<xsl:value-of select="color"/>
</xsl:for-each>
<xsl:for-each select="//items//element//availability//stock">
<xsl:value-of select="//@cat"/>
<xsl:value-of select="//@in"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
The following might work, but I don't know if that's the way to go:
<xsl:template match="market">
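<xsl:variable name="ctype">
<!-- the context node is already market, so the path starts at category -->
<xsl:value-of select="category/type"/>
</xsl:variable>
<xsl:for-each select="//items//element">
<xsl:variable name="elem">
<!-- emit the category type first so that $ctype is actually used -->
<xsl:value-of select="$ctype"/>
<xsl:text>;</xsl:text>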
<xsl:value-of select="copy-of(.)!(.//name, .//type, .//color)" separator=";"/>
</xsl:variable>
<!-- nesting for-each -->
<xsl:for-each select="availability//stock">
<xsl:copy-of select="$elem"/>
<xsl:text>;</xsl:text>
<xsl:value-of select="copy-of(.)!(.//@cat, .//@in)" separator=";"/>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
I usually write a template matching those elements that map to a line and select the other values as needed through XPath navigation:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:apply-templates select="//availability/stock"/>
</xsl:template>
<xsl:template match="stock">
<xsl:value-of select="ancestor::market/category/type, ancestor::element!(name, type, color), @cat, @in" separator=", "/>
<xsl:text>&#10;</xsl:text>
</xsl:template>
</xsl:stylesheet>
That allows for a compact and clear notation of which values compose a line in the CSV file.
https://xsltfiddle.liberty-development.net/jyH9rM9
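If only an XSLT 1.0 processor is available, the same idea (match the element that maps to one line, then navigate upwards for the shared values) can be sketched without the separator attribute and the ! operator. This is a minimal sketch, assuming the input has the shape shown in the question; the variable name elem is only illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Start at the deepest level: one CSV line per stock element. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//availability/stock"/>
  </xsl:template>

  <xsl:template match="stock">
    <!-- Assumption: the nearest enclosing element holds name, type and color. -->
    <xsl:variable name="elem" select="ancestor::element[1]"/>
    <xsl:value-of select="ancestor::market/category/type"/>
    <xsl:text>, </xsl:text>
    <xsl:value-of select="$elem/name"/>
    <xsl:text>, </xsl:text>
    <xsl:value-of select="$elem/type"/>
    <xsl:text>, </xsl:text>
    <xsl:value-of select="$elem/color"/>
    <xsl:text>, </xsl:text>
    <xsl:value-of select="@cat"/>
    <xsl:text>, </xsl:text>
    <xsl:value-of select="@in"/>
    <!-- End each record with a line feed. -->
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

It is more verbose than the XSLT 3 version, because in XSLT 1.0 xsl:value-of only outputs the first node of a selection, so every field needs its own instruction.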
Static header information such as category/type could also be stored in a global variable:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0">
<xsl:output method="text"/>
<xsl:variable name="category-type" select="market/category/type"/>
<xsl:template match="/">
<xsl:apply-templates select="//availability/stock"/>
</xsl:template>
<xsl:template match="stock">
<xsl:value-of select="$category-type, ancestor::element!(name, type, color), @cat, @in" separator=", "/>
<xsl:text>&#10;</xsl:text>
</xsl:template>
</xsl:stylesheet>
https://xsltfiddle.liberty-development.net/jyH9rM9/1
A third way in XSLT 3 is to capture values in a declarative way using accumulators:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
version="3.0">
<xsl:mode use-accumulators="#all"/>
<xsl:output method="text"/>
<xsl:accumulator name="cat-type" as="xs:string?" initial-value="()">
<xsl:accumulator-rule match="market/category/type" select="string()"/>
</xsl:accumulator>
<xsl:accumulator name="element-name" as="xs:string?" initial-value="()">
<xsl:accumulator-rule match="items/element" select="()"/>
<xsl:accumulator-rule match="items/element/name" select="string()"/>
</xsl:accumulator>
<xsl:accumulator name="element-type" as="xs:string?" initial-value="()">
<xsl:accumulator-rule match="items/element" select="()"/>
<xsl:accumulator-rule match="items/element/type" select="string()"/>
</xsl:accumulator>
<xsl:accumulator name="element-color" as="xs:string?" initial-value="()">
<xsl:accumulator-rule match="items/element" select="()"/>
<xsl:accumulator-rule match="items/element/color" select="string()"/>
</xsl:accumulator>
<xsl:template match="/">
<xsl:apply-templates select="//availability/stock"/>
</xsl:template>
<xsl:template match="stock">
<xsl:value-of select="accumulator-before('cat-type'), accumulator-before('element-name'), accumulator-before('element-type'), accumulator-before('element-color'), @cat, @in" separator=", "/>
<xsl:text>&#10;</xsl:text>
</xsl:template>
</xsl:stylesheet>
https://xsltfiddle.liberty-development.net/jyH9rM9/2
That has the advantage that you could adapt it to streaming with some changes, so that you could transform huge inputs with Saxon 9.8 EE without storing the complete XML input tree in memory:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
version="3.0">
<xsl:mode use-accumulators="#all" streamable="yes"/>
<xsl:output method="text"/>
<xsl:accumulator name="cat-type" as="xs:string?" initial-value="()" streamable="yes">
<xsl:accumulator-rule match="market/category/type/text()" select="string()"/>
</xsl:accumulator>
<xsl:accumulator name="element-name" as="xs:string?" initial-value="()" streamable="yes">
<xsl:accumulator-rule match="items/element" select="()"/>
<xsl:accumulator-rule match="items/element/name/text()" select="string()"/>
</xsl:accumulator>
<xsl:accumulator name="element-type" as="xs:string?" initial-value="()" streamable="yes">
<xsl:accumulator-rule match="items/element" select="()"/>
<xsl:accumulator-rule match="items/element/type/text()" select="string()"/>
</xsl:accumulator>
<xsl:accumulator name="element-color" as="xs:string?" initial-value="()" streamable="yes">
<xsl:accumulator-rule match="items/element" select="()"/>
<xsl:accumulator-rule match="items/element/color/text()" select="string()"/>
</xsl:accumulator>
<xsl:template match="/">
<xsl:apply-templates select="outermost(//availability/stock)"/>
</xsl:template>
<xsl:template match="stock">
<xsl:value-of select="accumulator-before('cat-type'), accumulator-before('element-name'), accumulator-before('element-type'), accumulator-before('element-color'), @cat, @in" separator=", "/>
<xsl:text>&#10;</xsl:text>
</xsl:template>
</xsl:stylesheet>
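As a variation on the non-streaming accumulator stylesheet further up, the three per-element accumulators could probably be folded into a single one that holds a map. This is only a sketch under that assumption: the accumulator name element-data and the variable data are made up for illustration, and the rule calls string() on element nodes, so it is not meant for the streaming case.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    version="3.0">

  <xsl:mode use-accumulators="#all"/>
  <xsl:output method="text"/>

  <xsl:accumulator name="cat-type" as="xs:string?" initial-value="()">
    <xsl:accumulator-rule match="market/category/type" select="string()"/>
  </xsl:accumulator>

  <!-- Hypothetical accumulator: one map per element, keyed by the child's local name. -->
  <xsl:accumulator name="element-data" as="map(xs:string, xs:string)" initial-value="map{}">
    <!-- Reset at the start of each element so values do not leak into the next one. -->
    <xsl:accumulator-rule match="items/element" select="map{}"/>
    <!-- $value is the accumulator's current value; add the child's string value to it. -->
    <xsl:accumulator-rule match="items/element/name | items/element/type | items/element/color"
        select="map:put($value, local-name(), string())"/>
  </xsl:accumulator>

  <xsl:template match="/">
    <xsl:apply-templates select="//availability/stock"/>
  </xsl:template>

  <xsl:template match="stock">
    <xsl:variable name="data" select="accumulator-before('element-data')"/>
    <xsl:value-of select="accumulator-before('cat-type'), map:get($data, 'name'), map:get($data, 'type'), map:get($data, 'color'), @cat, @in" separator=", "/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The trade-off is fewer declarations against a slightly less obvious lookup in the stock template; with only three fields, separate accumulators may well be the clearer choice.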
Try this:
<xsl:strip-space elements="*"/>
<xsl:template match="market">
<xsl:for-each select=".//stock">
<xsl:value-of select="ancestor::market/category/type
|ancestor::element[1]/name
|ancestor::element[1]/type
|ancestor::element[1]/color
|@cat
|@in" separator=", "/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
Output
kids, police car, toy, blue, A, 5
kids, police car, toy, blue, B, 2