First, let me say that I have enjoyed reading dozens of tips about merging multiple XML files. I've also enjoyed implementing a good number of them. But I still haven't achieved my goal.
I don't want to simply merge XML files so that one is repeated after another in the resulting XML file. I have groups with repeating elements that need to each be merged:
<SAN>
  <EQLHosts>
    <WindowsHosts>
      <WindowsHost>
        more data and structures down here...
      </WindowsHost>
    </WindowsHosts>
    <LinuxHosts>
      <LinuxHost>
        ...and here...
      </LinuxHost>
    </LinuxHosts>
  </EQLHosts>
</SAN>
Each of the individual XML files might have Windows and/or Linux hosts. So if XML file 1 has data for Windows hosts A, B and C, and XML file 2 has data for Windows hosts D, E and F, the resulting XML should look like:
<SAN>
  <EQLHosts>
    <WindowsHosts>
      <WindowsHost>
        <Name>A</Name>
      </WindowsHost>
      <WindowsHost>
        <Name>B</Name>
      </WindowsHost>
      <WindowsHost>
        <Name>C</Name>
      </WindowsHost>
      <WindowsHost>
        <Name>D</Name>
      </WindowsHost>
      <WindowsHost>
        <Name>E</Name>
      </WindowsHost>
      <WindowsHost>
        <Name>F</Name>
      </WindowsHost>
    </WindowsHosts>
    <LinuxHosts>
      <LinuxHost/>
    </LinuxHosts>
  </EQLHosts>
</SAN>
I have used this XSLT, among others, to get this to work:
<?xml version="1.0"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:variable name="file1" select="document('CorralData1.xml')"/>
  <xsl:variable name="file2" select="document('CorralData2.xml')"/>
  <xsl:variable name="file3" select="document('CorralData3.xml')"/>
  <xsl:template match="/">
    <SAN>
      <xsl:copy-of select="/SAN/*"/>
      <xsl:copy-of select="$file1/SAN/*"/>
      <xsl:copy-of select="$file2/SAN/*"/>
      <xsl:copy-of select="$file3/SAN/*"/>
    </SAN>
  </xsl:template>
</xsl:stylesheet>
This stylesheet produces a combined XML file, with all data all the way down the tree included correctly, but with multiple instances of WindowsHosts, which is not what I want.
Is there a way to tell XSLT how to do this with a minimum of syntax, or do I need to add each element and sub-element specifically in the XSLT file?
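For the stylesheet above, one possible approach (a minimal sketch, not a tested solution) is to write each wrapper element once and copy only the repeating children from every document. It assumes the hosts sit at SAN/EQLHosts/WindowsHosts/WindowsHost and SAN/EQLHosts/LinuxHosts/LinuxHost, as in the sample; the variable name "all" is made up for illustration. Only the wrapper levels are spelled out, and everything below each host is copied as-is.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- "all" is a hypothetical name: the union of the primary input
       document and the three extra files named in the question -->
  <xsl:variable name="all"
                select="/ | document('CorralData1.xml')
                          | document('CorralData2.xml')
                          | document('CorralData3.xml')"/>

  <xsl:template match="/">
    <SAN>
      <EQLHosts>
        <!-- One wrapper per host type; the hosts of every file go inside it -->
        <WindowsHosts>
          <xsl:copy-of select="$all/SAN/EQLHosts/WindowsHosts/WindowsHost"/>
        </WindowsHosts>
        <LinuxHosts>
          <xsl:copy-of select="$all/SAN/EQLHosts/LinuxHosts/LinuxHost"/>
        </LinuxHosts>
      </EQLHosts>
    </SAN>
  </xsl:template>
</xsl:stylesheet>
```

The relative order of nodes that come from different documents is implementation-dependent, so the hosts of one file may appear before or after those of another.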
I should have checked. But I went ahead and used collection() and got a solution to work perfectly using the Saxon HE XSLT processor.
But I'm running in an InfoPath environment, and there's only an XSLT 1.0 processor. Does anyone have a recommendation for replacing the collection() command in an XSLT 1.0 environment? Can I go back to using document() in some way?
So I now have this file...
<?xml version="1.0" encoding="windows-1252"?>
<files>
<file name="CorralData1.xml"/>
<file name="CorralData2.xml"/>
</files>
...which I use with a stylesheet containing...
<xsl:variable name="windowsHosts" select="/SAN/WindowsHosts/WindowsHost"/>
<xsl:variable name="vmwareHosts" select="/SAN/VMwareHosts/VMwareHost"/>
<xsl:variable name="linuxHosts" select="/SAN/LinuxHosts/LinuxHost"/>
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>
<xsl:template match="/">
  <xsl:for-each select="/files/file">
    <xsl:apply-templates select="document(@name)/SAN"/>
  </xsl:for-each>
  <SAN>
    <EQLHosts>
      <WindowsHosts>
        <xsl:for-each select="$windowsHosts">
          <xsl:copy-of select="."/>
        </xsl:for-each>
      </WindowsHosts>
      <VMwareHosts>
        <xsl:for-each select="$vmwareHosts">
          <xsl:copy-of select="."/>
        </xsl:for-each>
      </VMwareHosts>
      <LinuxHosts>
        <xsl:for-each select="$linuxHosts">
          <xsl:copy-of select="."/>
        </xsl:for-each>
      </LinuxHosts>
    </EQLHosts>
  </SAN>
</xsl:template>
...but this gets me multiple /SAN roots. I'm close but something's still a little off.
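Two things in the stylesheet above appear to cause this. The xsl:for-each sends every loaded SAN element through the identity template, so each one is copied whole to the output before the literal SAN is written. The three variables are evaluated against the file list, which has no /SAN element, so they select nothing. A minimal XSLT 1.0 sketch that avoids both is shown here. It assumes the hosts sit under SAN/EQLHosts as in the sample XML, and that the processor is allowed to load the listed files with document(); the variable name "docs" is made up for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- "docs" is a hypothetical name. Given a node-set, document() loads
       one document per node (each @name is resolved relative to the file
       list) and returns the union of their root nodes. -->
  <xsl:variable name="docs" select="document(/files/file/@name)"/>

  <xsl:template match="/">
    <!-- A single SAN root; nothing is written before it -->
    <SAN>
      <EQLHosts>
        <WindowsHosts>
          <xsl:copy-of select="$docs/SAN/EQLHosts/WindowsHosts/WindowsHost"/>
        </WindowsHosts>
        <VMwareHosts>
          <xsl:copy-of select="$docs/SAN/EQLHosts/VMwareHosts/VMwareHost"/>
        </VMwareHosts>
        <LinuxHosts>
          <xsl:copy-of select="$docs/SAN/EQLHosts/LinuxHosts/LinuxHost"/>
        </LinuxHosts>
      </EQLHosts>
    </SAN>
  </xsl:template>
</xsl:stylesheet>
```

The identity template is no longer needed, because xsl:copy-of already copies each host together with everything beneath it.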
I used two XSLT files for this operation. The first simply appends all the files:
and the second merges the data by group:
What I would do is use distinct-values() to get each unique host name. You could also use collection() to make it a little easier. (Usage may differ depending on the implementation. I used Saxon 9.4.)

Example...
Input files in the directory "input_dir"...
CorralData1.xml
CorralData2.xml (Windows-A and Windows-B are repeated)
CorralData3.xml (Windows-A and Windows-B are repeated)
XSLT 2.0
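A minimal sketch of how distinct-values() and collection() might be combined. It assumes each host is identified by a Name child (as in the question's expected output), that hosts sit under SAN/EQLHosts, and that the processor accepts Saxon's directory form of the collection URI with a ?select= pattern. The variable names "docs" and "winHosts" are made up for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Assumption: Saxon-specific collection URI (directory plus ?select= pattern) -->
  <xsl:variable name="docs" select="collection('input_dir?select=CorralData*.xml')"/>
  <xsl:variable name="winHosts" select="$docs/SAN/EQLHosts/WindowsHosts/WindowsHost"/>

  <xsl:template match="/">
    <SAN>
      <EQLHosts>
        <WindowsHosts>
          <!-- One pass per unique host name; keep the first host with that name -->
          <xsl:for-each select="distinct-values($winHosts/Name)">
            <xsl:copy-of select="$winHosts[Name = current()][1]"/>
          </xsl:for-each>
        </WindowsHosts>
      </EQLHosts>
    </SAN>
  </xsl:template>
</xsl:stylesheet>
```

The same pattern would be repeated for LinuxHosts and any other host type. Which duplicate counts as the "first" depends on the order in which the processor returns the collection's documents.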
Output