How to use XSLT to get only certain rows and certa

Posted 2019-02-15 05:14

Question:

How can I use XSLT to convert this XML file:

<file>
    <row>
        <cell></cell>
        <cell>(info...)</cell>
        <cell></cell>
    </row>
    <row>
        <cell>first name</cell>
        <cell>last name</cell>
        <cell>age</cell>
    </row>
    <row>
        <cell>Jim</cell>
        <cell>Smith</cell>
        <cell>34</cell>
    </row>
    <row>
        <cell>Roy</cell>
        <cell>Rogers</cell>
        <cell>22</cell>
    </row>
    <row>
        <cell>Hank</cell>
        <cell>Grandier</cell>
        <cell>23</cell>
    </row>
    <row>
        <cell>(info...)</cell>
        <cell></cell>
        <cell>(info...)</cell>
    </row>

    <row>
        <cell>Sally</cell>
        <cell>Cloud</cell>
        <cell>26</cell>
    </row>

    <row>
        <cell>John</cell>
        <cell>Randall</cell>
        <cell>44</cell>
    </row>  

</file>

to this XML file:

<file>
    <row>
        <cell>Jim</cell>
        <cell>34</cell>
    </row>
    <row>
        <cell>Roy</cell>
        <cell>22</cell>
    </row>
    <row>
        <cell>Sally</cell>
        <cell>26</cell>
    </row>
    <row>
        <cell>John</cell>
        <cell>44</cell>
    </row>  
</file>

Basically the rules are:

  • only the first and third columns (first name and age)
  • only rows within certain ranges, e.g. rows 3-5 and rows 7-8 in the simple example above, so I assume I would need some kind of mapping table holding this information

Addendum

Here is my solution using MarcoS's tip about params:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

    <xsl:output method="xml" indent="yes" omit-xml-declaration="no" />

    <xsl:param name="range-1-begin"  select="3"/>
    <xsl:param name="range-1-end"  select="4"/>

    <xsl:param name="range-2-begin"  select="7"/>
    <xsl:param name="range-2-end"  select="8"/>

    <xsl:template match="file">
        <file>
            <xsl:for-each select="row">
                <xsl:if test="(position() &gt;= $range-1-begin and position() &lt;= $range-1-end)
                    or (position() &gt;= $range-2-begin and position() &lt;= $range-2-end)">
                    <row>
                        <xsl:for-each select="cell">
                            <xsl:if test="position() = 1 or 
                                position() = 3">
                                <cell>
                                    <xsl:value-of select="."/>
                                </cell>
                            </xsl:if>
                        </xsl:for-each>
                    </row>
                </xsl:if>
            </xsl:for-each>
        </file>
    </xsl:template>

</xsl:stylesheet>

Answer 1:

This is a possible solution (maybe not very elegant):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

    <xsl:output method="xml" indent="yes" omit-xml-declaration="no" />

    <xsl:template match="file">
        <file>
            <xsl:for-each select="row">
                <xsl:if test="(position() >= 3 and position() &lt; 5)
                    or (position() >= 7 and position() &lt;= 8)">
                    <row>
                        <xsl:for-each select="cell">
                            <xsl:if test="position() = 1 or 
                                position() = 3">
                                <cell>
                                    <xsl:value-of select="."/>
                                </cell>
                            </xsl:if>
                        </xsl:for-each>
                    </row>
                </xsl:if>
            </xsl:for-each>
        </file>
    </xsl:template>

</xsl:stylesheet>

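Essentially, you can use the XPath position() function to select the ranges of row and cell elements that you want. Inside xsl:for-each, position() is the index of the current node within the selected node list, so it counts only row (or cell) elements.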



Answer 2:

Here's a piece of XSLT that does what you describe.

<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" encoding="iso-8859-1" indent="yes" omit-xml-declaration="no"/>
  <xsl:template match="file">
    <xsl:copy>
      <xsl:apply-templates select="row[(position()&gt;=3 and position()&lt;=5) or (position()&gt;=7 and position()&lt;=8)]" />
    </xsl:copy>
  </xsl:template>
  <xsl:template match="row">
    <xsl:copy>
      <xsl:apply-templates select="cell[position()=1 or position()=3]" />
    </xsl:copy>
  </xsl:template>
  <xsl:template match="cell">
    <xsl:copy-of select="." />
  </xsl:template>
</xsl:stylesheet>

To select the rows you want in your output, I would begin by marking them with an attribute that serves as a filter. The code invoking the XSLT could add it using DOM methods, just after loading the XML document and before applying the transformation. E.g., to keep Jim Smith but discard Roy Rogers:

<row keep="-1">
    <cell>Jim</cell>
    <cell>Smith</cell>
    <cell>34</cell>
</row>
<row>
    <cell>Roy</cell>
    <cell>Rogers</cell>
    <cell>22</cell>
</row>

Then change the apply-templates line in the file template to:

<xsl:apply-templates select="row[@keep=-1]" />
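
Put together, the attribute-based variant might look like the sketch below. It assumes the calling code has already added keep="-1" to every row that should survive; the attribute name and value are simply the ones used in the example above, not anything XSLT requires.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>

  <!-- Copy the root element and descend only into rows flagged keep="-1" -->
  <xsl:template match="file">
    <xsl:copy>
      <xsl:apply-templates select="row[@keep=-1]"/>
    </xsl:copy>
  </xsl:template>

  <!-- xsl:copy copies the row element itself but not its attributes,
       so the helper attribute does not appear in the output -->
  <xsl:template match="row">
    <xsl:copy>
      <xsl:copy-of select="cell[position()=1 or position()=3]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

With this approach the stylesheet no longer needs to know any row numbers; the decision about which rows to keep lives entirely in the code that marks them.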


Answer 3:

This is probably the simplest and shortest solution. It is based on using and overriding the identity rule, with the row ranges and cell positions kept in a parameter block inside the stylesheet:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:my="my:my">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <my:params>
  <row-range start="3" end="5"/>
  <row-range start="7" end="8"/>
  <cell-positions>
   <pos>1</pos>
   <pos>3</pos>
  </cell-positions>
 </my:params>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match=
  "row[(not(position() >= document('')/*/my:params/row-range[1]/@start)
     or
       position() > document('')/*/my:params/row-range[1]/@end
       )
     and
      (not(position() >= document('')/*/my:params/row-range[2]/@start)
     or
       position() > document('')/*/my:params/row-range[2]/@end
       )
      ]
  "/>

 <xsl:template match=
  "cell[not(position()=document('')/*/my:params/cell-positions/*)]"/>
</xsl:stylesheet>

When this transformation is applied on the provided XML document:

<file>
    <row>
        <cell></cell>
        <cell>(info...)</cell>
        <cell></cell>
    </row>
    <row>
        <cell>first name</cell>
        <cell>last name</cell>
        <cell>age</cell>
    </row>
    <row>
        <cell>Jim</cell>
        <cell>Smith</cell>
        <cell>34</cell>
    </row>
    <row>
        <cell>Roy</cell>
        <cell>Rogers</cell>
        <cell>22</cell>
    </row>
    <row>
        <cell>Hank</cell>
        <cell>Grandier</cell>
        <cell>23</cell>
    </row>
    <row>
        <cell>(info...)</cell>
        <cell></cell>
        <cell>(info...)</cell>
    </row>
    <row>
        <cell>Sally</cell>
        <cell>Cloud</cell>
        <cell>26</cell>
    </row>
    <row>
        <cell>John</cell>
        <cell>Randall</cell>
        <cell>44</cell>
    </row>
</file>

the result that matches the stated rules (rows 3-5 and 7-8, cells 1 and 3) is produced:

<file>
   <row>
      <cell>Jim</cell>
      <cell>34</cell>
   </row>
   <row>
      <cell>Roy</cell>
      <cell>22</cell>
   </row>
   <row>
      <cell>Hank</cell>
      <cell>23</cell>
   </row>
   <row>
      <cell>Sally</cell>
      <cell>26</cell>
   </row>
   <row>
      <cell>John</cell>
      <cell>44</cell>
   </row>
</file>
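
The match pattern above refers to row-range[1] and row-range[2] explicitly, so adding a third range means editing the pattern. A possible variation, sketched here under the assumption that the same my:params block is used, tests each row against every row-range element instead. It gives up the identity-rule override in favour of an explicit xsl:for-each, because XSLT 1.0 does not allow variable references in match patterns. The variable names params and r are illustrative only.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:my="my:my">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <my:params>
  <row-range start="3" end="5"/>
  <row-range start="7" end="8"/>
  <cell-positions>
   <pos>1</pos>
   <pos>3</pos>
  </cell-positions>
 </my:params>

 <!-- The parameter block, read back from the stylesheet document itself -->
 <xsl:variable name="params" select="document('')/*/my:params"/>

 <xsl:template match="file">
  <xsl:copy>
   <xsl:for-each select="row">
    <!-- Remember the row index; inside the predicate below,
         position() would refer to the row-range elements instead -->
    <xsl:variable name="r" select="position()"/>
    <!-- Keep the row if at least one range contains its index -->
    <xsl:if test="$params/row-range[@start &lt;= $r and @end >= $r]">
     <xsl:copy>
      <xsl:for-each select="cell">
       <!-- A number compared with a node-set is true if any node matches -->
       <xsl:if test="position() = $params/cell-positions/pos">
        <xsl:copy-of select="."/>
       </xsl:if>
      </xsl:for-each>
     </xsl:copy>
    </xsl:if>
   </xsl:for-each>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>
```

With the two ranges shown, this should select the same rows and cells as the stylesheet above; further row-range or pos elements can then be added without touching the template. Like the original, it relies on document('') resolving to the stylesheet, which generally requires the stylesheet to be loaded from a file or URI.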


Tags: xml xslt