xslt to skip already “visited” nodes

Posted 2019-02-13 02:39

Not sure if this is possible without several passes, but I'll ask anyway (my XSL is a little rusty).

I have an XML document, which contains nodes as follows:

<structures>
 <structure id="STRUCT_A">
   <field idref="STRUCT_B" name="b"/>
   <field idref="STRUCT_C" name="c"/>
   <field idref="FIELD_D" name="d"/>
 </structure>

 <structure id="STRUCT_B">
   <field idref="STRUCT_C" name="c"/>
   <field idref="FIELD_E" name="e"/>
 </structure>

 <structure id="STRUCT_C">
   <field idref="FIELD_E" name="e"/>
   <field idref="FIELD_F" name="f"/>
   <field idref="FIELD_G" name="g"/>
 </structure>
</structures>

(The real file contains lots of structure tags with interdependencies, none of which are circular!)

What I want to do is to generate some text (in this case C++ structs), and the obvious requirement is the order of the structs, so my ideal output would be

struct STRUCT_C
{
  FIELD_E e;
  FIELD_F f;
  FIELD_G g;
};

struct STRUCT_B
{
  STRUCT_C c;
  FIELD_E e;
};

struct STRUCT_A
{
  STRUCT_B b;
  STRUCT_C c;
  FIELD_D d;
};

I know I could use forward declarations and that would mean that the order doesn't matter, however the problem is that there will be "processing" code inline in the structures, and they would require the real definition to be present.

So far I can detect whether a structure has any dependencies, with the following bit of XSL:

<xsl:for-each select="descendant::*/@idref">
  <xsl:variable name="name" select="."/>
  <xsl:apply-templates select="//structure[@id = $name]" mode="struct.dep"/> 
</xsl:for-each>

(this happens inside a <xsl:template match="structure">)

Now, theoretically, I could then follow this dependency "chain" and generate the structs for each entry first, then the one that I am currently at. However, as you can imagine, this generates lots of copies of the same structure, which is a pain.

Is there any way to avoid the copies? Basically, once a structure has been visited, if we visit it again, don't bother outputting the code for it. I don't need the full XSLT to do this (unless it's trivial!), just any ideas on approaches.

If there isn't, I could in theory wrap the struct with a #ifdef/#define/#endif guard so that the compiler only uses the first definition, however this is REALLY NASTY! :(

(NOTES: xslt 1.0, xsltproc on linux: Using libxml 20623, libxslt 10115 and libexslt 812)

Tags: xslt
3 answers
Deceive 欺骗
Reply #2 · 2019-02-13 03:27

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:variable name="vLeafs" select="/*/structure[not(field/@idref = /*/structure/@id)]"/>

 <xsl:template match="/*">
  <xsl:apply-templates select="$vLeafs[1]">
   <xsl:with-param name="pVisited" select="'|'"/>
  </xsl:apply-templates>

 </xsl:template>

 <xsl:template match="structure">
   <xsl:param name="pVisited"/>

struct <xsl:value-of select="@id"/>
{<xsl:text/>
  <xsl:apply-templates/>
};
  <xsl:variable name="vnewVisited"
       select="concat($pVisited, @id, '|')"/>
  <xsl:apply-templates select=
  "../structure[not(contains($vnewVisited, concat('|', @id, '|')))
              and
                not(field/@idref
                           [not(contains($vnewVisited, concat('|', ., '|')) )
                          and
                           . = ../../../structure/@id
                           ]
                   )
               ] [1]
  ">
   <xsl:with-param name="pVisited" select="$vnewVisited"/>
  </xsl:apply-templates>
 </xsl:template>

 <xsl:template match="field">
  <xsl:value-of select="concat('&#xA;   ', @idref, ' ', @name, ';')"/>
 </xsl:template>
</xsl:stylesheet>

when applied on the provided XML document:

<structures>
 <structure id="STRUCT_A">
   <field idref="STRUCT_B" name="b"/>
   <field idref="STRUCT_C" name="c"/>
   <field idref="FIELD_D" name="d"/>
 </structure>

 <structure id="STRUCT_B">
   <field idref="STRUCT_C" name="c"/>
   <field idref="FIELD_E" name="e"/>
 </structure>

 <structure id="STRUCT_C">
   <field idref="FIELD_E" name="e"/>
   <field idref="FIELD_F" name="f"/>
   <field idref="FIELD_G" name="g"/>
 </structure>
</structures>

produces the wanted, correct result:

struct STRUCT_C
{
   FIELD_E e;
   FIELD_F f;
   FIELD_G g;
};


struct STRUCT_B
{
   STRUCT_C c;
   FIELD_E e;
};


struct STRUCT_A
{
   STRUCT_B b;
   STRUCT_C c;
   FIELD_D d;
};

Explanation: structure elements are processed strictly one at a time. At each step we process the first structure element whose id is not yet registered in the pVisited parameter and whose field/@idref values that refer to existing structure elements are all already registered in pVisited.
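The selection rule in the explanation above can also be sketched outside XSLT. The following Python is only an illustration of that rule, not a replacement for the stylesheet; the name `emission_order` and the `SAMPLE` constant are made up for this sketch, and it assumes (as the question states) that there are no circular dependencies.

```python
import xml.etree.ElementTree as ET

# The sample document from the question.
SAMPLE = """<structures>
 <structure id="STRUCT_A">
   <field idref="STRUCT_B" name="b"/>
   <field idref="STRUCT_C" name="c"/>
   <field idref="FIELD_D" name="d"/>
 </structure>
 <structure id="STRUCT_B">
   <field idref="STRUCT_C" name="c"/>
   <field idref="FIELD_E" name="e"/>
 </structure>
 <structure id="STRUCT_C">
   <field idref="FIELD_E" name="e"/>
   <field idref="FIELD_F" name="f"/>
   <field idref="FIELD_G" name="g"/>
 </structure>
</structures>"""


def emission_order(xml_text):
    """Return structure ids in an order where dependencies come first."""
    structures = ET.fromstring(xml_text).findall("structure")
    known_ids = {s.get("id") for s in structures}
    visited = "|"  # mirrors pVisited: every id is wrapped in '|' delimiters
    order = []
    while True:
        ready = None
        for s in structures:  # document order, like the [1] predicate
            if f"|{s.get('id')}|" in visited:
                continue  # already emitted
            # Blocked if some field refers to a known structure not yet visited.
            blocked = any(
                f.get("idref") in known_ids
                and f"|{f.get('idref')}|" not in visited
                for f in s.findall("field")
            )
            if not blocked:
                ready = s
                break
        if ready is None:
            return order  # nothing left that can be emitted
        order.append(ready.get("id"))
        visited += ready.get("id") + "|"


print(emission_order(SAMPLE))  # ['STRUCT_C', 'STRUCT_B', 'STRUCT_A']
```

The delimited string stands in for a set because XSLT 1.0 parameters cannot easily accumulate a node-set across recursive calls; in Python a plain `set` would be the more natural choice.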

我只想做你的唯一
Reply #3 · 2019-02-13 03:35

Ooh, this is more complicated than it looked at first. +1 for good question.

I think the best way to accomplish this in XSLT 1.0 would be to pass an accumulating parameter whenever you apply-templates to a structure. The parameter (call it "$visited-structures") is a space-delimited list of names of structures you've already processed.

Update: finally got this. :-)

In the template for processing a structure, check whether any of the structures this one depends on are missing from $visited-structures. If none are missing, generate the code for this structure, and recurse on the template selecting the next non-visited structure, appending the current structure name to the $visited-structures parameter. Otherwise, don't generate code for the structure but recurse on the template selecting the first dependency structure, passing the $visited-structures parameter unmodified.

Here is the code...

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   version="1.0">
   <xsl:key name="structuresById" match="/*/structure" use="@id" />

   <xsl:template match="structures">
      <xsl:apply-templates select="structure[1]" >
         <!-- a space-delimited list of id's of structures already processed, with space
            at beginning and end. Could contain duplicates. -->
         <xsl:with-param name="visited-structures" select="' '"/>
      </xsl:apply-templates>
   </xsl:template>

   <xsl:template match="structure">
      <xsl:param name="visited-structures" select="' '" />  
      <!-- Match ids together with their surrounding spaces, so that an id that is a
         substring of another id (e.g. STRUCT_A vs. STRUCT_AB) is not mistaken for visited. -->
      <xsl:variable name="dependencies" select="key('structuresById', field/@idref)
                     [not(contains($visited-structures, concat(' ', @id, ' ')))]"/>
      <xsl:choose>
         <xsl:when test="$dependencies">
            <xsl:apply-templates select="$dependencies[1]">
               <xsl:with-param name="visited-structures" select="$visited-structures"/>
            </xsl:apply-templates>            
         </xsl:when>
         <xsl:otherwise>
            <!-- Now generate code for this structure ... ... -->
struct <xsl:value-of select="@id"/>
{
<xsl:apply-templates select="field"/>};
            <xsl:variable name="new-visited" select="concat(' ', @id, $visited-structures)"/>
            <xsl:apply-templates select="/*/structure[not(contains($new-visited, concat(' ', @id, ' ')))][1]" >
               <xsl:with-param name="visited-structures" select="$new-visited"/>
            </xsl:apply-templates>
         </xsl:otherwise>
      </xsl:choose>      
   </xsl:template>

   <xsl:template match="field">
      <xsl:value-of select="concat('  ', @idref, ' ', @name, ';&#xa;')"/>      
   </xsl:template>

</xsl:stylesheet>

And the output:

<?xml version="1.0" encoding="utf-8"?>

struct STRUCT_C
{
  FIELD_E e;
  FIELD_F f;
  FIELD_G g;
};


struct STRUCT_B
{
  STRUCT_C c;
  FIELD_E e;
};


struct STRUCT_A
{
  STRUCT_B b;
  STRUCT_C c;
  FIELD_D d;
};
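
As a rough, language-neutral illustration of the accumulating-parameter idea in this answer, here is a small Python sketch. It is not the stylesheet itself: `order_by_descent` and `SAMPLE_DEPS` are hypothetical names, the dictionary stands in for the XML input, and the sketch assumes there are no circular dependencies (with a cycle it would loop forever). It matches ids together with their surrounding spaces, so that an id which is a prefix of another id is not mistaken for one that was already visited.

```python
# Hypothetical stand-in for the XML: structure id -> referenced ids,
# with keys listed in document order.
SAMPLE_DEPS = {
    "STRUCT_A": ["STRUCT_B", "STRUCT_C", "FIELD_D"],
    "STRUCT_B": ["STRUCT_C", "FIELD_E"],
    "STRUCT_C": ["FIELD_E", "FIELD_F", "FIELD_G"],
}


def order_by_descent(deps):
    """Emit ids so that every structure follows the structures it uses."""
    ids = list(deps)
    visited = " "  # space-delimited list with a space at both ends
    order = []
    current = ids[0] if ids else None
    while current is not None:
        # Unvisited structures that the current one depends on, in document order.
        pending = [
            i for i in ids
            if i in deps[current] and f" {i} " not in visited
        ]
        if pending:
            # Do not emit yet: descend into the first dependency,
            # leaving the visited list unchanged.
            current = pending[0]
        else:
            order.append(current)
            visited = f" {current}{visited}"
            # Continue with the first structure that has not been emitted.
            current = next((i for i in ids if f" {i} " not in visited), None)
    return order


print(order_by_descent(SAMPLE_DEPS))  # ['STRUCT_C', 'STRUCT_B', 'STRUCT_A']
```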
Melony?
Reply #4 · 2019-02-13 03:39

Just for fun, another approach (level by level), using keys:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:key name="kStructById" match="structure" use="@id"/>
    <xsl:key name="kStructByIdref" match="structure" use="field/@idref"/>
    <xsl:template match="/">
        <xsl:param name="pParents" select="/.."/>
        <xsl:param name="pChilds"
                   select="structures/structure[not(key('kStructById',
                                                        field/@idref))]"/>
        <xsl:variable name="vParents" select="$pParents|$pChilds"/>
        <xsl:variable name="vChilds"
                      select="key('kStructByIdref',$pChilds/@id)
                                 [count(key('kStructById',
                                             field/@idref) |
                                        $vParents) =
                                  count($vParents)]"/>
        <xsl:apply-templates select="$pChilds"/>
        <xsl:apply-templates select="current()[$vChilds]">
            <xsl:with-param name="pParents" select="$vParents"/>
            <xsl:with-param name="pChilds" select="$vChilds"/>
        </xsl:apply-templates>
    </xsl:template>
    <xsl:template match="structure">
        <xsl:value-of select="concat('struct ',@id,'&#xA;{&#xA;')"/>
        <xsl:apply-templates/>
        <xsl:text>};&#xA;</xsl:text>
    </xsl:template>
    <xsl:template match="field">
        <xsl:value-of select="concat('&#x9;',@idref,' ',@name,';&#xA;')"/>
    </xsl:template>
</xsl:stylesheet>

Output:

struct STRUCT_C
{
    FIELD_E e;
    FIELD_F f;
    FIELD_G g;
};
struct STRUCT_B
{
    STRUCT_C c;
    FIELD_E e;
};
struct STRUCT_A
{
    STRUCT_B b;
    STRUCT_C c;
    FIELD_D d;
};
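
The level-by-level idea can be sketched in a few lines of Python as well. This is only an approximation of what the stylesheet does: `order_by_levels` and `SAMPLE_DEPS` are hypothetical names, the dictionary replaces the XML input, and the next level is simplified to "every structure not yet emitted whose structure dependencies have all been emitted", which for acyclic input should select the same structures as the key-based lookup.

```python
# Hypothetical stand-in for the XML: structure id -> referenced ids,
# with keys listed in document order.
SAMPLE_DEPS = {
    "STRUCT_A": ["STRUCT_B", "STRUCT_C", "FIELD_D"],
    "STRUCT_B": ["STRUCT_C", "FIELD_E"],
    "STRUCT_C": ["FIELD_E", "FIELD_F", "FIELD_G"],
}


def order_by_levels(deps):
    """Emit leaves first, then whatever depends only on emitted structures."""
    ids = list(deps)
    # Keep only references that point at other structures (not plain field types).
    struct_deps = {i: {d for d in deps[i] if d in deps} for i in ids}
    done = set()
    order = []
    # Level 0: structures that reference no other structure.
    level = [i for i in ids if not struct_deps[i]]
    while level:
        order.extend(level)
        done.update(level)
        # Next level: not emitted yet, and all structure dependencies emitted.
        level = [i for i in ids if i not in done and struct_deps[i] <= done]
    return order


print(order_by_levels(SAMPLE_DEPS))  # ['STRUCT_C', 'STRUCT_B', 'STRUCT_A']
```

Unlike the one-at-a-time approaches, this handles a whole level per step, so the number of iterations is bounded by the depth of the dependency graph rather than by the number of structures.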