Reverse-Engineering unknown XML based on known XSL

Posted 2019-07-13 11:17

Question:

Solved!

After following Matti's suggestions, I removed the custom functions and all is well.
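For anyone landing here later, this is a minimal sketch of what the description part of the stylesheet from the original post below could look like with the extension calls stripped out. What myCustXslFunctions:MakeNice actually does is unknown; the substring(..., 1, 20) truncation is only a guess based on the two 20s in its argument list, so treat it as a placeholder.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the xmlns:myCustXslFunctions declaration is dropped because
     nothing references the extension object any more. -->
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8" indent="no" />

  <xsl:template match="/NewDataSet">
    <div>
      <xsl:for-each select="Table">
        <p>
          <span class="description">
            <xsl:choose>
              <!-- Fall back to descr when shortdescr is missing or empty. -->
              <xsl:when test="string-length(shortdescr) = 0">
                <!-- ASSUMPTION: stand-in for MakeNice(descr,20,20,'Left','true') -->
                <xsl:value-of select="substring(descr, 1, 20)" />
              </xsl:when>
              <xsl:otherwise>
                <!-- ASSUMPTION: stand-in for MakeNice(shortdescr,20,20,'Left','true') -->
                <xsl:value-of select="substring(shortdescr, 1, 20)" />
              </xsl:otherwise>
            </xsl:choose>
          </span>
        </p>
      </xsl:for-each>
    </div>
  </xsl:template>
</xsl:transform>
```

This uses only core XSLT 1.0 and XPath 1.0, so it should run in a processor that has no access to the vendor's extension object.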

Original Post:

I'm new to XSLT as of today, so I'm sure this is a no-brainer for many of you. Anyways:

I've been tasked with creating a widget for my company's website that uses data provided by a 3rd-party vendor.

The vendor refuses to send us a sample XML file (even a blanked-out one with just the element tags!) so I'm trying to recreate the XML based on what I can see in the XSLT that they -did- send us. (ridiculosity abounds)

This is the (stripped) XSLT file we were sent:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:myCustXslFunctions="urn:CustomXslFunctions">

  <xsl:variable name="NumberColumns" >1</xsl:variable>
  <xsl:variable name="PaperId" >1234567890ABCDEF</xsl:variable>

  <xsl:output method="html" version="1.0" encoding="UTF-8" indent="no" />
  <xsl:template match="/NewDataSet">
    <div><xsl:apply-templates select="/NewDataSet" mode="columns" /></div>
  </xsl:template>

  <xsl:template match="NewDataSet" mode="columns">
    <xsl:for-each select="Table[position() mod $NumberColumns  = 1 or $NumberColumns = 1]">
      <p>
        <xsl:for-each select=".|following-sibling::Table[position() &lt; $NumberColumns]">
          <span class="description">
            <xsl:element name="a">
              <xsl:attribute name="target">_blank</xsl:attribute>
              <xsl:attribute name="class" >description</xsl:attribute>
              <xsl:choose>
                <xsl:when test="retail='true'">
                  <xsl:attribute name="href">http://website/retail/?pid=<xsl:value-of select="$PaperId" />&#38;adid=<xsl:value-of select="paperitemid" /></xsl:attribute>
                </xsl:when>
                <xsl:otherwise>
                  <xsl:attribute name="href">http://website/?pid=<xsl:value-of select="$PaperId" />&#38;adid=<xsl:value-of select="paperitemid" /></xsl:attribute>
                </xsl:otherwise>
              </xsl:choose>
              <xsl:choose>
                <xsl:when test="imageurl != ''">
                  <xsl:element name="img">
                    <xsl:attribute name="src"><xsl:value-of select="imageurl" /></xsl:attribute>
                    <xsl:attribute name="border">0</xsl:attribute>
                    <xsl:attribute name="class">thumbnail</xsl:attribute>
                  </xsl:element>
                </xsl:when>
                <xsl:otherwise>
                  <xsl:element name="img">
                    <xsl:attribute name="src">http://website/thumbs/<xsl:value-of select="paperid" />_<xsl:value-of select="paperitemid" />_100.jpg</xsl:attribute>
                    <xsl:attribute name="border">0</xsl:attribute>
                    <xsl:attribute name="class">thumbnail</xsl:attribute>
                  </xsl:element>
                </xsl:otherwise>
              </xsl:choose>
              </xsl:element>
          </span>
        </xsl:for-each>
      </p>
      <p>
        <xsl:for-each select=".|following-sibling::Table[position() &lt; $NumberColumns]">
          <span class="description">
            <xsl:element name="a">
              <xsl:attribute name="target">_blank</xsl:attribute>
              <xsl:attribute name="class" >description</xsl:attribute>
              <xsl:choose>
                <xsl:when test="retail='true'">
                  <xsl:attribute name="href">http://website/?pid=<xsl:value-of select="$PaperId" />&#38;adid=<xsl:value-of select="paperitemid" /></xsl:attribute>
                </xsl:when>
                <xsl:otherwise>
                  <xsl:attribute name="href">http://website/?pid=<xsl:value-of select="$PaperId" />&#38;adid=<xsl:value-of select="paperitemid" /></xsl:attribute>
                </xsl:otherwise>
              </xsl:choose>
              <xsl:choose>
                <xsl:when test="string-length(shortdescr) = 0"><xsl:value-of select="myCustXslFunctions:MakeNice(descr,20,20,'Left','true')" /></xsl:when>
                <xsl:otherwise><xsl:value-of select="myCustXslFunctions:MakeNice(shortdescr,20,20,'Left','true')" /></xsl:otherwise>
              </xsl:choose>
            </xsl:element>
          </span>
        </xsl:for-each>
      </p>
    </xsl:for-each>
  </xsl:template>
</xsl:transform>

And my feeble attempt at reverse-engineering the XML:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="facepalm.xsl"?>
<NewDataSet>
  <Table>
    <paperid>123</paperid>
    <paperitemid>12345</paperitemid>
    <descr>facepalm of doom</descr>
    <shortdescr>facepalm</shortdescr>
    <retail>true</retail>
    <imageurl>http://website/facepalm.jpg</imageurl>
  </Table>
  <Table>
    <paperid>456</paperid>
    <paperitemid>67890</paperitemid>
    <descr>mega-sigh</descr>
    <shortdescr>sigh</shortdescr>
    <retail>true</retail>
    <imageurl>http://website/sigh.jpg</imageurl>
  </Table>
</NewDataSet>

There's no doubt in my mind that I'm overlooking something simple, but my novice status with XSLT has already made this a multi-hour project.

Any help is greatly appreciated.

Answer 1:

My guess would be more like:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="facepalm.xsl"?>
<NewDataSet>
 <Table>
  <paperid>123</paperid>
  <paperitemid>12345</paperitemid>
  <descr>failvendor</descr>
  <shortdescr>facepalm</shortdescr>
  <retail>true</retail>
  <imageurl>http://website/facepalm.jpg</imageurl>
 </Table>
 <Table>
  <paperid>456</paperid>
  <paperitemid>67890</paperitemid>
  <descr>is fail</descr>
  <shortdescr>sigh</shortdescr>
  <retail>true</retail>
  <imageurl>http://website/sigh.jpg</imageurl>
 </Table>
</NewDataSet>
  1. The [] stuff isn't part of the element name; it's a predicate that filters elements by position. So the element name is just Table.
  2. You missed the descr and paperid elements.

What the XSLT seems to be doing is laying out items on a list in columns. Yes, it is that ridiculously complicated in XSLT.

Also, it would seem that it's ignoring paperid if imageurl is defined (paperid only appears in the fallback thumbnail URL, while paperitemid is still used for the link), and ignoring descr if shortdescr is provided. This might help you on your quest.
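To see the column logic in isolation, here is a stripped-down sketch that keeps the two select expressions from the vendor's stylesheet and throws away everything else. The rows, row and cell element names are made up for illustration, and NumberColumns is set to 2 here instead of the vendor's 1.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" />
  <xsl:variable name="NumberColumns" select="2" />

  <xsl:template match="/NewDataSet">
    <rows>
      <!-- Row starters: Table 1, 3, 5, ... when NumberColumns is 2.
           "position() mod 1" is always 0, so the "or" clause is what
           makes the one-column case select every Table. -->
      <xsl:for-each select="Table[position() mod $NumberColumns = 1 or $NumberColumns = 1]">
        <row>
          <!-- The starter itself plus the next (NumberColumns - 1) Table siblings. -->
          <xsl:for-each select=".|following-sibling::Table[position() &lt; $NumberColumns]">
            <cell><xsl:value-of select="paperitemid" /></cell>
          </xsl:for-each>
        </row>
      </xsl:for-each>
    </rows>
  </xsl:template>
</xsl:transform>
```

Run against the two-Table guess above, this gives a single row holding 12345 and 67890. With NumberColumns set back to 1, each Table ends up in a row of its own.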

...how are you supposed to test this without the actual XML, btw?



Answer 2:

In the general case it is impossible to determine the structure of an input XML file given just an XSLT.

While in this instance you may have been able to reverse-engineer an XML description based on the XSLT, in the general case it's impossible to do correctly. Here it was possible because the template was small and used for-each, which spells out the paths it walks.

XSLT is declarative, which means you describe what should happen if certain nodes are encountered, but it's certainly legal to include templates which are never called, or are called in ways that are not obvious. Similarly, the use of <xsl:apply-templates /> gives no insight as to what elements are inside of a known element.

For example:

<xsl:template match="book">
    <xhtml:div class="book">
        <xsl:apply-templates />
    </xhtml:div>
</xsl:template>

<xsl:template match="title">
    <xhtml:h1><xsl:value-of select="."/></xhtml:h1>
</xsl:template>

<xsl:template match="chapter/title">
    <xhtml:h2><xsl:value-of select="."/></xhtml:h2>
</xsl:template>

Does book have a title? Do books have chapters? Do chapters even have titles? We don't and can't know.
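To make that concrete, here is a hypothetical input (the library wrapper and all the text content are invented) holding two differently shaped books. The templates above handle both, so the stylesheet alone can't tell you which shape the real data uses.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical data: neither shape is implied or ruled out by the stylesheet. -->
<library>
  <!-- Shape 1: a flat book whose only child is a title. -->
  <book>
    <title>A Flat Book</title>
  </book>
  <!-- Shape 2: a book with no title of its own, only titled chapters. -->
  <book>
    <chapter>
      <title>Chapter One</title>
    </chapter>
  </book>
</library>
```

The first title is only matched by match="title" and comes out as an h1. The second is matched by both title templates, and the more specific chapter/title pattern wins on default priority, so it comes out as an h2. There is no template for chapter or library at all; the built-in rules just recurse through them, which is exactly the kind of thing you can't see from the XSLT side.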