XSLT: Convert base64 data into image files

Posted 2019-01-13 22:59

Question:

I have seen several questions on how to encode an image file in base64, but how about the other way around - how do I reconstitute a picture from a base64 string stored in an XML file?

<resource>
<data encoding="base64">
R0lGODlhEAAQAPMAMcDAwP/crv/erbigfVdLOyslHQAAAAECAwECAwECAwECAwECAwECAwECAwEC
AwECAyH/C01TT0ZGSUNFOS4wGAAAAAxtc09QTVNPRkZJQ0U5LjAHgfNAGQAh/wtNU09GRklDRTku
MBUAAAAJcEhZcwAACxMAAAsTAQCanBgAIf8LTVNPRkZJQ0U5LjATAAAAB3RJTUUH1AkWBTYSQXe8
fQAh+QQBAAAAACwAAAAAEAAQAAADSQhgpv7OlDGYstCIMqsZAXYJJEdRQRWRrHk2I9t28CLfX63d
ZEXovJ7htwr6dIQB7/hgJGXMzFApOBYgl6n1il0Mv5xuhBEGJAAAOw==
</data>
<mime>image/gif</mime>
<resource-attributes>
    <file-name>clip_image001.gif</file-name>
</resource-attributes>
</resource>

Given the above XML node resource, how do I go about creating clip_image001.gif?

Please suggest:

  1. XSLT processors and/or extensions that enable this, plus
  2. a sample XSLT stylesheet that triggers the conversion

Note that the solution must handle at least the GIF and PNG file formats, and preferably should not be restricted to any particular OS.


Implemented solution

This is based on Mads Hansen's solution. The main difference is that I referenced net.sf.saxon.value.Base64BinaryValue directly through a Java namespace binding rather than using the saxon namespace, because I found the Java API easier to follow than the Saxonica website's descriptions of the base64Binary-to-octets and base64Binary functions.

<xsl:stylesheet version="2.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:b64="net.sf.saxon.value.Base64BinaryValue"
    xmlns:fos="java.io.FileOutputStream"
    ...
    exclude-result-prefixes="b64 fos">
...
<xsl:for-each select="resource">                
    <xsl:variable name="b64" select="b64:new(string(data))"/>
    ...
    <xsl:variable name="fos" select="fos:new(string($img))"/>
    <xsl:value-of select="fos:write($fos, b64:getBinaryValue($b64))"/>  
    <xsl:value-of select="fos:close($fos)"/>
</xsl:for-each>
...

P.S. See sibling question for my implementation of how to obtain the hashes necessary to identify the image files.


This question is a subquestion of another question I have asked previously.

Answer 1:

I found this entry from the XSL mailing list that describes how to use the Saxon extension function saxon:base64Binary-to-octets to stream the data out to a file using the Java FileOutputStream in an XSLT 2.0 stylesheet:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0" xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:saxon="http://saxon.sf.net/"
xmlns:fos="java.io.FileOutputStream">
<xsl:template match="/">
   <xsl:variable name="img" select="concat('c:\test\jesper', '.jpg')"/>
   <xsl:variable name="fos" select="fos:new(string($img))"/>
   <xsl:value-of select="fos:write($fos,
saxon:base64Binary-to-octets(xs:base64Binary(my-base64-encoded-image)))"/>
   <xsl:value-of select="fos:close($fos)"/>
</xsl:template>
</xsl:stylesheet>


Answer 2:

The following works, where xPath stands for the expression that selects the base64 string:

<img>
  <xsl:attribute name="src">
    <xsl:value-of select="concat('data:image/gif;base64,',xPath)"/>
  </xsl:attribute>
</img>


Answer 3:

Transform it to HTML and embed the image as a data URI:

<img src="data:{mime};base64,{data}" />
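As a rough sketch of how the data URI approach might be applied to the resource element from the question: the stylesheet below is an illustration under stated assumptions, not code taken from the answers. It assumes resource elements may appear anywhere in the input, uses plain XSLT 1.0 with no extensions, and strips whitespace from the base64 text with translate() so that the line breaks in the source do not end up in the src attribute. Note that this embeds the image in the HTML output and does not create clip_image001.gif on disk.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="yes"/>

  <!-- Assumption: resource elements may sit anywhere in the input document. -->
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="//resource"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="resource">
    <!-- Remove spaces, tabs and line breaks from the base64 text. -->
    <xsl:variable name="b64" select="translate(data, ' &#9;&#10;&#13;', '')"/>
    <!-- The MIME type comes from the mime child, so GIF and PNG are treated alike. -->
    <img src="data:{normalize-space(mime)};base64,{$b64}"
         alt="{normalize-space(resource-attributes/file-name)}"/>
  </xsl:template>

</xsl:stylesheet>
```

Because it relies only on core XSLT 1.0 functions, this should run on any conforming XSLT processor regardless of OS.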


Answer 4:

There is a better method available since Saxon 9.5 via the EXPath File extension module (available in Saxon-PE and Saxon-EE).

Here is a fragment of the code I'm using to extract binary image files from Word documents (source XML is in WordProcessingML format):

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:file="http://expath.org/ns/file"
    xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">

<xsl:template match="/pkg:package">
    <xsl:apply-templates select="pkg:part/pkg:binaryData"/>
</xsl:template>

<xsl:template match="pkg:binaryData">
    <!-- Strip the media folder prefix to get the bare file name -->
    <xsl:variable name="filename">
        <xsl:value-of select="replace(../@pkg:name, '/word/media/', '')"/>
    </xsl:variable>
    <xsl:variable name="path" select="concat('/some/folder/', $filename)"/>
    <xsl:message><xsl:value-of select="$path"/></xsl:message>

    <!-- Decode the base64 text of this element and write the bytes to $path -->
    <xsl:value-of select="file:write-binary($path, xs:base64Binary(string()))"/>
</xsl:template>

</xsl:stylesheet>
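
Applied to the resource element from the question, the same technique could look roughly like the sketch below. The out-dir parameter and its default value are hypothetical, and the stylesheet assumes Saxon-PE or Saxon-EE 9.5 or later so that the EXPath file:write-binary function is available. The file name is read from resource-attributes/file-name, and the decoded bytes are written unchanged, so GIF and PNG are handled the same way.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:file="http://expath.org/ns/file"
    exclude-result-prefixes="xs file">

  <!-- Hypothetical parameter: directory that receives the image files. -->
  <xsl:param name="out-dir" select="'/some/folder/'"/>

  <!-- Assumption: resource elements may sit anywhere in the input document. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//resource[data/@encoding = 'base64']"/>
  </xsl:template>

  <xsl:template match="resource">
    <!-- Build the target path from the file name stored in the XML. -->
    <xsl:variable name="path"
        select="concat($out-dir, normalize-space(resource-attributes/file-name))"/>
    <!-- Casting to xs:base64Binary collapses the line breaks in the encoded text. -->
    <xsl:sequence select="file:write-binary($path, xs:base64Binary(data))"/>
  </xsl:template>

</xsl:stylesheet>
```

Passing a different out-dir value on the command line or through the processor API would redirect the output without editing the stylesheet.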