Does eXist-db compression:zip function add XML dec

Posted 2020-04-16 01:57

Question:

I have an XQuery function that converts a group of XML files to HTML and zips them. It runs a transform on each file to create <entry> elements.

Starting with that function:

declare function xport:make-sources($path as xs:string) as item()* {
    for $article in collection(xmldb:encode-uri($path))
    let $docnum := $article/article/div[@class = 'content']/@doc/string()
    return
        <entry name="{concat($docnum, '.html')}" type="text" method="store">
            {transform:transform($article, doc("/db/EIDO/data/edit/xsl/doc-html.xsl"), <parameters/>)}
        </entry>
};

Given the input, I run the XQuery to just show me the result of the transformation ... and I see this (exactly what I would expect):

<entry name="LS01.html" type="text" method="store">
<html>
    <head>
        <style>
                body {
                font-family: Arial;
                }
                article img {
                width:50%;
                }
         ...

You will note that this entry, like all the others, has no XML declaration at all.

But now let's put it all together and send those entries to compression. This is all inside a web application. The full XQuery is this:

xquery version "3.0";

import module namespace transform = "http://exist-db.org/xquery/transform";
declare namespace xport = "http://www.xportability.com";

declare function xport:make-sources($path as xs:string) as item()* {
    for $article in collection(xmldb:encode-uri($path))
    let $docnum := $article/article/div[@class = 'content']/@doc/string()
    return
        <entry name="{concat($docnum, '.html')}" type="text" method="store">
            {transform:transform($article, doc("/db/EIDO/data/edit/xsl/doc-html.xsl"), <parameters/>)}
        </entry>
};

let $path := request:get-parameter("path", "")
let $filename := request:get-parameter("filename", "")
let $col := xport:make-sources($path)
return
    response:stream-binary(
        xs:base64Binary(compression:zip($col, true())),
        'application/zip',
        $filename
    )

Everything works: I get a ZIP file of all the documents that have been transformed from XML to HTML.

BUT, when I look at an actual file in the ZIP, it starts with this:

<?xml version="1.0" encoding="UTF-8"?>
<html>
   <head>

The XML declaration is not on any of the entries passed to compression:zip. As the output above shows, it appears nowhere in the list of entries. Yet the act of zipping them apparently adds it; I see no other explanation. Specifying omit-xml-declaration, or changing the output method in the XSL to text or HTML, makes no difference. That is consistent with the entry list shown above, which already lacks the declaration after the transformation.

The files in the ZIP have an added XML declaration, period.

Is there some workaround?

Answer 1:

The XML declaration is introduced implicitly in your query when the contents of your zip-bound <entry> elements are passed to the compression:zip() function. I'd advise setting serialization options explicitly using the fn:serialize() function. Here is sample code showing how to achieve the result you describe:

xquery version "3.1";

let $node := <html><head/><body><div><h1>Hello World!</h1></div></body></html>
let $serialized := serialize($node, map { "method": "xml", "indent": true(), 
    "omit-xml-declaration": true() })
let $entries := <entry name="test.html" type="text" method="store">{$serialized}</entry>
let $filename := "test.zip"
return
    response:stream-binary(
        compression:zip($entries, true()),
        'application/zip',
        $filename
    )

Saving this query into the database at a location like /db/apps/my-app/test.xq and calling it by pointing your web browser at http://localhost:8080/exist/apps/my-app/test.xq will cause your browser to download test.zip. Opening this zip file will reveal a test.html file absent the XML declaration:

<html>
    <head/>
    <body>
        <div>
            <h1>Hello World!</h1>
        </div>
    </body>
</html>
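
Applied to the question's xport:make-sources function, the same idea might look like the sketch below. It has not been run against the asker's data; it assumes the stylesheet path and collection layout from the question, and the "xml" method with omit-xml-declaration is carried over from the sample above rather than chosen for this stylesheet. The only real change is that each transformation result is serialized to a string before it is placed in its <entry>.

```xquery
xquery version "3.1";

import module namespace transform = "http://exist-db.org/xquery/transform";
declare namespace xport = "http://www.xportability.com";

(: Builds one <entry> per article; each entry holds a pre-serialized string :)
declare function xport:make-sources($path as xs:string) as element(entry)* {
    for $article in collection(xmldb:encode-uri($path))
    let $docnum := $article/article/div[@class = 'content']/@doc/string()
    let $html := transform:transform($article, doc("/db/EIDO/data/edit/xsl/doc-html.xsl"), <parameters/>)
    (: serialize explicitly so compression:zip() receives text rather than nodes :)
    let $serialized := serialize($html, map { "method": "xml", "omit-xml-declaration": true() })
    return
        <entry name="{concat($docnum, '.html')}" type="text" method="store">{$serialized}</entry>
};

let $path := request:get-parameter("path", "")
let $filename := request:get-parameter("filename", "")
return
    response:stream-binary(
        compression:zip(xport:make-sources($path), true()),
        "application/zip",
        $filename
    )
```

Because the entries now carry strings, the serialization settings are fixed at the point where serialize() is called and do not depend on whatever defaults apply when the nodes reach compression:zip().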

Stepping back to the fundamentals, the presence or absence of the XML declaration in XQuery is toggled via the omit-xml-declaration serialization parameter. To omit the XML declaration globally for an entire query, you can place this set of declarations in the prolog of your query:

declare namespace output="http://www.w3.org/2010/xslt-xquery-serialization";

declare option output:method "xml";
declare option output:omit-xml-declaration "yes";

Or, when serializing locally within a portion of a query, you can pass this same set of parameters to the fn:serialize function as a map (the method used in the code sample above):

fn:serialize($node, map { "method": "xml", "omit-xml-declaration": true() } )

(There is also an XML syntax for the 2nd options parameter.)
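
As a rough illustration of that XML syntax, the second argument can be an output:serialization-parameters element in the standard serialization namespace. The $node value here is only a placeholder for the example.

```xquery
xquery version "3.1";

declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

(: placeholder node, used only for this illustration :)
let $node := <html><head/><body/></html>
(: same settings as the map form, expressed as an XML parameter element :)
let $params :=
    <output:serialization-parameters>
        <output:method value="xml"/>
        <output:omit-xml-declaration value="yes"/>
    </output:serialization-parameters>
return
    fn:serialize($node, $params)
```

Note that in the XML form the values are strings such as "yes" and "no", whereas the map form takes xs:boolean values such as true().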

The current version of eXist (v4.0.0) and recent versions (probably since v3.6.0 or so) support all of the options above, and all versions support a somewhat more compact eXist-specific serialization facility, using the exist:serialize option expressed as a string consisting of key=value pairs:

declare option exist:serialize "method=xml omit-xml-declaration=yes";

You can set eXist's default serialization behavior in your conf.xml configuration file. The defaults in conf.xml can be overridden with the methods above. Serialization over eXist's other interfaces, such as WebDAV or XML-RPC, typically respects the defaults set in conf.xml, but these defaults can be overridden on a per-interface basis; for example, see the documentation on serialization over eXist's WebDAV interface.