Following on from this question, I arrived at one entirely unsatisfactory solution for accessing an eXist-DB collection()
from an XSLT 2.0 stylesheet loaded through an eXist-DB/XQuery transformation function.
The XSLT file declares a variable:
<xsl:variable name="coll" select="collection('xmldb:exist:///db/apps/deheresi/data/collection_ms609.xml')"/>
This points to a catalog XML file that I created (per the Saxon documentation) in order to load the actual collection. It looks like this:
<collection stable="true">
<doc href="xmldb:exist:///db/apps/deheresi/data/ms609_0001.xml"/>
<doc href="xmldb:exist:///db/apps/deheresi/data/ms609_0002.xml"/>
...
...
<doc href="xmldb:exist:///db/apps/deheresi/data/ms609_0709.xml"/>
<doc href="xmldb:exist:///db/apps/deheresi/data/ms609_0710.xml"/>
</collection>
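Rather than maintaining 710 entries by hand, the catalog could probably be generated inside eXist-DB. The following is a minimal sketch, not something from my actual setup: it assumes the data files all sit directly in /db/apps/deheresi/data and follow the ms609_NNNN.xml naming seen above, and the variable name $data-path is purely illustrative.

```xquery
xquery version "3.1";

import module namespace xmldb = "http://exist-db.org/xquery/xmldb";

(: Assumption: all data files live directly in this collection. :)
declare variable $data-path := "/db/apps/deheresi/data";

(: Build a Saxon-style catalog with one <doc> entry per matching resource. :)
<collection stable="true">{
    for $name in xmldb:get-child-resources($data-path)
    (: Assumption: only files named like ms609_0001.xml belong in the catalog. :)
    where matches($name, "^ms609_\d{4}\.xml$")
    order by $name
    return <doc href="xmldb:exist://{$data-path}/{$name}"/>
}</collection>
```

The output would still need to be stored as collection_ms609.xml for the stylesheet to find it. Either way, the result is the catalog shown above.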
This allows the XSLT file to use a key that needs to search across all these files:
<xsl:key name="correspkey" match="tei:seg[@type='dep_event' and @corresp]" use="@corresp"/>
<xsl:variable name="correspvar" select="self::tei:seg[@type='dep_event' and @corresp]/@corresp"/>
<xsl:value-of select="$coll/(key('correspkey',$correspvar) except $correspvar)/@id" separator=", "/>
As it stands, with 50 documents in the catalog I get a result in 2 minutes; with all 710 I get a Java GC error after 4 minutes.
I have set indexes on the relevant nodes in eXist-DB, but this does nothing for performance. It seems to me that Saxon is working 'outside' eXist-DB's optimisations, treating eXist-DB as a simple file system.
(For what it's worth, setting href="/db/apps/deheresi/data/ms609_0001.xml" does not let Saxon see the documents.)
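For comparison, here is a rough sketch of what the same cross-document lookup might look like if it were done in XQuery, so that the search stays inside eXist-DB instead of going through Saxon. It is untested against the full data set and I have not benchmarked it; whether the configured indexes are actually used would depend on the index configuration. The function name local:corresp-ids is hypothetical, the TEI namespace URI is the standard one, and the sample $seg is chosen arbitrarily just to make the query runnable.

```xquery
xquery version "3.1";

declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: Hypothetical helper: @id values of every other dep_event seg
   in the collection that shares the given seg's @corresp value. :)
declare function local:corresp-ids($seg as element(tei:seg)) as xs:string* {
    collection("/db/apps/deheresi/data")
        //tei:seg[@type = "dep_event"][@corresp = $seg/@corresp]
        [not(. is $seg)]      (: exclude the starting element itself :)
        /@id/string()
};

(: Illustration only: take the first matching seg as the starting point. :)
let $seg := (collection("/db/apps/deheresi/data")
                //tei:seg[@type = "dep_event"][@corresp])[1]
return
    if (exists($seg))
    then string-join(local:corresp-ids($seg), ", ")
    else ()
```

This only covers the lookup itself; it does not solve the problem of getting at the collection from within the stylesheet.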
I suspect all of this is why eXist-DB documentation on the subject is non-existent.
As it goes, I am looking for solutions for intensive searches of collections from within XSLT 2.0 loaded within eXist-DB by XQuery transform().
If anything, I hope this post helps future searchers encountering the same problem.