I am writing a RESTful web service in Java. The idea is to "cut down" an XML document: strip away all the unneeded content (~98%) and leave only the tags we're interested in, while maintaining the document's structure, which is as follows (I cannot provide the actual XML content for confidentiality reasons):

    <sear:SEGMENTS xmlns="" xmlns:sear="">
        <sear:DOCSET IS_LOCAL="true" TOTAL_TIME="176" LASTHIT="9" FIRSTHIT="0" TOTALHITS="262" HIT_TIME="11">
            <sear:DOC SEARCH_ENGINE_TYPE="Local Search Engine" SEARCH_ENGINE="Local Search Engine" NO="1" RANK="0.086826384" ID="2347460">

Of course, this is the structure of only the tags we are interested in - there are hundreds more tags, but they are irrelevant.

The square brackets ([]) are not part of the XML. They indicate that the element is one of a list of children and occurs more than once: one per match of the search from the RESTful service.

That said, my Java code containing the XSLT stylesheet is as follows:


    import java.io.StringReader;
    import java.io.StringWriter;

    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerException;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.TransformerFactoryConfigurationError;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    public String cutXML() throws TransformerFactoryConfigurationError, TransformerException {

       String xmlSourceResource = this.xml; // where this.xml is the full XML string of structure as presented above

       String xsltResource =
       "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"\" xmlns:sear=\"\">" +

       "    <xsl:output method=\"xml\" version=\"1.0\" omit-xml-declaration=\"no\" encoding=\"UTF-8\" indent=\"yes\"/>" +
       "    <xsl:strip-space elements=\"*\"/>" +

       // names of the elements to keep
       "    <sear:WhiteList>" +
       "        <name>title</name>" +
       "        <name>author</name>" +
       "    </sear:WhiteList>" +

       // identity template: copy every node and attribute by default
       "    <xsl:template match=\"node()|@*\">" +
       "        <xsl:copy>" +
       "            <xsl:apply-templates select=\"node()|@*\"/>" +
       "        </xsl:copy>" +
       "    </xsl:template>" +

       // drop any element that neither is, nor contains, a whitelisted element
       "    <xsl:template match=\"*[not(descendant-or-self::*[name()=document('')/*/sear:WhiteList/*])]\"/>" +

       "</xsl:stylesheet>";

       StringWriter xmlResultResource = new StringWriter(); // where the transformed/stripped-down XML will be written

       Transformer xmlTransformer = TransformerFactory.newInstance().newTransformer(new StreamSource(new StringReader(xsltResource))); // create transformer object with XSLT given

       xmlTransformer.transform(new StreamSource(new StringReader(xmlSourceResource)), new StreamResult(xmlResultResource)); // transform XML with transformer and write into result StringWriter

       return xmlResultResource.getBuffer().toString(); // return transformed XML string
    }


Unfortunately, when I run it on the server, all I get is an empty page with an empty source, as if the result of the transformation was an empty String.

The server's log file first gave the following information:

    [#|2012-04-26T18:26:24.967+0000|INFO|glassfish3.1.2|com.sun.jersey.api.core.PackagesResourceConfig|_ThreadID=23;_ThreadName=Thread-2;|Scanning for root resource and provider classes in the packages: dk.kb.mobileservice|#]

    [#|2012-04-26T18:26:24.969+0000|INFO|glassfish3.1.2|com.sun.jersey.api.core.ScanningResourceConfig|_ThreadID=23;_ThreadName=Thread-2;|Root resource classes found: class dk.kb.mobileservice.Middle|#]

    [#|2012-04-26T18:26:24.970+0000|INFO|glassfish3.1.2|com.sun.jersey.api.core.ScanningResourceConfig|_ThreadID=23;_ThreadName=Thread-2;|No provider classes found.|#]

    [#|2012-04-26T18:26:24.978+0000|INFO|glassfish3.1.2|com.sun.jersey.server.impl.application.WebApplicationImpl|_ThreadID=23;_ThreadName=Thread-2;|Initiating Jersey application, version 'Jersey: 1.11 12/09/2011 10:27 AM'|#]

    [#|2012-04-26T18:26:25.192+0000|INFO|glassfish3.1.2||_ThreadID=23;_ThreadName=Thread-2;|WEB0671: Loading application [kb2] at [/kb2]|#]

    [#|2012-04-26T18:26:25.200+0000|INFO|glassfish3.1.2||_ThreadID=23;_ThreadName=Thread-2;|kb2 was successfully deployed in 2,293 milliseconds.|#]

    [#|2012-04-26T18:26:46.263+0000|SEVERE|glassfish3.1.2||_ThreadID=20;_ThreadName=Thread-2;|SystemId Unknown; Line #0; Column #0; java.lang.NullPointerException |#]

    [#|2012-04-26T18:31:09.772+0000|SEVERE|glassfish3.1.2||_ThreadID=21;_ThreadName=Thread-2;|SystemId Unknown; Line #0; Column #0; java.lang.NullPointerException |#]

Now it reports the following errors instead:

    [#|2012-04-27T00:05:07.731+0000|SEVERE|glassfish3.1.2||_ThreadID=21;_ThreadName=Thread-2;|Error on line 1 column 1 of file:/root/webglassfish3/glassfish/domains/domain1/config/: SXXP0003: Error reported by XML parser: Content is not allowed in prolog.|#]

    [#|2012-04-27T00:05:07.732+0000|SEVERE|glassfish3.1.2||_ThreadID=21;_ThreadName=Thread-2;|Recoverable error on line 1 SXXP0003: org.xml.sax.SAXParseException: Content is not allowed in prolog.|#]

I've tested the XML file and transformed it in a browser, and it worked, so I don't think the fault lies with either the XML or the XSLT stylesheet. It seems to be a Java issue.

When I run the above Java code on the entire XML outside of GlassFish, I get the following errors:

    Exception in thread "main" java.lang.VerifyError: (class: GregorSamsa$0, method: test signature:         (IIIILcom/sun/org/apache/xalan/internal/xsltc/runtime/AbstractTranslet;Lcom/sun/org/apache/xml/internal/dtm/DTMAxisIterator;)Z) Incompatible type for getting or setting field
        at GregorSamsa.applyTemplates()
        at GregorSamsa.applyTemplates()
        at GregorSamsa.transform()
        at XML2JSON.cutXML(
        at XML2JSON.main(


"Content is not allowed in prolog." usually means that there is content before the start of your XML. The XML parser expects to see either the XML declaration, <?xml version="1.0"?>, or, if that is omitted, the start of the document element (i.e. <sear:SEGMENTS>).
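To see the error in isolation, here is a minimal sketch, not taken from the code in the question. The class name `PrologDemo` and the helper `parseError` are made up for illustration. It hands two strings to the JDK's default DOM parser and returns the parser's message, if there is one. With the JDK's built-in parser, the second call should be rejected because of the text in front of the root element; the exact wording of the message can vary by parser and locale.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class PrologDemo {

    /**
     * Hypothetical helper: returns null if the string parses as XML,
     * otherwise the message of the parser's SAXParseException.
     * (The default error handler may also print the error to stderr.)
     */
    static String parseError(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        } catch (Exception e) {
            // configuration or I/O problems are not what this demo is about
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // well-formed: no content before the document element
        System.out.println(parseError("<root/>"));
        // stray text before the document element
        System.out.println(parseError("junk<root/>"));
    }
}
```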

Print/log the content of this.xml and verify that there are no leading whitespace characters or other content before the XML declaration or document element.
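One way to do that check is sketched below, under some assumptions. `XmlPrologInspector`, `describeStart` and `dropBeforeFirstTag` are hypothetical names, not part of any library. Logging the first few characters as code points makes invisible characters visible, for example a byte order mark (U+FEFF) that survived when the string was read. A null check is included because the earlier log shows a NullPointerException, which might mean this.xml was not set at all.

```java
class XmlPrologInspector {

    /**
     * Describes the first `count` characters of s as hex code units,
     * so that invisible leading characters show up in a log line.
     */
    static String describeStart(String s, int count) {
        if (s == null) {
            return "null";
        }
        StringBuilder sb = new StringBuilder();
        int n = Math.min(count, s.length());
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    /**
     * Blunt workaround: drops everything before the first '<'.
     * Returns the input unchanged if it is null or contains no '<'.
     */
    static String dropBeforeFirstTag(String s) {
        if (s == null) {
            return null;
        }
        int start = s.indexOf('<');
        return start > 0 ? s.substring(start) : s;
    }

    public static void main(String[] args) {
        String xml = "\uFEFF<a/>"; // sample input with a leading byte order mark
        System.out.println(describeStart(xml, 3));
        System.out.println(dropBeforeFirstTag(xml));
    }
}
```

In cutXML() this could be called as `describeStart(this.xml, 10)` just before the transformation. Dropping the leading characters only hides the symptom; if something unexpected shows up, it is usually better to fix the place where the string is read or built.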