My servlet's doPost() receives an HttpServletRequest whose ServletInputStream sends me a large chunk of base64-encoded data wrapped in XML. E.g., there is an element:
<filedata encoding="base64">largeChunkEncodedHere</filedata>
I need to decode the chunk and write it to a file. I would like to get an InputStream from the chunk, decode it as a stream using MimeUtility, and use that stream to write the file. I would prefer not to read this large chunk into memory.
The XML is flat; i.e., there is not much nesting. My first idea is to use a SAX parser but I don't know how to do the hand-off to a stream to read just the chunk.
Thanks for your ideas.
Glenn
Edit 1: Note JB Nizet's pessimistic answer in this post.
Edit 2: I've answered my own question affirmatively below, and marked maximdim's answer below as correct: even though it doesn't quite answer the question, it did direct me to the StAX API and Woodstox.
You could use a SAX filter or XPath to get only the element(s) you're interested in. Once you have the content of your element, pass it to MimeUtility.decode() and write the resulting stream to a file.
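A minimal sketch of the SAX route, assuming the element is named filedata as in the question. It substitutes java.util.Base64 for MimeUtility.decode() so that it depends only on the JDK, and the names FiledataHandler and decodeTo are hypothetical. SAX hands over element text in pieces through characters(), so the handler can decode and write each piece as it arrives rather than holding the whole chunk in memory.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Base64;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: decodes the base64 text of <filedata> as SAX delivers it. */
class FiledataHandler extends DefaultHandler {
    private final OutputStream out;
    // Buffer length is a multiple of 4, so every full buffer is valid base64 on its own.
    private final char[] buf = new char[4096];
    private int pending = 0;
    private boolean inFiledata = false;

    FiledataHandler(OutputStream out) {
        this.out = out;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("filedata".equals(qName)) {
            inFiledata = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (!inFiledata) {
            return;
        }
        for (int i = start; i < start + length; i++) {
            if (Character.isWhitespace(ch[i])) {
                continue; // skip line breaks inside the encoded chunk
            }
            buf[pending++] = ch[i];
            if (pending == buf.length) {
                flush();
            }
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if ("filedata".equals(qName)) {
            flush(); // decode the final, possibly padded, group
            inFiledata = false;
        }
    }

    /** Decodes the buffered characters and writes the resulting bytes. */
    private void flush() throws SAXException {
        if (pending == 0) {
            return;
        }
        try {
            out.write(Base64.getDecoder().decode(new String(buf, 0, pending)));
        } catch (IOException | IllegalArgumentException e) {
            throw new SAXException(e);
        }
        pending = 0;
    }

    /** Parses xml and writes the decoded content of <filedata> to out. */
    static void decodeTo(InputStream xml, OutputStream out)
            throws IOException, SAXException, ParserConfigurationException {
        SAXParserFactory.newInstance().newSAXParser().parse(xml, new FiledataHandler(out));
    }
}
```

In a servlet this would presumably be called as FiledataHandler.decodeTo(request.getInputStream(), fileOutputStream), so that at most one buffer of encoded text is held at a time.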
I suggest you update your question with a code sample and let us know what doesn't work.
Update:
Here is sample code using the StAX2 parser (Woodstox). For some reason the StAX parser included in the JDK doesn't seem to have a comparable getText() method, at least at a quick glance.
Obviously the input (r) and output (w) could be any Reader/Writer or stream; a String is used here just as an example.
Here are some details on how streaming from an element while parsing with StAX is possible, using the Woodstox framework.
There is a good overview in this article.
From XMLInputFactory we can call createXMLStreamReader(java.io.InputStream stream) using the ServletInputStream. With Woodstox the returned reader can be cast to XMLStreamReader2, which has a getText(Writer w, boolean preserveContents) method that returns an int for the number of characters written. XMLStreamReader2 is an interface, so this method has to be implemented by the reader class; Stax2ReaderImpl contains one such implementation.
In that implementation we would need to change the getTextCharacters() method so that it reads from the InputStream. In the testSegmentedGetCharacters() method of the Woodstox test TestGetSegmentedText, we see sr.getTextCharacters(offset, buf, start, len) used. In fact, the javadoc for the multi-argument XMLStreamReader.getTextCharacters() includes a sample implementation.
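As an illustration of the four-argument getTextCharacters(), here is a hedged sketch that copies an element's text to a Writer in fixed-size pieces using only the JDK's StAX parser. It is not the Woodstox or javadoc code; the names ElementTextCopier and copyElementText are hypothetical, and it assumes IS_COALESCING is left off so that a large text node can arrive as several CHARACTERS events.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ElementTextCopier {
    /**
     * Copies the text of the first element named elementName to w
     * and returns the number of characters written.
     */
    static long copyElementText(Reader xml, String elementName, Writer w)
            throws XMLStreamException, IOException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Assumption: without coalescing, long text may be reported in several events.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
        XMLStreamReader sr = factory.createXMLStreamReader(xml);
        char[] buf = new char[2048];
        long total = 0;
        boolean inside = false;
        try {
            while (sr.hasNext()) {
                int event = sr.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    if (elementName.equals(sr.getLocalName())) {
                        inside = true;
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    if (inside && elementName.equals(sr.getLocalName())) {
                        break; // done with the element; stop parsing
                    }
                } else if (inside && (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA)) {
                    int textLength = sr.getTextLength();
                    int offset = 0;
                    // Copy this event's text in buf-sized pieces.
                    while (offset < textLength) {
                        int n = sr.getTextCharacters(offset, buf, 0, buf.length);
                        if (n <= 0) {
                            break;
                        }
                        w.write(buf, 0, n);
                        offset += n;
                        total += n;
                    }
                }
            }
        } finally {
            sr.close();
        }
        return total;
    }
}
```

The Writer passed in could wrap whatever performs the decoding, so the encoded text never has to be assembled into a single String.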
One more suggestion wrt Woodstox: it can also decode that base64 encoded stuff from within, efficiently. To do that, you need to cast XMLStreamReader into XMLStreamReader2 (or TypedXMLStreamReader), which is part of the Stax2 extension API. But with that, you get methods readElementAsBinary() and getElementAsBinary() which automatically handle Base64 decoding. XMLStreamWriter2 similarly has Base64-encoding methods for writing binary data.