I need to store a huge amount of binary data in a file, but I also want to read/write the header of that file in XML format.
Yes, I could just store the binary data into some XML value and let it be serialized using base64 encoding. But this wouldn't be space-efficient.
Can I "mix" XML data and raw binary data in a more-or-less standardized way?
I was thinking about two options:
Is there a way to do this using JAXB?
Or is there a way to take some existing XML data and append binary data to it, in such a way that the boundary is recognized?
Isn't the concept I'm looking for somehow used by / for SOAP?
Or is it used in the email standard? (Separation of binary attachments)
Scheme of what I'm trying to achieve:
[meta-info-about-boundary][XML-data][boundary][raw-binary-data]
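One way to read this scheme, as a minimal sketch rather than any standard format: let the "meta-info" be a 4-byte length prefix for the XML, so that no explicit boundary marker is needed and the binary data simply runs to the end of the file. The class name `XmlBinaryContainer` and the layout itself are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical layout: [4-byte XML length][XML bytes][raw binary until EOF].
class XmlBinaryContainer {

    static void write(OutputStream out, String xmlHeader, byte[] payload) throws IOException {
        byte[] xml = xmlHeader.getBytes(StandardCharsets.UTF_8);
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(xml.length); // meta-info: tells the reader where the XML ends
        data.write(xml);           // XML header
        data.write(payload);       // raw bytes, written as-is (no base64)
        data.flush();
    }

    static String readHeader(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] xml = new byte[length];
        in.readFully(xml);         // consumes exactly the XML part
        return new String(xml, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = {0, (byte) 0xFF, 1, 2, 3};
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        write(file, "<header><size>5</size></header>", payload);

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(file.toByteArray()));
        String header = readHeader(in);
        byte[] rest = in.readAllBytes(); // everything after the XML is the binary part (Java 9+)
        System.out.println(header);
        System.out.println(Arrays.equals(payload, rest));
    }
}
```

The header could be handed to JAXB or any other XML library once it has been read; the prefix only solves the "where does the XML stop" part of the problem.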
Thank you!
You can leverage AttachmentMarshaller & AttachmentUnmarshaller for this. This is the bridge used by JAXB/JAX-WS to pass binary content as attachments, and the same mechanism can be used to do what you want.
PROOF OF CONCEPT
Below is how it could be implemented. This should work with any JAXB impl (it works for me with EclipseLink JAXB (MOXy), and the reference implementation).
Message Format
Root
This is an object with multiple byte[] properties.
Demo
This class is used to demonstrate how MessageWriter and MessageReader are used:
MessageWriter
Is responsible for writing the message to the desired format:
MessageReader
Is responsible for reading the message:
For More Information
I followed the concept suggested by Blaise Doughan, but without attachment marshallers:
I let an XmlAdapter convert a byte[] to a URI reference and back, while the references point to separate files where the raw data is stored. The XML file and all binary files are then put into a zip. It is similar to the approach of OpenOffice and the ODF format, which in fact is a zip with a few XML files and binary files.
(In the example code, no actual binary files are written, and no zip is created.)
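The packaging step that the example code leaves out could look roughly like the sketch below, which uses only `java.util.zip`. The entry names (`content.xml`, `data/a.bin`) and the class `ZipPackage` are assumptions chosen for illustration; the XML refers to the binary entry by its relative path, in the spirit of the URI references described above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical package layout: one XML entry plus one zip entry per binary blob.
class ZipPackage {

    static byte[] pack(String xml, Map<String, byte[]> blobs) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("content.xml"));
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            for (Map.Entry<String, byte[]> blob : blobs.entrySet()) {
                zip.putNextEntry(new ZipEntry(blob.getKey())); // path referenced from the XML
                zip.write(blob.getValue());                    // raw bytes, no base64
                zip.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    static Map<String, byte[]> unpack(byte[] archive) throws IOException {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                entries.put(e.getName(), zip.readAllBytes()); // reads to the end of this entry
            }
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> blobs = new LinkedHashMap<>();
        blobs.put("data/a.bin", new byte[] {1, 2, 3});
        String xml = "<root><a>data/a.bin</a></root>";

        Map<String, byte[]> entries = unpack(pack(xml, blobs));
        System.out.println(entries.keySet());
        System.out.println(new String(entries.get("content.xml"), StandardCharsets.UTF_8));
    }
}
```

Note that zip compresses entries by default, which may or may not be desirable for data that is already compressed.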
Bindings.java
Main.java
Output
I don't think so -- XML libraries generally aren't designed to work with XML+extra-data.
But you might be able to get away with something as simple as a special stream wrapper: it would expose an "XML"-containing stream and a binary stream (from the special "format"). Then JAXB (or any other XML library) could work with the "XML" stream while the binary stream is kept separate.
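A minimal sketch of such a wrapper, assuming a made-up format in which a 4-byte length precedes the XML: `BoundedInputStream` (a hypothetical helper, not a JDK class) lets the XML parser see only the first N bytes and ignores `close()`, so the binary part can still be read from the underlying stream afterwards. The standard DOM parser stands in here for JAXB.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical wrapper: exposes at most `limit` bytes of the underlying stream.
class BoundedInputStream extends FilterInputStream {
    private long remaining;

    BoundedInputStream(InputStream in, long limit) {
        super(in);
        this.remaining = limit;
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) return -1;
        int b = super.read();
        if (b >= 0) remaining--;
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (remaining <= 0) return -1;
        int n = super.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) remaining -= n;
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(Math.min(n, remaining));
        remaining -= skipped;
        return skipped;
    }

    @Override
    public int available() throws IOException {
        return (int) Math.min(super.available(), remaining);
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset would desynchronize the byte counter
    }

    @Override
    public void close() {
        // No-op on purpose: the parser may close its input, but the binary part is still unread.
    }
}

class SplitStreamDemo {
    // Assumed layout for this sketch: [4-byte XML length][XML][binary until EOF].
    static byte[] build(String xml, byte[] payload) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        byte[] xmlBytes = xml.getBytes(StandardCharsets.UTF_8);
        out.writeInt(xmlBytes.length);
        out.write(xmlBytes);
        out.write(payload);
        return bytes.toByteArray();
    }

    static Document parseHeader(DataInputStream in) throws Exception {
        int xmlLength = in.readInt();
        InputStream xmlOnly = new BoundedInputStream(in, xmlLength);
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xmlOnly);
        while (xmlOnly.read() != -1) { /* skip any XML bytes the parser left unread */ }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = {9, 8, 7, (byte) 0x80};
        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(build("<header version=\"1\"/>", payload)));
        Document header = parseHeader(in);
        System.out.println(header.getDocumentElement().getTagName());
        System.out.println(Arrays.equals(payload, in.readAllBytes()));
    }
}
```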
Also remember to take "binary" vs. "text" files into account.
Happy coding.
This is not natively supported by JAXB, since you do not want to serialize the binary data to XML, but it can usually be done at a higher level when using JAXB. The way I do this with web services (SOAP and REST) is to use MIME multipart/mixed messages (check the multipart specification). Although initially designed for email, it works great for sending XML together with binary data, and most web service frameworks, such as Axis or Jersey, support it in an almost transparent way.
Here is an example of sending an object in XML together with a binary file through a REST web service, using Jersey with the jersey-multipart extension.
XML object
Client
Server
I tried to send a file with 110917 bytes. Using wireshark, you can see that the data is sent directly over HTTP like this:
As you can see, the binary data is sent as an octet-stream, with no wasted space, contrary to what happens when sending binary data inline in the XML. There is just the very low overhead of the MIME envelope. With SOAP the principle is the same (except that it will also have the SOAP envelope).
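To make the envelope overhead concrete, here is a hand-built multipart/mixed body using only the JDK. This is a simplified illustration of the wire format, not what Jersey produces byte for byte: the boundary string and the class `MultipartSketch` are hypothetical, and a real framework picks a boundary that cannot occur in the payload and usually adds more headers.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class MultipartSketch {
    // Hypothetical boundary; it must not appear inside any part's content.
    static final String BOUNDARY = "Boundary_1_example";

    static void writeAscii(ByteArrayOutputStream out, String text) throws IOException {
        out.write(text.getBytes(StandardCharsets.US_ASCII));
    }

    static byte[] build(String xml, byte[] binary) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Part 1: the XML document
        writeAscii(out, "--" + BOUNDARY + "\r\n");
        writeAscii(out, "Content-Type: application/xml\r\n\r\n");
        out.write(xml.getBytes(StandardCharsets.UTF_8));
        // Part 2: the binary data, copied verbatim (no base64)
        writeAscii(out, "\r\n--" + BOUNDARY + "\r\n");
        writeAscii(out, "Content-Type: application/octet-stream\r\n\r\n");
        out.write(binary);
        // Closing delimiter
        writeAscii(out, "\r\n--" + BOUNDARY + "--\r\n");
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] binary = new byte[110917]; // same size as the file in the test above
        String xml = "<root><name>example</name></root>";
        byte[] body = build(xml, binary);

        int overhead = body.length - binary.length - xml.getBytes(StandardCharsets.UTF_8).length;
        System.out.println("envelope overhead in bytes: " + overhead);
        // For comparison, base64 would turn the binary part into this many characters:
        System.out.println("base64 size: " + 4 * ((binary.length + 2) / 3));
    }
}
```

The envelope costs a fixed number of bytes regardless of payload size, whereas inline base64 grows the binary part by roughly a third.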