I wish to convert a binary file in one format to a SequenceFile.
I have a Python script that takes that format on stdin and can output whatever I want.
The input format is not line-based. The individual records are binary themselves, hence the output format cannot be \t delimited or broken into lines with \n.
Can I use the Hadoop Streaming interface to consume a binary format? How do I produce a binary output format?
I assume the answer is "No" unless I hear otherwise.