So I'm having some issues decompressing a Snappy file from HDFS. If I use hadoop fs -text
I am able to uncompress and output the file just fine. However, if I use hadoop fs -copyToLocal
and try to uncompress the file with python-snappy, I get
snappy.UncompressError: Error while decompressing: invalid input
My python program is very simple and looks like this:
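import snappy

# snappy_file is the path to the file copied out of HDFS
with open(snappy_file, "rb") as input_file:  # binary mode, since the data is compressed bytes
    data = input_file.read()

uncompressed = snappy.uncompress(data)
print(uncompressed)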
This fails miserably for me. So I tried another test: I took the output from hadoop fs -text
and compressed it using the python-snappy library. I then wrote the result to a file. I was then able to read this file back in and uncompress it just fine.
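A minimal sketch of that round-trip test is below. It is not the exact script: the helper names round_trip and snappy_round_trip are made up for illustration, a temporary file stands in for the real output file, and the only python-snappy calls assumed are snappy.compress and snappy.uncompress.

```python
import os
import tempfile


def round_trip(plain, compress, uncompress):
    """Compress `plain`, write it to a temp file, read it back, and decompress.

    `compress` and `uncompress` are passed in so the file handling can be
    checked separately from the codec (hypothetical helper, for illustration).
    """
    fd, path = tempfile.mkstemp(suffix=".snappy")
    os.close(fd)
    try:
        # Write the compressed bytes in binary mode.
        with open(path, "wb") as out:
            out.write(compress(plain))
        # Read the same bytes back, again in binary mode, and decompress.
        with open(path, "rb") as inp:
            return uncompress(inp.read())
    finally:
        os.remove(path)


def snappy_round_trip(plain):
    """Run the round trip with python-snappy (assumes the package is installed)."""
    import snappy  # imported lazily so the helper above works without it
    return round_trip(plain, snappy.compress, snappy.uncompress)
```

Calling snappy_round_trip on the bytes produced by hadoop fs -text returns the original bytes, which is the case that works. It only shows that python-snappy can read what python-snappy wrote; it says nothing about the file that came straight out of HDFS.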
AFAIK Snappy is backwards compatible between versions. My Python code is using the latest Snappy version, and I'm guessing Hadoop is using an older one. Could this be the problem? Or is there something else I am missing here?