I'm looking for a way to compress an ASCII-based string. Any help?
I also need to decompress it. I tried zlib, but it did not help.
What can I do to compress the string to a shorter length?
Code:

    def compress(request):
        if request.POST:
            data = request.POST.get('input')
            if is_ascii(data):
                result = zlib.compress(data)
                return render_to_response('index.html', {'result': result, 'input': data}, context_instance=RequestContext(request))
            else:
                result = "Error, the string is not ascii-based"
                return render_to_response('index.html', {'result': result}, context_instance=RequestContext(request))
        else:
            return render_to_response('index.html', {}, context_instance=RequestContext(request))
Using compression will not always reduce the length of a string!
Consider the following code:
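A minimal sketch of such a test, assuming Python 3 (where `zlib` and `bz2` operate on `bytes` rather than `str`). The helper name `comptest` is hypothetical and used only for illustration:

```python
import bz2
import zlib


def comptest(text):
    """Return the input length and its zlib/bz2 compressed lengths."""
    data = text.encode("ascii")  # both modules expect bytes, not str
    return {
        "original": len(data),
        "zlib": len(zlib.compress(data)),
        "bz2": len(bz2.compress(data)),
    }
```

Calling `comptest(some_string)` returns the three lengths side by side, which makes it easy to see whether compression actually helped for a given input.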
Let's try this on an empty string:
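The empty-string case can be checked directly with the standard library. This is a small sketch assuming Python 3 and the default compression settings:

```python
import bz2
import zlib

empty = b""

# Even with nothing to compress, each format still emits its framing bytes.
zlib_len = len(zlib.compress(empty))
bz2_len = len(bz2.compress(empty))

print("original:", len(empty))
print("zlib:", zlib_len)
print("bz2:", bz2_len)
```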
So zlib produces an extra 8 characters, and bz2 14. Compression methods usually put a 'header' in front of the compressed data for use by the decompression program. This header increases the length of the output.
Let's test a single word:
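A sketch of the single-word test, assuming Python 3. The word `test` is an arbitrary sample, and the 8-byte overhead subtracted below is the figure measured for zlib on the empty string:

```python
import bz2
import zlib

word = b"test"  # arbitrary sample word

zlib_len = len(zlib.compress(word))
bz2_len = len(bz2.compress(word))

# Payload size once the fixed zlib overhead seen on the empty string is removed.
zlib_payload = zlib_len - 8

print("original:", len(word))
print("zlib:", zlib_len, "(payload without overhead:", zlib_payload, ")")
print("bz2:", bz2_len)
```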
Even if you subtract the length of the header, the compression hasn't made the word any shorter. That is because in this case there is little to compress: most of the characters in the string occur only once. Now for a short sentence:
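The same check on a short sentence, again as a sketch assuming Python 3. The pangram below is just an arbitrary sample with little repetition:

```python
import bz2
import zlib

sentence = b"The quick brown fox jumps over the lazy dog."  # arbitrary sample

zlib_len = len(zlib.compress(sentence))
bz2_len = len(bz2.compress(sentence))

print("original:", len(sentence))
print("zlib:", zlib_len)
print("bz2:", bz2_len)
```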
Again the compression output is larger than the input text. Due to the limited length of the text, there is little repetition in it, so it won't compress well.
You need a fairly long block of text for compression to actually work:
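A sketch of the long-input case, assuming Python 3. The sample here is a deliberately repetitive text built by repeating one sentence, so it compresses far better than ordinary prose of the same length would:

```python
import bz2
import zlib

# Deliberately repetitive sample; real prose will compress less dramatically.
text = b"The quick brown fox jumps over the lazy dog. " * 200

zlib_len = len(zlib.compress(text))
bz2_len = len(bz2.compress(text))

print("original:", len(text))
print("zlib:", zlib_len)
print("bz2:", bz2_len)
```

With enough repeated material, the fixed overhead becomes negligible and both outputs end up much shorter than the input.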
You don't even need your data to be ASCII; you can feed zlib anything.
What you probably want is for the compressed data to be an ASCII string. Am I right?
If so, you should know that you then have a very small alphabet in which to encode the compressed data, so you will need more symbols to represent it.
For example, you can encode binary data in base64 to get an ASCII string, but that costs about 33% more space (4 output characters for every 3 input bytes).
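As a quick illustration, assuming Python 3, zlib round-trips arbitrary bytes, not just ASCII text:

```python
import zlib

# Every possible byte value, including those outside the ASCII range.
raw = bytes(range(256))

packed = zlib.compress(raw)
restored = zlib.decompress(packed)

print(restored == raw)
```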
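A minimal sketch of that approach, assuming Python 3. The function names `compress_to_ascii` and `decompress_from_ascii` are hypothetical helpers, not part of any library:

```python
import base64
import zlib


def compress_to_ascii(text):
    """Compress an ASCII string and return the result as a base64 ASCII string."""
    packed = zlib.compress(text.encode("ascii"))
    return base64.b64encode(packed).decode("ascii")


def decompress_from_ascii(encoded):
    """Reverse compress_to_ascii: base64-decode, then decompress."""
    packed = base64.b64decode(encoded)
    return zlib.decompress(packed).decode("ascii")


original = "hello hello hello hello hello hello hello hello"
encoded = compress_to_ascii(original)
print(encoded)
print(decompress_from_ascii(encoded) == original)
```

The base64 step is what makes the output safe to place in a template or form field, at the cost of the size overhead described above, so for short inputs the encoded result can still be longer than the original string.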