I have read an article on Internet and know that the natural way of decoding by traversing from root but I want to do it faster with a lookup table.
After reading, I still cannot get the points.
ex:
input:"abcdaabaaabaaa" code data 0 a 10 b 110 c 111 d
The article says that due to variable length, it determine the length by taking a bit of string of max code length and use it as index.
output:"010110111001000010000" Index Index(binary) Code Bits required 0 000 a 1 1 001 a 1 2 010 a 1 3 011 a 1 4 100 b 2 5 101 b 2 6 110 c 3 7 111 d 3
My questions are:
What does it means
due to variable length, it determine the length by taking a bit of string of max code length
? How to determine the length?How to generate the lookup table and how to use it? What is the algorithm behind?
For your example, the maximum code length is 3 bits. So you take the first 3 bits from your stream (010) and use that to index the table. This gives code, 'a' and bits = 1. You consume 1 bit from your input stream, output the code, and carry on. On the second go around you will get (101), which indexes as 'b' and 2 bits, etc.
To build the table, make it as large as 1 << max_code_length, and fill in details as if you are decoding the index as a huffman code. If you look at your example all the indices which begin '0' are a, indices beginning '10' are b, and so on.