How to cache partial crc32 checksums so I don'

Posted 2019-08-05 05:01

In some code that I wrote recently, I had this pattern:

from zlib import crc32

new_data = get_some_input()

crc32List['stream1'] = crc32(new_data, crc32List['stream1']) & 0xffffffff
crc32List['stream2'] = crc32(new_data, crc32List['stream2']) & 0xffffffff
...
crc32List['streamN'] = crc32(new_data, crc32List['streamN']) & 0xffffffff

It seems to me that there is some redundant computation going on there. I would be happy if I could find a function magic(x, y) that supports the following caching:

crc32List['cached'] = crc32(new_data, 0) & 0xffffffff

crc32List['stream1'] = magic(crc32List['cached'], crc32List['stream1'])
crc32List['stream2'] = magic(crc32List['cached'], crc32List['stream2'])
...
crc32List['streamN'] = magic(crc32List['cached'], crc32List['streamN'])

'magic(x, y)' uses the cached crc32 value 'x' and returns the same result as 'crc32(new_data, y) & 0xffffffff'.

Of course, 'stream[0:N]' start with different values and hold different values at any point in time, but the crc32 computation is almost always executed (90%+ of the time) for all N streams, and always with 'new_data'.

1 answer
劳资没心,怎么记你
#2 · 2019-08-05 05:47

You did not tag the question with a language, and I am not familiar with a version of a crc32() function that takes its arguments as shown. In any case, I think what you're looking for is the crc32_combine() function of zlib.

The arguments to the actual crc32() function in zlib (in C) are crc32(crc, buf, len), where crc is the starting CRC-32 value, buf is a pointer to the bytes to compute the CRC-32 of, and len is the number of bytes. The function returns the updated CRC-32 value.
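For comparison, the code in the question looks like Python, where the standard library exposes the same routine as zlib.crc32(data, value): the starting CRC comes second and the length is implied by the data. A minimal sketch of that mapping, using made-up sample bytes (seq1 and seq2 are illustrative names, not from the question):

```python
import zlib

# Illustrative inputs; any two byte strings would do.
seq1 = b"first chunk, "
seq2 = b"second chunk"

# Equivalent of crc32(crc32(0, seq1, len1), seq2, len2) in the C API:
# feed the CRC of seq1 in as the starting value for seq2.
running = zlib.crc32(seq1) & 0xffffffff
running = zlib.crc32(seq2, running) & 0xffffffff

# CRC of the concatenation computed in one call, for comparison.
one_shot = zlib.crc32(seq1 + seq2) & 0xffffffff
```

The mask is only needed on Python 2, where zlib.crc32() can return a negative number; on Python 3 it is harmless.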

Given that, the following identity holds:

crc32(crc32(0, seq1, len1), seq2, len2) == crc32_combine(crc32(0, seq1, len1), crc32(0, seq2, len2), len2)

Note that crc32_combine() needs to know the length of the second sequence as well as the two CRC-32 values in order to combine them.
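Python's zlib module does not expose crc32_combine(), so the identity above has to be implemented by hand or reached through a C binding. The following is a sketch only: a pure-Python port modeled on the GF(2) matrix method that zlib has used for crc32_combine(). The magic() wrapper is a hypothetical helper named after the one in the question, and it assumes an extra length argument, since the length of new_data is required.

```python
import zlib

CRC32_POLY = 0xEDB88320  # reflected CRC-32 polynomial, as used by zlib


def gf2_matrix_times(mat, vec):
    """Multiply a GF(2) matrix (list of 32 column words) by a 32-bit vector."""
    total = 0
    col = 0
    while vec:
        if vec & 1:
            total ^= mat[col]
        vec >>= 1
        col += 1
    return total


def gf2_matrix_square(mat):
    """Return mat * mat over GF(2)."""
    return [gf2_matrix_times(mat, column) for column in mat]


def crc32_combine(crc1, crc2, len2):
    """CRC-32 of seq1 + seq2, given crc32(seq1), crc32(seq2) and len(seq2)."""
    if len2 <= 0:
        return crc1
    # Operator that advances the CRC by one zero bit.
    mat = [CRC32_POLY] + [1 << n for n in range(31)]
    mat = gf2_matrix_square(mat)  # two zero bits
    mat = gf2_matrix_square(mat)  # four zero bits
    # Append len2 zero bytes to crc1, one binary digit of len2 at a time.
    while len2:
        mat = gf2_matrix_square(mat)  # 1, 2, 4, ... zero bytes
        if len2 & 1:
            crc1 = gf2_matrix_times(mat, crc1)
        len2 >>= 1
    return crc1 ^ crc2


def magic(cached, stream_crc, length):
    """Hypothetical helper: same result as crc32(new_data, stream_crc)."""
    return crc32_combine(stream_crc, cached, length)


new_data = b"some new input"  # stand-in for get_some_input()
cached = zlib.crc32(new_data) & 0xffffffff
stream_crc = zlib.crc32(b"earlier stream contents") & 0xffffffff

combined = magic(cached, stream_crc, len(new_data))
direct = zlib.crc32(new_data, stream_crc) & 0xffffffff
```

The matrix built inside crc32_combine() depends only on len2, so with many streams sharing the same new_data it could be computed once and reused. Whether this beats simply calling crc32() N times depends on the size of new_data and would need to be measured.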
