E.g. how can it tell that a 4GB text file can be compressed to, say, 200MB? Obviously, it doesn't read all of the contents in 2 or so seconds... so what kind of predictive algorithm(s) does it use?
Answer 1:
They use a variant of prediction by partial matching (PPM) called PPMd. The Wikipedia article on PPM covers the details.
Answer 2:
Ideally, a symbol that occurs with probability p takes -log(p)/log(2) bits to encode, i.e. -log2(p) bits. This is a theoretical value, though, and the result depends heavily on the data you want to compress. For your data, count how often each character occurs, turn the counts into probabilities, and plug them into the formula. Try it with only 3 distinct characters first. The term to look up is Shannon coding.
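To make the formula concrete, here is a minimal sketch in Python that counts character frequencies and sums -log2(p) over the whole input. It is only an order-0 estimate (each character is treated independently of its neighbours), so it is not what PPMd or any particular archiver actually does, and a context-based compressor will usually beat this figure on real text. The function name shannon_estimate_bits is made up for this illustration.

```python
import math
from collections import Counter


def shannon_estimate_bits(text: str) -> float:
    """Estimate the encoded size of `text` in bits under an order-0 model.

    Each character with relative frequency p contributes -log2(p) bits
    every time it occurs. Hypothetical helper, for illustration only.
    """
    if not text:
        return 0.0
    counts = Counter(text)   # character -> number of occurrences
    total = len(text)
    bits = 0.0
    for count in counts.values():
        p = count / total                  # probability of this character
        bits += count * -math.log2(p)      # cost of all its occurrences
    return bits


# Three distinct characters, as suggested above:
# 'a' has p = 0.5 (1 bit each), 'b' and 'c' have p = 0.25 (2 bits each).
sample = "aabc"
print(shannon_estimate_bits(sample))       # 6.0 bits, versus 32 bits as 8-bit chars
```

Dividing the result by 8 gives an estimate in bytes. Because only frequencies are needed, an estimate like this can also be computed from a sample of a large file rather than from the whole thing, at the cost of accuracy.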