Hashing n-grams by cyclic polynomials - java imple

2019-04-13 09:38发布

问题:

I'm solving some problem that involves Rabin–Karp string search algorithm. This algorithm requires rolling hash to be faster then naive search. This article describes how to implement rolling hash. I implemented "Rabin-Karp rolling hash" without problems and found few implementations implementations, but article also mentions computational complexity and that hashing n-grams by cyclic polynomials is prefered. It links to BuzHash implementation of such technique but I wonder how it can be used to build n-gram hash on top of it. I want to have something like this or

CPHash cp = new CPHash("efghijk");
cp.shiftRight('l') // now we got hash of "fghijki"
cp.shiftLeft('d') // "defghi"

for java.

For people who will encounter problems related with string search (like me) there are some articles that I found usefull: 1, 2, 3

回答1:

I recently published an Apache licensed Java library which implements several rolling hash functions including Cyclic and Rabin-Karp:

http://code.google.com/p/rollinghashjava/

https://github.com/lemire/rollinghashjava