How to choose the modulus value in the Rabin-Karp algorithm?
I have a question about choosing q and d in the Rabin-Karp algorithm for string search. The algorithm uses the values q as a modulus and d as a hash function.
If I choose q as the amount of power 2 and d = q-1 or d = q + 1
-
How do these options affect false hits?
-
What templates will be considered as false hits in the case d = q + 1 and in the case d = q-1?
source to share
X mod (2 ^ n +1) can be described as a variable sum / difference, for example.
X = 0x11223344, n = 8, X mod 257 == 0x44 - 0x33 + 0x22 - 0x11 mod 257. The problem here is that any duplicate email will cancel itself - not very practical - of course n shouldn't be 8.
X mod (2 ^ n -1) will be the sum of all n bit patterns, for example.
X = 0x11223344, n = 8, X mod 255 == (0x44 + 0x33 + 0x22 + 0x11) mod 255
The problem is that stripping off bytes: Hash ('ab') = Hash ('ba').
source to share