hash function

Dan Bernstein brnstnd at kramden.acf.nyu.edu
Tue Jun 25 13:57:34 AEST 1991


In article <15027 at ulysses.att.com> kpv at ulysses.att.com (Phong Vo[drew]) writes:
> A good property for a hash function is if you keep feeding it
> a constant character, it performs as if it's a random number generator.

No, I don't think so. This is commonly true of good hash functions and
commonly false of bad ones, but keep in mind that pseudo-random
sequences have to keep working 100, 1000, 1000000 elements later, while
simple string hash functions are rarely applied to strings with more
than 80 characters.

  [ as a generator mod 2^n, 37 is better than 33 is better than 31 ]

Again, I don't think this is accurate. Each number has a sufficiently
long period mod 2^n that you don't have to worry about repeats.

> We typically hash byte strings and we want
> the bits to mix in the 16-bit or 32-bit registers relatively fast.
  [ so use a multipler of 261 ]

Again, practically any good multiplier works. I think you're worrying
about the fact that 31c + d doesn't cover any reasonable range of hash
values if c and d are between 0 and 255. That's why, when I discovered
the 33 hash function and started using it in my compressors, I started
with a hash value of 5381. I think you'll find that this does just as
well as a 261 multiplier.

---Dan



More information about the Comp.lang.c mailing list