hash algorithm

Dave Brower daveb at llama.rtech.UUCP
Tue Aug 23 06:10:08 AEST 1988


In article <4712 at b-tech.UUCP> zeeff at b-tech.UUCP (Jon Zeeff) writes:
>In article <2550078 at hpisod2.HP.COM> decot at hpisod2.HP.COM (Dave Decot) writes:
>>> could anyone recommend a hashing algorithm to store a list of 10
>>> digit phone numbers (area code plus 7 digits).
>>Treat the ten digits as an integer, n, and compute the index
>>as (n % HASH_SIZE), where HASH_SIZE is the size of your hash table.
>
>I don't second Dave's recommendation.  What seems like a fairly random
>hash function usually doesn't work very well when you actually measure
>it's performance.  Here is one stolen from pathalias that works well.
>
...
>static long
>hash(name, size)
>register char *name;
>register int size;
>{
>	register long sum = 0L;
>
>	while (size--) {
>		sum = (sum >> 7) ^ CrcTable[(sum ^ (*name++)) & 0x7f];
>	}
>	DEBUG("hash: returns (%ld)\n", sum % INDEX_SIZE);
>	return(sum % INDEX_SIZE);
>}
>

The konstant factors add up.  It is often the case that a better hash
isn't worth the effort.  The expression:

	sum = (sum >> 7) ^ CrcTable[(sum ^ (*name++)) & 0x7f];

take many more instructions to figure than the naive:

	sum += *name++;

If the collision chains generated by the naive function are short, it
may be cheaper to search the chain than use a smarter hash.

What I found in some real application, with about 3000 entries in the
table, was that naive beat smart in real execution time.   The naive
collision chains were typically 4-7 entries long.  I could not reduce
the time spent in the smart hash function enough to make it worth using.

The wall was high enough to get me to try to use splay trees instead of
hashing.  When it was all over, my actual cpu figures were the same for
tree and hash, giving ordering in the tree for free.

Moral:  Try several ways and *measure*.  The results may be surprising.

I posted the tree library in comp.source.unix Archive-name v15i005. 
Thanks to Doug Jones at uiowa for his original Pascal implementation.

-dB
"Ready when you are Raoul!"
{amdahl, cpsc6a, mtxinu, sun, hoptoad}!rtech!daveb daveb at rtech.com <- FINALLY!



More information about the Comp.lang.c mailing list