hash function for mac needed

Rich Salz rsalz at bbn.com
Sat Jun 22 03:35:04 AEST 1991


In <1991Jun19.161959.13677 at zoo.toronto.edu> henry at zoo.toronto.edu (Henry Spencer) writes:
|In fact, when I originally did the hashed newsgroup-lookup code in C News
|expire, a quick investigation did not reveal any hashing function which
|had overall performance much better than simply using the length of the
|string!  (The newsgroup list has grown so much that this decision is in
|need of revision, mind you.)

I think you'll want to re-check.  The distributions are real uneven -- there
are more than 40 newsgroups of length 10, for example, so you have to resolve
lots of conflicts, and if you don't put a secondary hash, that means lots of
strcmp's.  Of course, the design is very dependant on the newsgroup that a
site receives and the distribution of articles within those sites.

I use Chris Torek's hash function (amazingly good distribution and amazingly
cheap to calculate) with a table size of 128 and sort the conflicts based on
the highest article number.  It seems to give the biggest win for the biggest
number of systems.
	/r$
-- 
Please send comp.sources.unix-related mail to rsalz at uunet.uu.net.
Use a domain-based address or give alternate paths, or you may lose out.



More information about the Comp.lang.c mailing list