Perfect HASH functions.....

T. William Wells bill at twwells.com
Tue Sep 26 23:33:39 AEST 1989


In article <1989Sep24.214153.8867 at rpi.edu> flynn at acm.rpi.edu (Kevin Lincoln Flynn) writes:
: As long as people are talking hashing, can anyone think of a GOOD way to hash
: 16-bit numbers determined dynamically at runtime into a fast way to look them
: up?  Specifically, I'm talking about the attribute dictionary of an object in
: an OO system we're working on here -- the tags of attributes are 16-bit handles
: about which nothing is known at runtime.  Someone suggested closed hashing,
: with the number of buckets growing by roughly doubling primes....  ie start
: with two buckets, then 3, then 7, then 17, etc.....  any ideas out there?
: (oh yeah, the hash function as well as the strategy'd by nice.  [ :) ])

Here's my invention, which is not quite a hash table, but will do
quite as well: create a binary tree in which the numbers are used to
control the descent of the tree. In other words, if you had three
numbers (of four bits): 0100, 0110, 1000, your tree would look like
this:

		      *
		     / \
		    *  1000
		   / \
		0100 0110

To find 0110, look at the left bit, a zero, which means go left; the
next bit, a 1 means go right, and then, since we are at a leaf, we
are done. If the item is not in the tree, one runs into a leaf that
is not equal to the item or a null leaf pointer. Both insertion and
deletion should be fairly obvious. The times for search, insert, and
delete are all bounded and are linear in the maximum depth of the
tree.

There are no collisions and it is very fast to search. In fact, the
only drawback I know of is that you really should dynamically allocate
the node structures, unless you are willing to accept a difficult to
compute upper bound on the number of things you can store in the tree.
You can, however, add additional information to the nodes and keep the
number of nodes to no more than 50%, more or less, of the number of
leaves.

There are, of course, many variations on this. The one I invented it
with generates hash code for strings, uses the hash code to control
the tree traversal, and stores the items in small buckets at the
leaves.

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill at twwells.com



More information about the Comp.lang.c mailing list