Hash functions in C

Ozan Yigit oz at yunexus.yorku.ca
Tue Nov 13 04:26:28 AEST 1990


In article <14461 at goofy.megatest.UUCP> djones at megatest.UUCP 
[Dave Jones] writes:

>I just now tested some hash-functions for an application which is
>probably more typical, namely a compiler. The simplest function is
>still a big winner on all counts.

Dave, you may also wish to take a look at the following.

%A B. J. McKenzie
%A R. Harries
%A T. Bell
%T Selecting a Hashing Algorithm
%J Software - Practice and Experience
%V 20
%N 2
%D Feb 1990
%P 209-224

This interesting article evaluates some of the more commonly known hash
functions from a few sources: Amsterdam Compiler Kit, Eth Modula-2 Cross
Compiler, Gnu-cpp, Gnu-cc1, pcc (4.3bsd), C++ (AT&T) and Icon. 

Their conclusions suggest that the algorithms of the style

	h(0) = 0; h(i) = k * h(i-1) + ch(i)	[for 1 <= i <= n]
	hash = h(n) MOD N

perform well provided k (a constant multiplier) and N (the table size) are
chosen well. They feel that most implementation of hash functions contain
far-from-optimal choices [so, what else is new?]. They also state:

	The experiments have shown that very small variations in N can
	produce large variations in the efficiency of the hash-table
	look-up, and that the popular view, that choice of a prime
	number will automatically ensure a good result, is not well
	founded.
								[pp 224]

They found that one of the simplest hashing functions (gnu-cpp) performed
well, both in speed and number of probes: [included verbatim from gnu-cpp.
k = 4, N = 1403 odd but not prime]
	
	#define HASHSIZE 1403		
	#define HASHSTEP(old, c) ((old << 2) + c)
	#define MAKE_POS(v) (v & ~0x80000000) /* make number positive */
	
	hashf (name, len, hashsize)
	     register U_CHAR *name;
	     register int len;
	     int hashsize;
	{
	  register int r = 0;
	
	  while (len--)
	    r = HASHSTEP (r, *name++);
	
	  return MAKE_POS (r) % hashsize;
	}


[ps: I have not yet tested this function, so I do not know if their results
are easily reproducible. For example, would making r unsigned, and avoiding
MAKE_POS crock slow this function down?]

enjoy..		oz
---
First learn your horn and all the theory.	Internet: oz at nexus.yorku.ca
Next develop a style. Then forget all that	uucp: utzoo/utai!yunexus!oz
and just play.		Charlie Parker [?]	York U. CCS: (416) 736 5257



More information about the Comp.lang.c mailing list