hash algorithm

Jon Zeeff zeeff at b-tech.UUCP
Fri Aug 19 10:00:26 AEST 1988


In article <2550078 at hpisod2.HP.COM> decot at hpisod2.HP.COM (Dave Decot) writes:
>> could anyone recommend a hashing algorithm to store a list of 10
>> digit phone numbers (area code plus 7 digits).
>Treat the ten digits as an integer, n, and compute the index
>as (n % HASH_SIZE), where HASH_SIZE is the size of your hash table.
>

I don't second Dave's recommendation.  What seems like a fairly random 
hash function usually doesn't work very well when you actually measure 
it's performance.  Here is one stolen from pathalias that works well.  

/* This is a simplified version of the pathalias hashing function.
 * Thanks to Steve Belovin and Peter Honeyman
 *
 * hash a string into a long int.  31 bit crc (from andrew appel).
 * the crc table is computed at run time by crcinit() -- we could
 * precompute, but it takes 1 clock tick on a 750.
 *
 * This fast table calculation works only if POLY is a prime polynomial
 * in the field of integers modulo 2.  Since the coefficients of a
 * 32-bit polynomial won't fit in a 32-bit word, the high-order bit is
 * implicit.  IT MUST ALSO BE THE CASE that the coefficients of orders
 * 31 down to 25 are zero.  Happily, we have candidates, from
 * E. J.  Watson, "Primitive Polynomials (Mod 2)", Math. Comp. 16 (1962):
 *	x^32 + x^7 + x^5 + x^3 + x^2 + x^1 + x^0
 *	x^31 + x^3 + x^0
 *
 * We reverse the bits to get:
 *	111101010000000000000000000000001 but drop the last 1
 *         f   5   0   0   0   0   0   0
 *	010010000000000000000000000000001 ditto, for 31-bit crc
 *	   4   8   0   0   0   0   0   0
 */

#define POLY 0x48000000L	/* 31-bit polynomial (avoids sign problems) */

static long CrcTable[128];

static void
crcinit()
{	register int i, j;
	register long sum;

	for (i = 0; i < 128; ++i) {
		sum = 0L;
		for (j = 7 - 1; j >= 0; --j)
			if (i & (1 << j))
				sum ^= POLY >> j;
		CrcTable[i] = sum;
	}
	DEBUG("crcinit: done\n");
}

static long
hash(name, size)
register char *name;
register int size;
{
	register long sum = 0L;

	while (size--) {
		sum = (sum >> 7) ^ CrcTable[(sum ^ (*name++)) & 0x7f];
	}
	DEBUG("hash: returns (%ld)\n", sum % INDEX_SIZE);
	return(sum % INDEX_SIZE);
}

-- 
Jon Zeeff           		Branch Technology,
uunet!umix!b-tech!zeeff  	zeeff%b-tech.uucp at umix.cc.umich.edu



More information about the Comp.lang.c mailing list