Hash functions on pointers

Conor O'Neill conor at lion.inmos.co.uk
Thu Jan 24 22:15:48 AEST 1991


I am curious as to whether it is possible to write a _portable_ _ANSI C_
hash function from pointers to integers.
In this sense, portable means that it should work and be vaguely efficient
on different machines; I do not mean that a given pointer should map to
the same integer on different machines (indeed, since pointer representations
may be different, this would be impossible).

What I require is a function which maps a pointer onto a number between
0 and N-1. The function may be (indeed, would be expected to be) many-to-one.

Obviously the vacuous
int hash (void *ptr) { return 0; }
isn't really good enough, even though it could be considered a hash function.

The problem is that the effect of casting a pointer to an integer is
implementation defined; thus an implementation could, for example, always
return 99.

ANSI standard, section 3.3.4:
``A pointer may be converted to an integral type. The size of integer required
and the result are implementation-defined. If the space provided is not
long enough, the behaviour is undefined''

This seems to indicate that there must exist an integral type (which may be a
different type for each implementation) to which a pointer may be converted.
(This does not imply that information may not be lost).
Is there any way of determining what this type is?
Do the other parts of the standard indicate that long int
would always be enough? Or maybe unsigned long int?
Neither seems clear to me.
Ideally there would have been defined something like ptrint_t,
by analogy with size_t, which would be an integral type which is `big enough'.
Presumably `integral type' means a standard type; is it permissible
to require an extended type such as long long int if that were supported?

The standard continues:
``An arbitrary integer may be converted to a pointer. The result is
implementation-defined. (Footnote 42).''

This seems to be less relevant for the suggested hash function, except
for the footnote (which does not constitute part of the standard):

``Footnote 42: The mapping functions for converting a pointer to an integer
or an integer to a pointer are intended to be consistent with the addressing
structure of the execution environment.''

This indicates that a `quality of implementation' factor is that
there is some sensible relationship between the `value' of the pointer
and the integral value; otherwise an implementation could satisfy the
standard by always returning 99 when casting a pointer to an integer.

Given all the above, is the following hash function legal and portable?
(I.e. will it have `undefined' effect in any implementation?)
(Assume that N is less than INT_MAX).

int hash (void *ptr)
{
  unsigned long int x = (unsigned long int)ptr;
  return x % N;
}

---
Conor O'Neill, Software Group, INMOS Ltd., UK.
UK: conor at inmos.co.uk		US: conor at inmos.com
"It's state-of-the-art" "But it doesn't work!" "That is the state-of-the-art".



More information about the Comp.std.c mailing list