Hash functions on pointers

Norman Diamond diamond at jit345.swstokyo.dec.com
Fri Jan 25 13:08:55 AEST 1991


In article <13983 at ganymede.inmos.co.uk> conor at inmos.co.uk () writes:
>I am curious as to whether it is possible to write a _portable_ _ANSI C_
>hash function from pointers to integers.

Strictly conforming, I am not sure.  Portable to every imaginable
implementation, yes.

In response to a different topic, it is not entirely certain whether a
pointer to an arbitrary object, cast to type (char *), actually points to
the first byte of the object and can be dereferenced, or not.  However, in
every imaginable implementation, of course it can be.

So, suppose your pointer is
  sometype *p;
Do a
  char *p_p;
  for (p_p = (char *) &p; p_p < (char *) &p + sizeof p; p ++) {
    some computation on *p_p, one of the bytes in p;
  }

At first glance, the ampersands in the for statement look like a mistake,
but of course they aren't.

>The problem is that the effect of casting a pointer to an integer is
>implementation defined; thus an implementation could, for example, always
>return 99.

Yes.  Or 0  :-)

>ANSI standard, section 3.3.4:
>``A pointer may be converted to an integral type. The size of integer required
>and the result are implementation-defined. If the space provided is not
>long enough, the behaviour is undefined''
>This seems to indicate that there must exist an integral type (which may be a
>different type for each implementation) to which a pointer may be converted.
>(This does not imply that information may not be lost).

Yes, this wording bugs me too.  If "the space provided" means the space
provided by the implementation-defined size of integer, then what is the
meaning of the implementation defining that size?  If "the space provided"
means the space provided by the user, then why even bother to include the
third sentence; the first two sentences do not state that a conversion to
the wrong size can be done at all.

>Is there any way of determining what this type is?

When it's implementation-defined, the implementation has to agree to tell
you -- in a manual, maybe a header file, maybe a telephone number, etc.

>Do the other parts of the standard indicate that long int
>would always be enough? Or maybe unsigned long int?

No.

>Ideally there would have been defined something like ptrint_t,
>by analogy with size_t, which would be an integral type which is `big enough'.

Of course.

>Presumably `integral type' means a standard type; is it permissible
>to require an extended type such as long long int if that were supported?

Presumptions don't go very far.  Good luck.

>... a `quality of implementation' factor is that
>there is some sensible relationship between the `value' of the pointer
>and the integral value; otherwise an implementation could satisfy the
>standard by always returning 99 when casting a pointer to an integer.

Of course.

>Given all the above, is the following hash function legal and portable?
>(Assume that N is less than INT_MAX).
>int hash (void *ptr)
>{
>  unsigned long int x = (unsigned long int)ptr;
>  return x % N;
>}

Whether it's legal or not depends on interpreting the standard.
Portable, probably, though I think my suggestion was more portable.

>(I.e. will it have `undefined' effect in any implementation?)

Will, probably not.  Allowed to, yes.
--
Norman Diamond       diamond at tkov50.enet.dec.com
If this were the company's opinion, I wouldn't be allowed to post it.



More information about the Comp.std.c mailing list