Sizes, alignments, and maxima (was: Contiguous Arrays)

Steve Summit scs at adam.pika.mit.edu
Sun Feb 26 13:37:24 AEST 1989


In article <1839 at valhalla.ee.rochester.edu> badri at valhalla.ee.rochester.edu (Badri Lokanathan) writes:
>In article <340009 at hplvli.HP.COM>, boyne at hplvli.HP.COM (Art Boyne) writes:
>> Henry is correct: pointer arithmetic isn't guaranteed unless you keep it
>> *within* an array.
>On a slightly different note and out of curiosity, I tried the
>following experiment on a sun 3 running OS3.4:
>
[Example demonstrating correct wraparound behavior of unsigned
arithmetic with underflow.]
>
>Thus even though the intermediate value was rubbish (-100), it still
>worked correctly.
>Similarly with the pointer problem, while everybody has
>said that a problem *might* occur, is there a machine where
>failure will definitely occur?

Unsigned arithmetic is guaranteed to be modulo 2**n in the
presence of overflow or underflow (where n is of course the word
size in bits).  The same cannot be said for pointer arithmetic.
For many machines, pointer arithmetic is equivalent to unsigned
arithmetic (uses the same instructions and registers), but this
is not required by the language.

Yes, Virginia, there are machines out there with baroque memory
architectures ("baroque" if you're used to single contiguous
linear address spaces, as most of us are; the more complicated
architectures may have compensating advantages) for which pointer
arithmetic is anything but simple, but instead uses special
registers and/or instructions.  The canonical example is the
80n86 for n>=2 and when operating in certain memory management
modes.  I won't get into a full-blown description of the 8086
family's segmented architecture here; suffice it to say that,
although correct C programs can be made to run under it, it can
be a real mess.

For the purposes of the current discussion, pointer arithmetic
involving intermediate results which overflow or underflow can
and does result in processor traps.  You might ask why "regular"
unsigned arithmetic can't still be used for pointers on such
machines.  The answer is that since they don't have single linear
contiguous address spaces, pointers aren't simple numbers, but
instead (in the case of an 80x86 in other than "small" model) a
segment,offset pair.  Pointer arithmetic typically only operates
on the offset portion, which works as long as the pointer stays
within the same segment, but fails if the offset overflows, since
no carry into the segment portion normally takes place.  (Some
compilers can perform "huge model" addressing, generating
laborious code for each pointer arithmetic operation, to perform
the carry manually.)

If this sort of thing shocks you, you are not alone.  Many people
find that the warts and kludges required to "support" segmented
architectures demolish any of the purported benefits that Intel
marketing literature would lead us to believe are to be derived
from this "feature."  To paraphrase Douglas Adams, this is a very
respectable view, widely held by right-thinking people, who are
largely recognizable as being right-thinking people by the mere
fact that they hold this view.

Several people have recently argued that the use of NULL as a
preprocessor macro is at the root of much needless confusion, and
should therefore be stamped out.  Can we make a similar argument
about segmented architectures and stamp them out too :-) ?  It
occurs to me that segmented architectures are probably more
appropriate for earthworms than computers (or has some other wag
made this suggestion already?).

Followups to comp.arch.intel.flame.

                                            Steve Summit
                                            scs at adam.pika.mit.edu


