Portability vs. Endianness

Dan Bernstein brnstnd at kramden.acf.nyu.edu
Wed Mar 13 14:42:05 AEST 1991


In article <1991Mar12.105451.19488 at dit.upm.es> esink at turia.dit.upm.es () writes:
> long var;
> unsigned char Bytes[4];
> Is there a portable way to move the value held in var
> into the memory space pointed to by Bytes, with the restriction
> that the representation be in Most Significant Byte first
> format ?

Portable questions may have unportable answers, but unportable questions
can never have portable answers. In other words, ``Most Significant Byte
first format (of longs in a 4-byte field)'' is not a portable concept,
so there is no way that portable code can implement the format.

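Now what you probably mean is that each of Bytes[0] through Bytes[3]
should hold a fixed 8 bits of var. With Bytes[0] holding the low 8 bits,
Bytes[1] the next 8 bits, etc. (reverse the indices to put the most
significant byte first), that is literally: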

  Bytes[0] = ((unsigned long) var) & 255;
  Bytes[1] = (((unsigned long) var) >> 8) & 255;
  Bytes[2] = (((unsigned long) var) >> 16) & 255;
  Bytes[3] = (((unsigned long) var) >> 24) & 255;

(You should use unsigned to keep the sign bit out of the way.) If
(unsigned long) var is between 0 and 2^32 - 1, then you can recover var
as

  var = (long) (((((((unsigned long) Bytes[3]) * 256
          + Bytes[2]) * 256) + Bytes[1]) * 256) + Bytes[0]);

I think this is machine-independent. But there are no guarantees that
long has 32 bits, or that char has 8 bits.
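
As a rough sketch of the shift-and-mask approach in both index orders,
the following round trip may help. The function names are hypothetical
(they are not from the post), and the load functions only recover var
when (unsigned long) var fits in 32 bits and the result fits in a long.

```c
#include <assert.h>

/* Hypothetical helper names, chosen only for this illustration. */

/* Low 8 bits in b[0], next 8 bits in b[1], and so on. */
static void store_lsb_first(unsigned char b[4], long var)
{
    unsigned long u = (unsigned long) var;  /* keep the sign bit out of the way */
    b[0] = (unsigned char) (u & 255);
    b[1] = (unsigned char) ((u >> 8) & 255);
    b[2] = (unsigned char) ((u >> 16) & 255);
    b[3] = (unsigned char) ((u >> 24) & 255);
}

/* Same 8-bit pieces with the indices reversed: high 8 bits in b[0]. */
static void store_msb_first(unsigned char b[4], long var)
{
    unsigned long u = (unsigned long) var;
    b[0] = (unsigned char) ((u >> 24) & 255);
    b[1] = (unsigned char) ((u >> 16) & 255);
    b[2] = (unsigned char) ((u >> 8) & 255);
    b[3] = (unsigned char) (u & 255);
}

/* Rebuild the value from bytes written by store_lsb_first(). */
static long load_lsb_first(const unsigned char b[4])
{
    unsigned long u = (unsigned long) b[3];
    u = u * 256 + b[2];
    u = u * 256 + b[1];
    u = u * 256 + b[0];
    return (long) u;
}

/* Rebuild the value from bytes written by store_msb_first(). */
static long load_msb_first(const unsigned char b[4])
{
    unsigned long u = (unsigned long) b[0];
    u = u * 256 + b[1];
    u = u * 256 + b[2];
    u = u * 256 + b[3];
    return (long) u;
}
```

Neither pair looks at how the machine lays out a long in memory; only
arithmetic on the value is used, which is why the byte order of the
host does not matter here.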

In practice, if you can assume that longs are 32 bits and chars are 8
bits, you don't want to write code like the above to do simple byte
copies. Some people will suggest #ifdefs. If you want to avoid the
#ifdefs, you might find an alternate strategy useful; I quote from my
snuffle.c:

  WORD32 tr[16] =
   {
    50462976, 117835012, 185207048, 252579084, 319951120, 387323156,
    454695192, 522067228, 589439264, 656811300, 724183336, 791555372,
    858927408, 926299444, 993671480, 1061043516
   } ;
  ...
  register unsigned char *ctr = (unsigned char *) &tr[0];
  ...
  x[n] = h[n] + m[ctr[n & 31]];

Here WORD32 is a 32-bit type, and m is a char (actually unsigned char)
pointer, really pointing to an array of WORD32s. m[ctr[0]] will on most
32-bit architectures be the low 8 bits of the first WORD32; m[ctr[1]]
will be the next 8 bits; and so on. This works because every common byte
order---1234, 4321, 2143, 3412---is a permutation of 1234 that is its
own inverse: looking a byte up through ctr applies the permutation twice.
(I suppose it would be clearer to write the tr[] values in hex, but the
magic numbers are more fun.)
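
A minimal, self-contained sketch of the same table trick follows, with
the first eight tr[] values written in hex (enough for the 32 bytes that
n & 31 can reach). It assumes C99's uint32_t in place of WORD32, and
nth_low_byte is a hypothetical name, not something from snuffle.c.

```c
#include <assert.h>
#include <stdint.h>

/* First eight entries of tr[] from the post, in hex. Word j holds the
   values 4j, 4j+1, 4j+2, 4j+3 from its low byte to its high byte, so
   the byte found at memory offset p records which byte (counting from
   the low end) the machine keeps at that offset. */
static const uint32_t tr[8] = {
    0x03020100, 0x07060504, 0x0b0a0908, 0x0f0e0d0c,
    0x13121110, 0x17161514, 0x1b1a1918, 0x1f1e1d1c
};

/* Hypothetical helper: treat words[] as a run of bytes numbered from
   the low byte of words[0] upward, and fetch byte n (0..31) through
   the table. Character pointers may alias any object, so the casts
   are well defined. */
static unsigned char nth_low_byte(const uint32_t *words, unsigned n)
{
    const unsigned char *ctr = (const unsigned char *) tr;
    const unsigned char *m = (const unsigned char *) words;
    return m[ctr[n & 31]];
}
```

For example, with words[0] equal to 0x11223344, nth_low_byte(words, 0)
yields 0x44 and nth_low_byte(words, 3) yields 0x11 on any byte order
that is its own inverse, which covers all four orders listed above.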

---Dan


