Byte order (retitled & rehashed)

Root Boy Jim rbj at icst-cmr
Wed Apr 9 09:15:50 AEST 1986


	>	The only reason you got -28672 for BIG instead of nulls is
	>	because your machine has backwards byte order.
	>
	>Sorry Bill, *you're* the one that's got backwards byte order. Little
	>Endian is `correct', even tho bucking historical convention.
	>
	>My reasoning is this: The original byte ordering was done the obvious
	>way, Big Endian. If this was so perfect, why would a sane man design
	>anything Little Endian? For compelling mathematical reasons!
	>You wouldn't number your bits backwards (within a register) would you?
	>Admittedly, some people do, but they must not know any better.
	>
		Well, no, little-endian came about because the engineers at DEC
	who designed the PDP-11 made an arbitrary decision that was not well
	thought out.  I will not essay to defend the sanity of DEC engineers,
	and cannot recommend that any one else do so (:-)).  It was a bad
	decision.

Like I said, `Compelling Mathematical Reasons'. Suppose you were to write
an infinite precision multiply routine. Each char holds one decimal digit,
stored least significant digit first. The routine is called as
`mult(x,sx,y,sy)', where sx and sy are the digit counts. The code becomes:

	char *calloc();

	char *mult(x,sx,y,sy) register char *x,*y; int sx,sy;
	{	register char *p; register int j,k;
		p = calloc(sx + sy,1);		/* result digits, zeroed */
		for (j = 0; j < sx; j++)
			for (k = 0; k < sy; k++)
				if ((p[j + k] += x[j] * y[k]) >= 10)
					p[j + k + 1] += p[j + k] / 10,
					p[j + k    ] %= 10;
		return(p);
	}
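
As a quick sanity check, here is a throwaway driver for the routine above
(the harness is mine, not part of the original argument). Digits go in
least significant first, so 12 * 34 comes back as the four digits of 408:

	#include <stdio.h>

	int main()
	{	char x[2], y[2], *p; int i;
		x[0] = 2; x[1] = 1;		/* 12, least significant digit first */
		y[0] = 4; y[1] = 3;		/* 34 */
		p = mult(x, 2, y, 2);		/* sx + sy = 4 result digits */
		for (i = 3; i >= 0; i--)	/* print most significant digit first */
			putchar('0' + p[i]);
		putchar('\n');			/* prints 0408 */
		return 0;
	}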

If you did it Big Endian, you would have to go to the end of the `string'
first and do it backwards with all kinds of god-awful subtraxion expressions.
There are other examples.
	
		Consider the following problem.  You have an array of 4 byte
	integers.  If you sort the array numerically you get one result.  If
	you regard the bytes as characters and sort them lexicographically on
	a little endian machine you get a different result.  The reason is that
	the most significant byte occupies the eight least significant bits.
	Consistency of significance requires that the direction of significance
	be the same for both bytes and bits.
	
No. You sort character by character. Character Strings are defined as
Big Endian. Don't mix apples and oranges.
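
To see the disagreement concretely, here is a small sketch (mine) that sorts
the same array both ways. On a Little Endian machine the two orders differ;
on a Big Endian machine they agree, at least for unsigned values:

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>

	static int numcmp(const void *a, const void *b)	/* compare as numbers */
	{	unsigned x = *(const unsigned *)a, y = *(const unsigned *)b;
		return (x > y) - (x < y);
	}

	static int bytecmp(const void *a, const void *b)	/* compare as 4-char strings */
	{	return memcmp(a, b, sizeof(unsigned));
	}

	int main(void)
	{	unsigned v[3] = {256, 2, 1};	/* assumes 4-byte unsigned */
		int i;
		qsort(v, 3, sizeof v[0], numcmp);	/* 1 2 256 on any machine */
		qsort(v, 3, sizeof v[0], bytecmp);	/* Little Endian: 256 1 2 */
		for (i = 0; i < 3; i++)
			printf("%u ", v[i]);
		putchar('\n');
		return 0;
	}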

	Little-endian came about from the idea of making the lsb of a
	word be bit 0.  From this it follows that byte 0 should be the lsbyte
	of the word.  This is in natural analogy with the idea that word 0
	is the l.s. word.  The object is:
	
	bit 0 is the lowest address bit
	byte 0 is the lowest address byte
	word 0 is the lowest address word

You Got It!
	
	This is the "all addressing flows in the same direction" constraint.
	If that were the only constraint either big-endian or little-endian
	would be acceptable, provided that it were followed thru consistently.
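
A two-line check of that constraint (a sketch, assuming 4-byte ints): on a
Little Endian machine the byte at the lowest address of a word is exactly its
low-order eight bits, so bytes end up numbered the same way as bits.

	#include <stdio.h>

	int main(void)
	{	unsigned w = 0x04030201;
		unsigned char *b = (unsigned char *)&w;	/* byte 0 = lowest address */
		/* Little Endian prints "01 01": byte 0 holds bits 0-7 */
		printf("%02x %02x\n", (unsigned)b[0], w & 0xff);
		return 0;
	}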

Perhaps now is a good time to address the issue of floating point. Several
machines, including the VAX, have the following formats:

		Byte 0		Bytes 1-3	Bytes 4-7
	Single:	sign+exp	3*mantissa	
	Double:	sign+exp	3*mantissa	4*mantissa

Allow me to coin the term `Middle Endian', which means that values are
stored with the bits closest to the binary point in lower memory.
This unifies the treatment of floating point and integers. Don't even
mention `BCD' or `Packed Decimal', which are more like strings than numbers.
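
If you want to see where your own machine puts the big end of a float, here
is a sketch. It assumes an 8-byte IEEE double rather than the VAX formats in
the table above, so the picture will differ, but the byte where the sign and
high exponent bits land tells you which end is which:

	#include <stdio.h>

	int main(void)
	{	double d = -1.5;	/* sign bit set, easy to spot in the dump */
		unsigned char *b = (unsigned char *)&d;
		unsigned i;
		for (i = 0; i < sizeof d; i++)	/* dump in address order */
			printf("byte %u: %02x\n", i, (unsigned)b[i]);
		return 0;
	}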

	However we have a second constraint, that comparison be valid whether
	a word is considered as a character string or as a number.  Given
	both constraints, the correct addressing assignment is:
	
	bit 0 is the most significant bit

Full Shucking Bit! This is a prime example of bogosity. I have to know
how many bits you've got in a register (or word, or whatever) to know
what bit 13 is. Bit n should represent `2 raised to the power n', not
`2 raised to the power (k - n)'. That's what happens when engineers
design hardware instead of mathematicians or computer scientists.
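
In code, lsb-is-bit-0 means a bit number stands on its own; the word width
never enters into it. A tiny illustration (the macro is mine):

	#include <stdio.h>

	/* bit n is worth 2 to the n, no matter how wide the word is */
	#define BIT(x, n)	(((x) >> (n)) & 1UL)

	int main(void)
	{	unsigned long x = 8192;		/* 2 to the 13th, i.e. bit 13 set */
		/* msb-first numbering would need ((x) >> (WIDTH-1-(n))) & 1,
		   so you must know WIDTH before `bit 13' means anything */
		printf("bit 13 = %lu\n", BIT(x, 13));
		return 0;
	}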

	byte 0 is the most significant byte.
	
	In short, little-endian was a mistake, is a mistake, and will continue
	to be a mistake.
	
By now, you have probably seen the posting on `type punning', which lets a
function written for a smaller type manipulate a larger call-by-reference
argument, as long as the value fits (unsigned) in the smaller quantity.
Why do you think DEC defined the `Jump Low Bit' instruxion? So they could
use one test on any of the LOGICAL*[124] variables. And TRUE == 1.
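
For what it's worth, here is the sort of thing that posting was about. This
is a sketch of mine; reading a long through a short pointer runs afoul of
modern aliasing rules, so take it as an illustration of the historical point
rather than advice:

	#include <stdio.h>

	static int is_odd(short *p)	/* written for the smaller type */
	{	return *p & 1;
	}

	int main(void)
	{	long big = 7;		/* small, non-negative value */
		/* On a Little Endian machine the low-order bits of `big' sit
		   at its lowest address, so the same address satisfies a
		   routine expecting a short -- one test serves any width,
		   which is the `Jump Low Bit' idea. On a Big Endian machine
		   this reads the wrong end of the long. */
		printf("%d\n", is_odd((short *)&big));
		return 0;
	}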

			Richard Harter, SMDS Inc.
	
In short, DEC's decision *was* well thought out. Like I said, if there
had been no good reason to design Little Endian, it never would have been
designed. Oftentimes, the obvious (and usually first) idea is wrong.
Consider Origin One Indexing vs Origin Zero Indexing, for example.

P.S. Does anybody remember the `Big Indian' pinball machine?
Does anybody remember pinball machines?

	(Root Boy) Jim Cottrell		<rbj at cmr>
	Baseball in D.C. in `87!


