sizeof, ptrs, lint, portability

Sun Feb 10 05:04:01 AEST 1985

> -) I am shocked to find out that pointers may be of different sizes, &
>    may be of a different size than int. This has been true for so long
>    many people just assumed it. I believe it should be true wherever possible
>    for the following reason: if a default exists, it should be a useful one.
>    Defining different sizes for these items gives credibility to the
>    claim that C is dangerous. Just another accident waiting to happen.

The existence of automobiles is also "an accident waiting to happen" (although
it didn't wait very long) in those terms.  I don't blame the automobile,
I blame the driver.  I have no interest in seeing a governor placed on
all cars that limits speed to 35MPH.  Nor do I have an interest in seeing
the requirement that all pointer types must be represented the same way
placed on the C language.  If people can't cope with machines that require
(or, at least, strongly prefer) different pointer representations, that's
their problem, not C's.

> -) Perhaps you forgot some of the attraxions of the language: a terse
>    language that modelled the machine closely. No stupid booleans.
>    All those casts are ugly. How are we to convert people to C if they
>    must put up with all that verbosity? Shouldn't the compiler know?

That's why the ANSI C standard improved the declaration syntax for functions;
yes, the compiler should know, and ANSI C compilers do know (except for
functions with a variable number of arguments; the prime offender, "execl",
is just syntactic sugar for something that can be done equally well with
"execv").
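
To illustrate the "execl"/"execv" point: with an argument vector, the terminating null pointer is an element of an array of "char *", so the compiler knows its type and no cast is needed. The following is only a sketch; "count_args", "sample_argv" and "sample_count" are made-up names, with "count_args" standing in for "execv", and nothing is actually executed.

```c
#include <stddef.h>

/* Hypothetical stand-in for execv(): walks a null-terminated argument
   vector and returns the number of arguments before the terminator. */
int count_args(char *const argv[])
{
    int n = 0;

    while (argv[n] != NULL)
        n++;
    return n;
}

/* The 0 below initializes an element of type "char *", so the compiler
   converts it to a null character pointer without any cast. */
static char *const sample_argv[] = { "ls", "-l", "/tmp", 0 };

/* Hypothetical helper: counts the arguments in the sample vector. */
int sample_count(void)
{
    return count_args(sample_argv);
}
```

The same list passed directly to a variable-argument function would need an explicit "(char *)0" as its last argument.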

> -) I apologize for calling (presumably someone's favorite) machines `weird'
>    or `braindamaged'. Let's say `less capable'. The pdp-11 was a landmark
>    in machine design. It was byte addressable, had a stack, memory mapped
>    i/o, an orthogonal instruxion set, and useful addressing modes. The
>    vax continued this trend. Most micros (all?) are byte addressable.

According to the Stanford MIPS people (see "Hardware/Software Tradeoffs
for Increased Performance" in the Proceedings of the Symposium on Architectural
Support for Programming Languages and Operating Systems, SIGARCH Computer
Architecture News V10#2 and SIGPLAN Notices V17#4), you may be better off
if you have a word-addressed machine and special pointers for accessing
bytes.  (In their case, byte and word pointers are both 32 bits long, but
coercions are still not copies.)

>    Most have an address space two to the size-of-register-in-bits power.

As has been said more times than I care to count, the 68000's registers are
32 bits long but 32-bit arithmetic is less efficient than 16-bit arithmetic.
I think that this is unfortunate, but it's a fact of life.  There are good
things to be said for 16-bit "int"s on a 68000.
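
Code that survives such a compiler simply never assumes that an "int" and a pointer are the same size. A minimal sketch, assuming an ANSI-style compiler; "alloc_ptr_table" is a hypothetical name used only for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates a table of n character pointers.
   The element size comes from sizeof(char *), not sizeof(int); on a
   compiler that picks 16-bit ints and wider pointers the two differ. */
char **alloc_ptr_table(size_t n)
{
    char **table = malloc(n * sizeof(char *));
    size_t i;

    if (table == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        table[i] = NULL;    /* a real null pointer, whatever its bits */
    return table;
}
```

Writing "n * sizeof(int)" in the call to "malloc" would happen to work on a VAX and quietly under-allocate on a machine where pointers are wider than ints.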

>    This sort of machine is an inhospitable host for the C language and
>    some implementations are downright kluges. I claim that they don't
>    run C but another language I would call `C--'.

You aren't the arbiter of the C language; if you want to hold that opinion
you're welcome to it, but I suspect most people wouldn't agree.  UNIX runs
on the Sperry 1100; if users of UNIX on that machine (or other putatively
"inhospitable" machines) have any comments on that point, I'd like to hear
them.

> -) While you are claiming that it is MY CODING PRACTICES (and evidently
>    hordes of others, including 4.2bsd & sys III implementors) that are
>    nonportable, I am claiming that it is THOSE WEIRD MACHINES that are
>    nonportable. By changing the rules in the middle of the game, you
>    are depriving me (and others) of the time honored tradition of punning.

Aside from any semantic quibbles about the meaning of "nonportable", I
object to the reference to the "time honored tradition of punning".
Lots of traditions, like self-modifying code, were "time-honored" in the
days of small slow machines which "needed" that sort of stuff.  I can
get away without punning 99.9% of the time; the other .1% of code can
be "#ifdef"ed, or written in assembly language, or...

> -) I still maintain that assigning zero to a pointer stores an unspecified
>    number of zero bits.

Maintain what you will; the *language spec*, such as it is, says no such
thing.  Your statement is merely a statement of preference, which people
are at leisure to ignore.

> The null ptr is an out-of-band value. We agreed to represent it in-bound.

Who's "we"?  On many machines, there *is* no out-of-band value.  On the
VAX, 0xffffffff is arguably an out-of-band value, while on most UNIXes
on the VAX 0x0 is an in-band value.  On other machines, there *is* an
out-of-band value, specified by the architectural spec as "how to represent
a null pointer", and it need not consist of N 0 bits.

> Still, a piece  of kernel code should be able to pick up the byte at
> address zero by:
> 		int j; char *p; p = 0; j = *p;
>    Allowing any other value to actually be stored breaks this.

However, it doesn't break

	int j; char *p; j = 0; p = j; j = *p;

Admittedly, this is slightly less efficient, but the number of times when
you execute code that is intended *only* to fetch the contents of location
0 (as opposed to code that fetches the contents of an arbitrary location;

	peek(addr)
	int addr;
	{
		return(*(char *)addr);
	}

even works if you say "j = peek(0)") is very small.
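
For comparison, here is a sketch of the same "peek" idea in later C. It assumes a C99 compiler, since "uintptr_t" and <stdint.h> come from that standard rather than from K&R, and it reads an "unsigned char" instead of a plain "char". "peek_demo" is a hypothetical helper that peeks at an address known to be valid rather than at location 0.

```c
#include <stdint.h>

/* Fetches the byte at an arbitrary address passed as an integer.
   The integer-to-pointer mapping is implementation-defined. */
int peek(uintptr_t addr)
{
    return *(const unsigned char *)(void *)addr;
}

/* Hypothetical demo: peeks at the address of a local object, which is
   guaranteed to be readable, instead of at a fixed location. */
int peek_demo(void)
{
    unsigned char c = 42;

    return peek((uintptr_t)(void *)&c);
}
```

The argument arrives as an integer value and is mapped to a pointer inside "peek", so the special treatment of the constant 0 in pointer contexts never comes into play.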

> Besides, SHOW ME A C THAT USES ANYTHING OTHER THAN ZERO ON ANY MACHINE!!!

Hello?  Anybody from the Lawrence Livermore Labs S-1 project out there?
Don't you have a special bit pattern for the null pointer?

>    K&R says of pointer to integer conversions: "The mapping funxion is
>    also machine dependent, but is intended to be unsurprising to those
>    who know the machine." I would be surprised at nonzero null ptrs.

A subtle point; given a "char *" variable "p", the statement

	p = 0;

is different in character from both the statements

	p = 1;

and the statements

	i = 0;
	p = i;

given an "int" variable "i".  Arguably, this is confusing and a mistake, but
it is the clearest (and, probably, only correct) interpretation of what
K&R says on the subject.  The latter two sets of statements do this
particular mapping; the former one is a special case which shoves a null
character pointer into "p".  The mapping function in the third set of
statements is unsurprising.  If I ran the zoo, there would have been a
special keyword "nil" or "null", and THAT would have been the way to
specify null pointers; 50% of all these discussions wouldn't have occurred
if that had been done.  Unfortunately, it's too late for that.
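
The distinction can be written out as follows. This is only a sketch with made-up function names; the cast through "intptr_t" (C99) is an assumption added because current compilers reject or warn about a bare "p = i".

```c
#include <stddef.h>
#include <stdint.h>

/* Case 1: the constant 0 in a pointer context is the null pointer,
   whatever bit pattern the machine uses for it. */
char *from_constant(void)
{
    char *p = 0;

    return p;
}

/* Case 2: an int variable that happens to hold 0 goes through the
   machine-dependent integer-to-pointer mapping instead.  The language
   does not promise that the result is a null pointer. */
char *from_variable(void)
{
    int i = 0;
    char *p = (char *)(intptr_t)i;

    return p;
}
```

On a machine whose null pointer is all zero bits the two functions return the same thing; only the first is required to return a null pointer everywhere.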

> -) Guy: if I want the square root of four, I do sqrt(4.0); NO CAST!

That's because the C language has a way of representing floating-point
constants directly.  It doesn't have a way of representing null pointers
directly; instead, it has a sneaky language rule that says the symbol
"0", when used in conjunction with a cast to a pointer or an expression
involving pointers, is interpreted as a null pointer of the appropriate
type.  If there were, say, a null pointer operator like "sizeof", like

	null(char *)

you could pass null(char *) to a routine.  Alternatively, if the language
had permitted you to declare the types of the arguments to a function
since Day 1, calling a function which expects a "char *" as an argument
would be an expression involving pointers and the 0 (or "nil" or "null")
would be interpreted as a null pointer to no character.
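
A sketch of both situations, assuming an ANSI C compiler with prototypes and <stdarg.h>; "is_null" and "count_strings" are hypothetical functions invented for the example.

```c
#include <stdarg.h>
#include <stddef.h>

/* Prototyped: the compiler knows the parameter is a "char *", so a bare
   0 in a call is converted to a null character pointer. */
int is_null(char *s)
{
    return s == NULL;
}

/* Variadic, in the style of execl(): counts "char *" arguments up to a
   terminating null pointer.  The compiler cannot know the types of the
   trailing arguments, so the caller must cast the terminator. */
int count_strings(char *first, ...)
{
    va_list ap;
    int n = 0;
    char *s;

    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}
```

A call such as is_null(0) is fine as written, while count_strings("a", "b", "c", (char *)0) needs the cast: an undecorated 0 in that position is passed as an "int", which need not have the size or representation of a null "char *".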

> -) How many of you port what percentage of programs? I thought the
>    intent of the standard was to not break existing programs. I claim
>    that the standard should recognize the existing idioms.

No, the intent of the standard is not to break existing *correct*
programs.  There exist programs, written by people at, among other places,
a certain large West Coast university, which assume that location 0
contains a null string (although that crap seems to have disappeared as
of 4.2BSD).  Does this mean that all implementations of C must map
location 0 into the address space and must put a zero byte there?

"=+" was a legal part of the language once.  It has now disappeared; the
System V compiler now only accepts "+=".  More and more programs are properly
declaring functions, casting pointers, etc.  As such, I see no point in
requiring compilers to treat an undecorated 0, passed to a function whose
argument types are undeclared, as a null pointer.

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy