sizeof, ptrs, lint, portability

John Bruner jdb at mordor.UUCP
Sun Feb 10 08:09:35 AEST 1985


> Hello?  Anybody from the Lawrence Livermore Labs S-1 project out there?
> Don't you have a special bit pattern for the null pointer?

I had prepared this reply with the intention of avoiding a reference
to the S-1 Mark IIA.  I've mentioned it several times recently and
I wondered if people are getting tired of hearing about it.  However,
since you asked, the answer is YES.  Our machine has a 36-bit word
and a 31-bit virtual address space.  The 5 high-order bits of a
pointer constitute its "tag" which specifies an assortment of
things.  Two values, 0 and 31, are invalid tags.  An attempt to
use a pointer manipulation instruction on words containing these
tags will cause a trap.  (This allows easy detection of indirection
through integers if the integer is in the range -2**31..2**31-1.)
A tag of 2 indicates a NIL pointer, which can be copied but not
dereferenced.
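
The tag layout described above can be sketched in portable C. The field widths (36-bit word, 5 high-order tag bits) and the tag values (0 and 31 invalid, 2 = NIL) come from the description; every name below is invented for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch of the S-1 tagged-pointer scheme described
 * above: a 36-bit word whose 5 high-order bits form the tag. */
#define WORD_BITS  36
#define TAG_BITS   5
#define TAG_SHIFT  (WORD_BITS - TAG_BITS)     /* 31 */
#define TAG_MASK   ((1u << TAG_BITS) - 1)     /* 0x1F */

#define TAG_INVALID_LO  0u   /* traps on pointer use */
#define TAG_NIL         2u   /* may be copied, not dereferenced */
#define TAG_INVALID_HI  31u  /* traps on pointer use */

static unsigned tag_of(uint64_t word)
{
    return (unsigned)(word >> TAG_SHIFT) & TAG_MASK;
}

static int is_dereferenceable(uint64_t word)
{
    unsigned t = tag_of(word);
    return t != TAG_INVALID_LO && t != TAG_NIL && t != TAG_INVALID_HI;
}
```

Note that a small nonnegative integer stored in such a word carries tag 0, and a small negative one (sign-extended through bit 35) carries tag 31, which is why indirection through an integer in -2**31..2**31-1 traps.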

There are two operating system projects here.  Amber, which is based
quite a bit on the MULTICS model, is written in Pastel and uses the
NIL pointer tag.  The other operating system is UNIX.  After a lot
of grief because the tide of sloppy programs was too great, we decided
to hack the microcode to allow 0-tagged pointers and use an integer
zero as our NULL pointer.  (There is a special microcode patch which
must be applied before we boot UNIX.) We all regard this as WRONG, WRONG,
WRONG, WRONG, WRONG, WRONG.  It means that C and Pastel cannot easily
share data structures, and it defeats a lot of the useful hardware
type checking.  We hope to develop a C front-end for our Pastel
compiler so that C programs which run under Amber can use the
NIL pointer properly.

(int)0 vs. (int *)0 has become a very sore point with me for this
reason.  I am firmly convinced that they are NOT the same and I
am unhappy that we had to contort our implementation to match an
assumption that is valid on simpler architectures.
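
One concrete place where the distinction still bites is a variadic call: no prototype converts the trailing arguments, so a bare 0 is passed as an int while (char *)0 is passed as a pointer. The helper below is hypothetical, but the hazard it illustrates is the standard one:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical variadic helper: counts string arguments up to a
 * null-pointer terminator.  The terminator MUST be written as
 * (char *)0, not a bare 0 -- on a machine where int and char *
 * differ in size or representation (as on the S-1), the
 * unconverted integer 0 is read back as garbage. */
static int count_args(const char *first, ...)
{
    va_list ap;
    int n = 0;
    const char *p = first;

    va_start(ap, first);
    while (p != NULL) {
        n++;
        p = va_arg(ap, const char *);
    }
    va_end(ap);
    return n;
}
```

count_args("a", "b", (char *)0) is correct everywhere; count_args("a", "b", 0) happens to work only where int and char * share a size and representation, which is exactly the assumption complained about above.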


Now, for the reply I just finished editing:

Although it is tempting to comment on the assertion that machines
which differ from a PDP-11 or a VAX are "less capable", I'm not
going to respond to that in this posting.  Instead, I'd like to
take the notion of "portability" in terms of "changing the rules
in the middle of the game" a little bit further.  Instead of
starting with C under VAX UNIX, however, I want to start with the
oldest C that I'm familiar with: the C compiler that came with
the Sixth Edition of UNIX.  (I trust that any unwarranted assumptions
that I make about C based upon the Sixth Edition can be corrected
by others who have used even earlier versions.)

Let me cite from the C Reference Manual for the Sixth Edition:

  2.3.2 Character constants [third paragraph]
  
  Character constants behave exactly like integers (not, in particular,
  like objects of character type).  In conformity with the addressing
  structure of the PDP-11, a character constant of length 1 has the code
  for the given character in the low-order byte and 0 in the high-order
  byte; a character constant of length 2 has the code for the first
  character in the low byte and that for the second character in the
  high-order byte.  Character constants with more than one character
  are inherently machine-dependent and should be avoided.

Nonetheless, programs used multi-character constants.  One in
particular that I'm very familiar with was APL\11.  [The author of
APL\11 was Ken Thompson (a.k.a. "/usr/sys/ken"), who I think we
can agree is rather knowledgeable about C and UNIX.]  Unfortunately,
PCC generated two-character character constants in the opposite
order from Ritchie's CC.  The manual doesn't say that the results
are compiler-dependent, so one should expect them to be the same
for both compilers on the same machine.  Hence, PCC (and thus the VAX
"cc") is nonportable.  (The first time I tried to move APL from a
V6 PDP-11 to a 32/V VAX I had to find and fix 800 "new" errors.)
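
The ambiguity is easy to exhibit on a modern compiler. The value of a multi-character constant is implementation-defined to this day, so the sketch below only checks that it is one of the two byte orders the V6 compilers produced:

```c
#include <assert.h>

/* The value of a multi-character constant is implementation-defined.
 * Ritchie's V6 cc put the first character in the low-order byte;
 * PCC packed the characters the other way around.  Many compilers
 * warn about this (e.g. GCC's -Wmultichar) for exactly this reason. */
static int two_char_value(void)
{
    return 'ab';    /* machine- and compiler-dependent value */
}
```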


It is interesting that the assertion has been raised that -1 has
always been the standard error return.  Here's a simple program
for copying standard input to standard output, from "Programming
in C -- A Tutorial", section 7.
	
	main() {
		char c;
		while( (c = getchar()) != '\0' )
			putchar(c);
	}

This worked before the Standard I/O library "broke" it.  getchar()
used to return '\0' on EOF.  Also, programs which had previously used
the old I/O library (with "fin" and "fout" -- anyone remember what

	fout = dup(1);

did?) or the old Portable C library had to be changed to
accommodate STDIO.  I guess STDIO is nonportable too.
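
For reference, the post-STDIO repair of the tutorial loop looks like this; it's my own restatement over explicit streams, not code from any of the manuals cited. The crucial change is that c becomes an int, because getchar()/getc() now return EOF, a negative value distinct from every character, rather than '\0':

```c
#include <stdio.h>

/* STDIO-era version of the tutorial's copy loop, generalized to a
 * pair of streams.  c must be an int, not a char: it has to hold
 * every possible character value PLUS the out-of-band EOF marker. */
static void copy_stream(FILE *in, FILE *out)
{
    int c;                      /* int, not char: must hold EOF */
    while ((c = getc(in)) != EOF)
        putc(c, out);
}
```

A plain char cannot portably distinguish EOF from a legitimate byte (on signed-char machines, the byte 0xFF truncates to -1), which is the bug lurking in the tutorial program above.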


Several programs that I worked with assumed that integers were
two bytes long.  I guess the VAX is nonportable.


[Gee, this is fun!]  Back in V6 there was no "/usr/include" --
you had to code the definitions of the system structures directly
in your program or hunt down the kernel include files.  The
advent of "/usr/include" and the changes in the system calls
broke several programs that coded these things directly.  I
guess even V7 is nonportable.


Then, of course, there are the totally unnecessary additions to
C when it was hacked up for the phototypesetter 7 release.  To
take one example, consider "unsigned".  Who needs "unsigned"?
V6 was written without it -- if you needed an unsigned integer
you could always use a character pointer.  And I, for one, was
quite happy to put a "#" on the first line of my C program if
[if!] there were any #include or #define statements in my program.
[Actually, sometimes I still do this, just to be obstinate!]


[I think I'm getting carried away.  Time to come back to earth.]
As C has developed, it has provided more and more facilities for
approaching problems in an abstract, machine-independent way.
I for one applaud this growth.  I *want* to plan my programs
carefully, think about the issues involved, and have utilities
like "lint" tell me when I'm being careless.  I want to be able
to move my programs to new machines without having to rewrite
them.  As much as I like PDP-11's, I no longer use them (at
least, not with UNIX).  Eventually I'll log off of a VAX for the
last time.  Computer architectures are changing, and someday even
the assumption of a classical von Neumann architecture will be
invalid.  (This is already true for some machines.)  If C continues
to evolve, when that day comes C may still be around (in some form).
I am certain that if it sticks to a rigid PDP/VAX view of the
world it will be left behind in the dust.
-- 
  John Bruner (S-1 Project, Lawrence Livermore National Laboratory)
  MILNET: jdb at mordor.ARPA [jdb at s1-c]	(415) 422-0758
  UUCP: ...!ucbvax!dual!mordor!jdb 	...!decvax!decwrl!mordor!jdb
