Help me cast this!: Ultrix 2.x bug

Guy Harris guy at gorodish.Sun.COM
Wed May 11 04:26:06 AEST 1988


First: mail to "root" at "mfci" failed, so I'll post this; could the person
maintaining netnews at Multiflow please try to arrange that not all messages
from there have a "From:" line listing "root at mfci.UUCP" as the poster?  The
*real* poster's name appears in the "Reply-To:" line, so the information *is*
available.

Second:

> >-However, pcc compilers [without Guy Harris's fix, or equivalent]
> >-don't give a warning, and I was once told that Dennis Ritchie considers
> >-it to be perfectly legal C.

> >Told by whom?
> 
> By Bjarne Stroustrup, whom I assume simply asked him.  This was the
> result of a mail conversation I was having with him several years ago
> over what &a should mean when a is an array.  Of course, &a is not
> legal K&R C, but Bjarne thought it should be treated just like a,
> i.e., that &a should yield a pointer to the first element of a.

This is *not* the same as the "it" referred to above, which is an assignment of
a value of type "struct outfile **" to a variable of type
"struct outfile (*)[]".  The latter is not valid; "array of <type>" and
"pointer to <type>" are inequivalent types, and therefore "pointer to array
of <type>" and "pointer to pointer to <type>" are inequivalent types.  I would
be *EXTREMELY* surprised if Dennis Ritchie felt they were equivalent.

If it's not clear why they must be inequivalent, here's a specific example.
Consider that, if all pointers are represented in a particular implementation
as pointers to the first byte of an object, then

	p++

causes the address contained in "p" to be incremented by the size of the
object.  Now, the size of "struct outfile *" might, say, be 4 on a machine
with 8-bit bytes and 32-bit pointers.  However, the size of
"struct outfile [23]", for example, is 23*sizeof (struct outfile), and the
size of "struct outfile []" is unknown (in effect, zero).

As such, if you have:

	struct outfile **p;
	struct outfile (*output)[];

on an implementation of the sort listed above, with 8-bit bytes and 32-bit
pointers, the expression "p++" will increment the address contained in "p" by 4
bytes (as "sizeof (struct outfile *)" is 4) and the expression "output++" will
probably elicit a complaint from the compiler (as
"sizeof (struct outfile [])" is unknown).

In (old) K&R C (the new K&R presumably describes the ANSI rules), it is
considered incorrect to put "&" before an array or function.  In almost all
contexts, an expression of type "array of <type>" or "function returning
<type>" is converted to type "pointer to <type>" or "pointer to function
returning <type>".  The pointer-valued expressions in question are not lvalues,
and thus cannot be preceded with "&", just as you can't say "&3".  Some
*compilers* permit an "&" to be placed before expressions of this type, and
treat it as redundant.

Some compilers also appear to consider "pointer to <type>" and "array of
<type>" to be equivalent.  Unfortunately, this causes some invalid programs to
compile without complaint; those programs fail later.  In fact, one such
program *did* fail; somebody posted something to "comp.lang.c" about it
(actually to "net.lang.c", if I remember correctly, which indicates how long
ago this was!), which was what got me to look for and find the PCC bug in
question.

> PS Dennis claims that this is C:
> main()
> {
> 	int a[5][7] ;
> 	int (*p)[5][7];
>
> 	p = (int***) a; /* no & */
> 	printf("a %d p %d *p %d\n",a,p,*p); /* a == p == *p !!!  */
> 	(*p)[2][4] = 123 ;
> 	printf("%d\n",a[2][4]); /* 123 */
> }
> It works!  Amazing!

"a" is of type "array of 5 arrays of 7 'int's".  "p" is of type "pointer to
array of 5 arrays of 7 'int's."  There is no way in K&R C to type-correctly
assign a pointer to "a" to "p".  The "(int ***)" cast is incorrect; "p" does
*NOT* have type "pointer to pointer to pointer to 'int'."

The fact that the values of "a", "p", and "*p" are the same should not be
surprising.  In almost all contexts, an array-valued expression is converted to
a pointer to the first element of the array.  "*p" is an array-valued
expression and gets so converted; in effect, "*p" yields the same address as
"p" in almost all contexts, although with a different type.  The fact that the
expressions "a" and "p" have the same numeric value is a consequence of the
fact that *most* C implementations represent pointers by the address of the
first addressable unit of the object pointed to.  As such, the addresses
represented by "a" and "p" are the same.

If Dennis considers the above valid C, either by K&R rules or by ANSI C rules,
I would like to see his reasoning.  Everything *except for* the
"p = (int ***)a" is valid K&R C and valid ANSI C.  (Actually, if one wants to
be *extremely* fussy, one can complain about:

	the "printf" - there is no guarantee in K&R that *any* particular
	"printf" format specifier can be used to print any particular pointer,
	and ANSI C guarantees only that "%p" can be used to print "void *";

	the lack of certain #includes, such as "#include <stdio.h>";

	the lack of declaration of arguments for "main()";

but none of those are germane to this particular discussion.)

The following *would* be valid K&R C (modulo the other stuff):

main()
{
	int a[5][7];
	int (*p)[7];

	p = a; /* no &, no cast */
	printf("a %d p %d *p %d\n",a,p,*p); /* a == p == *p !!!  */
	p[2][4] = 123;
	printf("%d\n",a[2][4]); /* 123 */
}

Note that "p" is of type "pointer to array of 7 'int's."  "a" is of type "array
of 5 arrays of 7 'int's."  In most contexts, the expression "a" is converted to
a pointer to the first element of "a"; this first element is of type "array of
7 'int's," so a pointer to it is of type "pointer to array of 7 'int's," which
is the same type as "p".

The above is also valid ANSI C.  The following would be valid ANSI C (modulo
the other stuff), but not valid K&R C:

main()
{
	int a[5][7];
	int (*p)[5][7];

	p = &a; /* no cast */
	printf("a %d p %d *p %d\n",a,p,*p);
		/* "a", "p", and "*p" have the same numeric value */
		/* however, "p" and "*p" are *NOT* equivalent */
		/* "p" is a pointer to "a", "*p" is a pointer to "a[0]" */
	(*p)[2][4] = 123;
	printf("%d\n",a[2][4]); /* 123 */
}


