Pointers and Arrays

Mon Jun 30 02:10:14 AEST 1986

In article <2201 at umcp-cs.UUCP> chris at maryland.UUCP (Chris Torek) writes:
>Perhaps I just have an odd mind, but all this pointer/array stuff
>never really bothered me.

Or perhaps I simply read K&R, chapter 5, Pointers and Arrays.  I
needed to refer to K&R recently (see article <2204 at umcp-cs.UUCP>),
and while I was looking at it, I just happened to stumble across
some text in this chapter that seems to me quite clear.  Let me
give some excerpts, with commentary.  (Suggestion: while reading
this, imagine me grinning teasingly at points.  I hope the tone
comes across properly, but I have spent enough time revising this
now---great grief, an hour and a half now!)

  It is also necessary to declare the variables that participate
  in all of this:

	int x, y;
	int *px;

  The declaration of x and y is what we've seen all along.  The
  declaration of the pointer px is new.

	int *px;

  is intended as a mnemonic; it says that the combination *px is
  an int, that is, if px occurs in the context *px, it is equivalent
  to a variable of type int.  In effect, the syntax of the declaration
  for a variable mimics the syntax of expressions in which the
  variable might appear.  This reasoning is useful in all cases
  involving complicated declarations.  For example

	double atof(), *dp;

  says that in an expression atof() and *dp have values of type
  double.

So much for understanding declarations.  K&R said it all, eight
years ago.
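
The "declaration mimics use" rule can be tried directly.  What
follows is only a minimal sketch, assuming an ANSI-or-later compiler
(prototypes, <assert.h>); `twice' is a hypothetical stand-in for
atof(), and `declaration_mirrors_use' is likewise a name invented
for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for atof(): any function returning double will do. */
double twice(double v)
{
    return 2.0 * v;
}

int declaration_mirrors_use(void)
{
    int x = 1, y = 0;
    int *px;                    /* "*px is an int" */
    double twice(double), *dp;  /* "twice(...) and *dp are doubles" */
    double d = 1.5;

    px = &x;
    y = *px;                    /* *px used exactly as it was declared */
    dp = &d;
    assert(*dp == 1.5);         /* *dp is a double expression */
    assert(twice(*dp) == 3.0);  /* so is a call to twice() */
    return y;                   /* the int copied out of x through px */
}
```

Each declarator is spelled the way the name later appears in an
expression, which is all the mnemonic claims.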

  ... Any operation which can be achieved by array subscripting
  can also be done with pointers.  The pointer version will in
  general be faster but, at least to the uninitiated, somewhat
  harder to grasp immediately.

K&R seem to have a gift for understatement.

  The correspondence between indexing and pointer arithmetic is
  evidently very close.  In fact, a reference to an array is
  converted by the compiler to a pointer to the beginning of the
  array.  The effect is that an array name *is* a pointer expression.
  ...

(Note `expression', not `variable'.  The above does not apply to
sizeof.)
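
A small sketch of the conversion, and of the sizeof exception, may
help here.  It assumes an ANSI-or-later compiler; the names `nth'
and `array_vs_pointer' are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Fetch element i by pointer arithmetic rather than by subscripting. */
int nth(int *p, int i)
{
    return *(p + i);
}

size_t array_vs_pointer(void)
{
    int a[10] = { 0 };
    int *pa = a;                /* the name a is converted to &a[0] */

    a[3] = 42;
    assert(a[3] == *(a + 3));   /* subscripting is pointer arithmetic */
    assert(pa[3] == 42);        /* and a pointer may be subscripted */
    assert(nth(a, 3) == 42);    /* the callee receives only a pointer */

    /* sizeof is the exception: here a is still the whole array. */
    assert(sizeof a == 10 * sizeof(int));
    return sizeof a / sizeof a[0];      /* number of elements in a */
}
```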

  There is one difference between an array name and a pointer that
  must be kept in mind.  A pointer is a variable, so pa=a and pa++
  are sensible operations.  But an array name is a *constant*, not
  a variable: constructions like a=pa or a++ or p=&a are illegal.

`p = &a' is much like `p = &3': illegal by fiat, not because it
cannot be done.  If it were legal, `&a' would have type `pointer to
<type of a>' (compare with `a', which has type `pointer to <type of
a[0]>').

  When an array name is passed to a function, what is passed is the
  location of the beginning of the array.  Within the called function,
  this argument is a variable, just like any other variable, and so
  an array name argument is truly a pointer, that is, a variable
  containing an address.  ...

  As formal parameters in a function definition,

	char s[];

  and

	char *s;

  are exactly equivalent; ...

This is all in the context of singly-dimensioned arrays, but with
the proper mindset it applies to multi-dimensional arrays without
trouble.  (With the wrong mindset it leads to much confusion.)
K&R will have more to say about this later.

Note that this is where sizeof starts acting odd:  A compiler
treats the following as equivalent:

	   array		  pointer
	   -----		  -------
	f(arr)			f(ap)
	int arr[];		int *ap;
	{			{
		...			...

	f(a2)			f(a2p)
	int a2[][5];		int (*a2p)[5];
	{			{
		...			...

The second equivalent pointer version is neither `int **a2p' nor
`int *a2p'; nor for that matter is it `int *a2p[5]'.  This is
consistent, if (painfully apparently, given recent net.lang.c
articles) confusing.
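
One way to check the equivalence with a present-day compiler is to
declare each function once in the array spelling and then define it
in the pointer spelling; if the two parameter types differed, the
definition would be rejected as conflicting.  This is a sketch
only, assuming ANSI prototypes, with `first' and `row_total' as
invented names.

```c
#include <assert.h>
#include <stddef.h>

int first(int arr[]);                   /* array spelling ... */
int first(int *ap)                      /* ... pointer spelling, same type */
{
    return ap[0];
}

int row_total(int a2[][5], int row);    /* array spelling ... */
int row_total(int (*a2p)[5], int row)   /* ... pointer to array 5 of int */
{
    int j, total = 0;

    /* a2p[0] is one whole row of 5 ints, not a single int. */
    assert(sizeof a2p[0] == 5 * sizeof(int));
    for (j = 0; j < 5; j++)
        total += a2p[row][j];
    return total;
}
```

Replacing the second definition's parameter with `int **a2p' or
`int *a2p[5]' should produce a conflicting-types diagnostic, which
is the point made above.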

  5.7  Multi-Dimensional Arrays

  C provides for rectangular multi-dimensional arrays, though in
  practice they tend to be much less used than arrays of pointers. ...

  ... In C, by definition a two-dimensional array is really a one-
  dimensional array, each of whose elements is an array.  Hence
  subscripts are written as

	day_tab[i][j]

  rather than

	day_tab[i, j]

  as in most languages. ...

What they do *not* mention is that day_tab[i,j] is also a valid
expression: the comma is the comma operator, so it means day_tab[j],
a whole row.  This tends to surprise people.  Lint does not,
unfortunately, warn about it.
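
To see what day_tab[i,j] actually denotes, here is a minimal sketch.
The initialiser values (days per month, ordinary and leap year) are
an assumption, since the declaration quoted later elides them, and
`comma_subscript' is a hypothetical name.  Some current compilers
may warn that the left operand of the comma has no effect.

```c
#include <assert.h>

/* Assumed contents: days per month for ordinary and leap years. */
static int day_tab[2][13] = {
    { 0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 },
    { 0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 }
};

int comma_subscript(int i, int j, int k)
{
    /* The comma operator evaluates i, discards it, and yields j,
       so the subscript selects row j; the row then becomes int *. */
    int *row = day_tab[i, j];

    assert(row == day_tab[j]);
    return row[k];              /* element k of row j; i played no part */
}
```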

  If a two-dimensional array is to be passed to a function, the
  argument declaration in the function *must* include the column
  dimension; the row dimension is irrelevant, since what is passed
  is, as before, a pointer.

What did I tell you?

Note that this *is* consistent.  One cannot pass an array as an
argument to a function.  Pointers, however, are fine, *including
pointers to arrays*.  Given a two or more dimensional array,
the array `constant' is converted to a pointer to an array of
one fewer dimensions.  This is now a *pointer*, and remains a
pointer until dereferenced.  For example, in

	int day_tab[2][13] = { ... };

the following are type-correct calls:

	f2d(p) int (*p)[13]; { ...  }

	f1d(p) int *p; { ...  }

	proc()
	{
					/* argument types: */
		f2d(day_tab);		/* pointer to array 13 of int */
		f2d(&day_tab[0]);	/* pointer to array 13 of int */

		f1d(day_tab[0]);	/* pointer to int */
		f1d(&day_tab[0][0]);	/* pointer to int */
	}

Calling f2d(&day_tab[0][0]) passes the right *value* but the wrong
*type*.  That it happens to work is not an excuse to do it.  If C 
were different, it would be different, but it is not, so it is not.
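
With ANSI prototypes the compiler can check these argument types
itself.  The following is a hedged restatement of the calls above,
not the original code: the extra index parameters and the name
`check_calls' are additions for illustration, and the table
contents are assumed (days per month).

```c
#include <assert.h>

/* Assumed contents: days per month for ordinary and leap years. */
static int day_tab[2][13] = {
    { 0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 },
    { 0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 }
};

/* p is a pointer to array 13 of int: p[row] is a whole row. */
int f2d(int (*p)[13], int row, int col)
{
    return p[row][col];
}

/* p is a pointer to int: p[idx] is a single int. */
int f1d(int *p, int idx)
{
    return p[idx];
}

int check_calls(void)
{
    assert(f2d(day_tab, 1, 2) == 29);        /* pointer to array 13 of int */
    assert(f2d(&day_tab[0], 1, 2) == 29);    /* pointer to array 13 of int */
    assert(f1d(day_tab[0], 2) == 28);        /* pointer to int */
    assert(f1d(&day_tab[0][0], 2) == 28);    /* pointer to int */
    return 1;
}
```

With the prototype in scope, f2d(&day_tab[0][0], 1, 2) should draw
an incompatible-pointer-type diagnostic rather than quietly working.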

To return to K&R:

  5.10 Pointers vs. Multi-dimensional [sic] Arrays

(So they are not consistent with capitalisation in section names.)

  Newcomers to C are sometimes confused about the difference between
  a two-dimensional array and an array of pointers, ...

Ah, a gift indeed.

  Given the declarations

	int a[10][10];
	int *b[10];

  the usage of a and b may be similar, in that a[5][5] and b[5][5]
  are both legal references to a single int.  But a is a true array:
  all 100 storage cells have been allocated, and the conventional
  rectangular subscript calculation is done to find any given
  element.  For b, however, the declaration only allocates 10
  pointers; each must be set to point to an array of integers.
  Assuming that each does point to a ten-element array, then there
  will be 100 storage cells set aside, plus the ten cells for the
  pointers.  Thus the array of pointers uses slightly more space,
  and may require an explicit initialization step.  But it has two
  advantages:  accessing an element is done by indirection through
  a pointer rather than by a multiplication and addition, and the
  rows of the array may be of different lengths.  That is, each
  element of b need not point to a ten-element vector; some may
  point to two elements, some to twenty, and some to none at all.
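
The contrast between a and b can be sketched as follows.  The row
arrays `two' and `six' and the function `ragged_demo' are invented
for the illustration; only the declarations of a and b come from
the quoted text.

```c
#include <assert.h>
#include <stddef.h>

static int a[10][10];       /* a true array: 100 ints, nothing else */
static int *b[10];          /* only 10 pointers; the rows live elsewhere */

static int two[2] = { 1, 2 };
static int six[6] = { 0, 1, 2, 3, 4, 5 };

int ragged_demo(void)
{
    assert(sizeof a == 100 * sizeof(int));
    assert(sizeof b == 10 * sizeof(int *));

    /* The explicit initialization step: rows of different lengths. */
    b[0] = two;
    b[5] = six;
    b[9] = NULL;            /* a row that points to nothing at all */

    a[5][5] = 7;            /* found by multiplication and addition */
    assert(b[5][5] == 5);   /* found by indirection through b[5] */
    assert(b[0][1] == 2);
    return a[5][5] + b[5][5];
}
```

Note that b[0][5] would be out of bounds here even though b[5][5]
is fine, which is the price of rows with different lengths.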

Now for some even more horrid examples of my own, all type-correct:

	/* declare st as array 1 of array 5 of pointer to char */
	char *st[1][5] = { { "fee", "fie", "foo", "fum", "foobar" } };

	/* declare x as pointer to array 5 of pointer to char */
	char *(*x)[5] = st;

	/* declare y as array 1 of array 3 of array 4 of pointer to
	   array 5 of pointer to char */
	char *(*y[1][3][4])[5] = { {
		{ st, 0, 0, st },
		{ 0, st, st, 0 },
		{ 0, 0, st, st }
	} } ;

	/* declare p as array 2 of pointer to array 3 of array 4
	   of pointer to array 5 of pointer to char */
	char *(*(*p[2])[3][4])[5] = { y, 0 };

It does take some trickery to do this.  Given the declaration

	char *strings[5] = { ... };

the type of `strings' is `array 5 of pointer to char', which, when
used in an expression, becomes `pointer to pointer to char' (by
changing the first `array of' to `pointer to'), but for `x' and
`y' I wanted a type of `pointer to array 5 of pointer to char'.
It might be nice if I could write `&strings' to get this, but I
cannot; however, I can use the declaration above for `st' to get
`array 1 of array 5 of pointer to char'.  Changing the first `array
of' yields `pointer to array 5 of pointer to char', which was what
I wanted.

Likewise, for `p' I wanted `y' to evaluate to `pointer to array
3 of array 4 of pointer to array 5 of pointer to char'; in order
to get that, I again used a `fake' leading dimension of [1] in the
declaration of `y'.
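
For anyone who wants to follow one of these declarations all the
way down to a string, here is a sketch that reuses the declarations
of st, x, y and p from above unchanged.  The accessor functions
`through_x' and `through_p' are hypothetical additions.  The last
two declarations assume ANSI C or later, where `&strings' is
accepted; they do not apply to the compilers this article describes.

```c
#include <assert.h>
#include <string.h>

char *st[1][5] = { { "fee", "fie", "foo", "fum", "foobar" } };
char *(*x)[5] = st;
char *(*y[1][3][4])[5] = { {
    { st, 0, 0, st },
    { 0, st, st, 0 },
    { 0, 0, st, st }
} };
char *(*(*p[2])[3][4])[5] = { y, 0 };

/* *x is array 5 of pointer to char, so (*x)[i] is a char *. */
const char *through_x(int i)
{
    assert(i >= 0 && i < 5);
    return (*x)[i];
}

/* *p[0] is the [3][4] table; its [0][0] entry points at st[0]. */
const char *through_p(int i)
{
    assert(i >= 0 && i < 5);
    return (*(*p[0])[0][0])[i];
}

/* ANSI C and later only: &strings has the wanted type directly. */
char *strings[5] = { "a", "b", "c", "d", "e" };
char *(*q)[5] = &strings;
```

Reading each accessor from the inside out, one operator at a time,
mirrors the way the declarations themselves are read.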

	`You can hack anything you want,
	 with pointers and funny C . . .'
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris at umcp-cs		ARPA:	chris at mimsy.umd.edu