Passing Multidimensional Arrays

Mon Dec 23 09:39:09 AEST 1985

> (First off, the original code fragment, while syntactically legal C,
> was most definitely semantically incorrect.)

(Small nit: the "original" fragment was both syntactically and
semantically legal (mainly because it didn't include an invocation of
"subr", and thus didn't have the type mismatch that was implied).  It
was, however, misleading because of this incompleteness, and
semantically illegal when completed.)

> In article <397 at codas.UUCP> mikel at codas.UUCP (Mikel Manitius) writes:
> >                             ... Then explain, if you don't mind, how
> >the compiler knows the dimensions of argv in the following construct:

> Easy.  Argv is  n o t  a 2-D array.  It is an array of pointers.

Exactly.

> [...] If anyone kept a copy of my 100+ line diatribe on arrays and
> pointers, and why people confuse them so, from a year ago, wouldst
> please mail a copy to me?  I'm afraid I didn't keep a copy, and
> if this line of discussion keeps up, I'm going to need it.	;-}
>       Joe Yao hadron!jsdy at seismo.{CSS.GOV,ARPA,UUCP}

Sadly, I didn't keep it.  But I'll take a stab at clarifying things in
your stead.  Let me preface this by saying that all of what I say here
and more is in "The C Programming Language" (K&R) and "C: A Reference
Manual" (H&S).  *Please* read them.

In C, addresses can be subscripted and arrays can be indirected.  This
is often called "pointer-array equivalence", and is the root of this
confusion.  Nevertheless, it is not a complex concept.  It is a
*unification* of subscripting and indirection (along with pointer
arithmetic).

It is based on the following equivalence, which defines subscripting in
terms of address calculation and indirection:

                e1[e2]   is-equivalent-to    *((e1)+(e2))

I'll take it for granted that you already know how address calculations
happen (the scaling of integers when added to addresses).  Because
addition is commutative, e1[e2] and e2[e1] mean the same thing, which
leads to the surprising result that integers can be subscripted (by
addresses).
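
For anyone who would like the scaling spelled out, here is a minimal
sketch.  The names "byte_step" and "check_scaling" are mine, invented
for illustration; the point is only that adding n to an "int *" moves
the address by n * sizeof(int) bytes, not by n bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: distance in bytes between p and p + n. */
size_t byte_step(int *p, size_t n)
{
    return (size_t)((char *)(p + n) - (char *)p);
}

/* Self-check: the integer is scaled by the size of the pointed-to type. */
void check_scaling(void)
{
    int a[8];

    assert(byte_step(a, 1) == sizeof(int));
    assert(byte_step(a, 3) == 3 * sizeof(int));
}
```

Calling check_scaling() from any main() should pass silently on a
conforming compiler.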

Note that exactly one of e1 and e2 in the above must be an address
expression; the other must be an integer.  This leads to the C-ism that
array names are addresses: in particular, they evaluate to the address
of the first element in the array (you know, the one with subscript "0").

Thus, when you declare          int a[N];
you can use "a" like so:        a[M]
or like so                      *(a+M)
or even like so                 M[a]

and all of these are equivalent.
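
If you want to convince yourself, something like the following sketch
should do.  The function names are hypothetical (mine, not from any
manual); the code just checks that the three spellings name the same
element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: 1 if all three spellings name the same element. */
int same_element(int *a, size_t m)
{
    return &a[m] == a + m       /* a[m] is *((a)+(m)) */
        && &m[a] == a + m;      /* m[a] is *((m)+(a)) */
}

/* Reads one element three ways and adds the results together. */
int demo_subscripts(void)
{
    int a[4] = { 10, 20, 30, 40 };

    assert(same_element(a, 2));
    return a[2] + *(a + 2) + 2[a];      /* 30 + 30 + 30 */
}
```

Nobody should write 2[a] in real code, of course; it is here only
because the equivalence says it must work.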

Now then, if you declare        int *p;
again, you can use "p" like     p[M]
or                              *(p+M)
or even                         M[p]

because "p" in this case evaluates to an address expression also.  The
crucial difference is that "a" behaves as an "rvalue" (it cannot be
assigned to), while "p" is an "lvalue".  That is, "p" is a *variable*,
and *contains* an address, while "a" is an *expression* and *is* an
address.
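
A rough sketch of that difference, for the skeptical.  The function
names here are hypothetical, and the array size of ten is an arbitrary
choice of mine.

```c
#include <assert.h>
#include <stddef.h>

int a[10];      /* storage for ten ints; "a" cannot be assigned to */
int *p = a;     /* storage for one address, initially that of a[0] */

size_t array_bytes(void)   { return sizeof a; }    /* the ten ints */
size_t pointer_bytes(void) { return sizeof p; }    /* one address  */

/* "p" is a variable, so it can be changed; "a = a + 3;" is rejected
   by the compiler. */
int *advance(void)
{
    p = a;
    p = p + 3;
    assert(p == &a[3]);
    return p;
}
```

Uncommenting an assignment to "a" is the quickest way to see the
compiler complain.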

Thus when you declare (as is often done)

        int main(argc, argv) int argc; char *argv[]; { ... }

"argv" is *not* a two dimensional array, despite the fact that the
character pointed to by the first element of argv can be accessed by

                argv[0][0]

This expression is *exactly* equivalent to

                **argv
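
As a small sketch (the function names and the sample argument vector
are hypothetical, made up for the check), both spellings reach the same
character through the same two indirections:

```c
#include <assert.h>

/* Hypothetical helper: 1 if both spellings reach the same character. */
int first_char_matches(char *argv[])
{
    return &argv[0][0] == &**argv;
}

/* Builds a stand-in argument vector and checks it. */
char check_argv(void)
{
    char prog[] = "prog";
    char *av[2];

    av[0] = prog;
    av[1] = 0;
    assert(first_char_matches(av));
    return **av;        /* same as av[0][0], here 'p' */
}
```

Note that "av" holds only addresses; the characters themselves live in
"prog".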

How can you tell what the declaration of argv above means?  Decompose
it.  First, fully parenthesize according to the rules in the manual:

                char *(argv[]);

Now, this means that argv is an array of pointers (first you subscript,
then you indirect).  Now subscripting and indirection are equivalent,
*but* *only* in *reference*, *not* in *declaration*.  Declaration of an
array allocates space for some elements.  Declaration of a pointer
allocates space for an *address* *only*.
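
To make the allocation point concrete, here is a hedged sketch that
sets an array of pointers beside a true two-dimensional array.  The
names "names" and "grid" and the dimensions 3 and 8 are mine, chosen
only for illustration.

```c
#include <assert.h>
#include <stddef.h>

char *names[3];     /* array of 3 pointers: space for 3 addresses only */
char grid[3][8];    /* array of 3 arrays: space for 3 * 8 characters  */

size_t names_bytes(void) { return sizeof names; }
size_t grid_bytes(void)  { return sizeof grid; }

/* Both take two subscripts in *reference*, but names[i] must first be
   pointed at storage that exists somewhere else. */
char second_subscript_demo(void)
{
    grid[1][0] = 'x';
    names[1] = grid[1];     /* grid[1] evaluates to &grid[1][0] */
    assert(&names[1][0] == &grid[1][0]);
    return names[1][0];     /* 'x' */
}
```

The references look identical; only the declarations tell you what was
actually allocated.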

Does all this make it clear?  If not, *read* *the* *manuals*.

[ Purists will note that I skipped over formal-actual differences,
  declaration of "two-dimensional" arrays, and other esoteric but
  often-encountered cases.  Ah well.... only so many hours in a day.]