Conformant Arrays in C (a better solution?)

Fri Mar 4 06:01:16 AEST 1988

> karl at haddock.ISC.COM (Karl Heuer)
>> pardo at uw-june.UUCP (David Keppel)
>>Or possibly: [pass an array of { sizeof(float), sizeof(float *) }]
> [...this is not sufficient because...]
> There are systems where (char *) and (int *) have the same size but different
> formats.  If the type in question is (char[4]), I think the distinction would
> be lost completely in your model.

Right.  Karl's proposed solution:

>   void *_d2malloc(int typecode, size_t nr, size_t nc) {
>     switch (typecode) {
>     case __typeof(char *): ...;
>     case __typeof(int *): ...;
>     }
>   }

... would work, and allow one to code functions to create such array
structures for arbitrary levels of dimensions (albeit nonportably), but
it is an interesting problem to come up with a compact way of defining
these things to any depth of dimension with no extensions to current C.
And try to get it coded portably.  Hmmmm...  (imagine Lurch-like
sandpapery sound of fingertips above keyboard...)

Howzabout this:  A macro alloc_dimension( type, n, initial_value ),
which returns a (type *) pointer to the initial element of an array of n
elements of (type), each initialized to the given initial_value (the
initial value being evaluated "by-name" for each element, of course).
Thus, to get the two-D, N-by-M float array, one would say

        { float ** a = alloc_dimension( float *, N,
                       alloc_dimension( float, M, 0.0 ));
            ...
        }

and references to elements of the array are just a[i][j], and the whole
shebang can be passed to subroutines, wherein references to a[i][j]
would work just as well.  To get a three-dimensional structure declared,
just do the obvious:

        { float *** a = alloc_dimension( float **, I,
                        alloc_dimension( float *, J,
                        alloc_dimension( float, K, 0.0 )));
            ...
        }
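
Written out by hand for the two-D case, the structure the macro is meant
to build looks roughly like the following.  This is only a sketch:
make_rows, sum_rows, and free_rows are names made up for illustration,
the initial value is passed by value rather than by name, and error
handling is minimal.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: build an n-by-m array of float as a vector of
   row pointers, every element set to init.  Returns NULL on failure. */
float **make_rows(size_t n, size_t m, float init)
{
    float **a = malloc(n * sizeof *a);
    size_t i, j;

    if (a == NULL)
        return NULL;
    for (i = 0; i < n; ++i) {
        a[i] = malloc(m * sizeof *a[i]);
        if (a[i] == NULL) {
            /* Undo the rows already allocated before giving up. */
            while (i > 0)
                free(a[--i]);
            free(a);
            return NULL;
        }
        for (j = 0; j < m; ++j)
            a[i][j] = init;
    }
    return a;
}

/* A subroutine that receives the structure indexes it as a[i][j]. */
float sum_rows(float **a, size_t n, size_t m)
{
    float total = 0.0f;
    size_t i, j;

    assert(a != NULL);
    for (i = 0; i < n; ++i)
        for (j = 0; j < m; ++j)
            total += a[i][j];
    return total;
}

/* Release every row, then the vector of row pointers. */
void free_rows(float **a, size_t n)
{
    size_t i;

    if (a == NULL)
        return;
    for (i = 0; i < n; ++i)
        free(a[i]);
    free(a);
}
```

Note that the callee needs only the pointer and the two limits; no
array bound appears in its parameter types.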

Now, can this macro be implemented?  Well... sort of.  Consider:

    /* This include file defines the "alloc_dimension" macro, and
       associated support cruft.  The macro takes a type, a dimension limit,
       and an initial value.  An array of n elements of that type is
       allocated, each element is assigned the by-name value of the
       initial value supplied, and a pointer to the first element of the
       array is returned.

       This version assumes that setjmp.h is already included, and malloc
       already defined.

    */

    typedef struct _lab_node_s { struct _lab_node_s * next;
                                 jmp_buf jbuf;
                                 void *p;
                                 int i;
                               } _lab_node_t;
    static _lab_node_t *_lab_list = 0, *_cur_lab;
    static void *_cur_p;

    #define alloc_dimension( t, n, v ) \
        (_cur_lab = (_lab_node_t *)malloc(sizeof(_lab_node_t)),\
         _cur_lab->next = _lab_list, _lab_list = _cur_lab,\
         _lab_list->p = malloc(sizeof(t)*(n)),\
         _lab_list->i = 0,\
         setjmp( _lab_list->jbuf ) < 2 ?(\
             ((t*)_lab_list->p)[_lab_list->i] = (v),\
             ++(_lab_list->i) < (n) ? longjmp(_lab_list->jbuf,1)\
                                    : longjmp(_lab_list->jbuf,2),\
         0):0,\
         _cur_lab = _lab_list,_lab_list = _lab_list->next,\
         _cur_p = _cur_lab->p,free(_cur_lab),(t*)_cur_p)

Of course, this has some unpleasant limitations:

- it isn't clear to me that setjmp and longjmp are guaranteed to work
  inside comma expressions like this (though I *think* they are), so
  this may not really be portable code
- freeing the thing becomes a hassle, requires another macro
- some error checking was omitted for brevity (whew)
- insufficient error checking on the values of the dimension limits
- the overkill of using setjmp
- the tedium of using a dynamic frame mechanism when a simple
  static mechanism would do if only we had compile-time execution
  of code a-la lisp macros
- the general cruftiness, slowness, and bulkiness of the code

Now it seems to me that allowing non-constant expressions in the bounds
of formal array arguments is the minimal, conservative, non-dope-vector,
covers-most-bases solution.  That is:

        g(){
            double a[I][J][K];
            f( a, I, J, K );
        }
        f(a,ilim,jlim,klim)
            double a[ilim][jlim][klim]; /*currently illegal*/
            int ilim,jlim,klim;
        {
            int i,j,k;
            /* do something to each array element */
            for( i=0; i<ilim; ++i ){
                for( j=0; j<jlim; ++j ){
                    for( k=0; k<klim; ++k ){
                        ... a[i][j][k] ...;
                    }
                }
            }
        }

... would be made legal.  The problematic point, that the type of the
formal a isn't precisely known until run time, can be made a special
case for lint, and the behavior can be declared undefined when the
bounds of the actual argument don't match those of the formal at run
time.  Probably the harshest problem is that this requires pointer
formats for arrays of arrays of... to any number of "arrays of" levels
to all be the same, which isn't true now.
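
For reference, the later C99 standard added something very close to
this: a parameter's array bounds may be earlier parameters (C11 lets
compilers treat the feature as optional).  A sketch in that dialect,
with the limits moved ahead of the array so they are in scope when it
is declared; the function name fill and the values stored are made up
for illustration.

```c
#include <assert.h>

/* The limits come first so they are in scope when a is declared.
   Inside the function, a[i][j][k] indexes with the run-time bounds. */
void fill(int ilim, int jlim, int klim, double a[ilim][jlim][klim])
{
    int i, j, k;

    assert(ilim > 0 && jlim > 0 && klim > 0);
    for (i = 0; i < ilim; ++i)
        for (j = 0; j < jlim; ++j)
            for (k = 0; k < klim; ++k)
                a[i][j][k] = 100.0 * i + 10.0 * j + k;
}
```

As with the proposal above, nothing checks that the limits passed match
the bounds of the actual array; a mismatch is the caller's problem.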

I'd say either leave it as it is and do such things with pointers, or
take the above rather conservative step.  But then, who listens to me?

--
A LISP programmer knows the value of everything, but the cost of nothing.
                                        --- Alan J. Perlis
-- 
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw