Novice C question

Wed Apr 17 06:48:47 AEST 1991

>In article <31969 at usc> ajayshah at almaak.usc.edu (Ajay Shah) writes:
[quoting from Numerical Recipes in C]
>>double *dvector(nl,nh)
>>int nl,nh;
>>{
>>        double *v;
>>
>>        v=(double *)malloc((unsigned) (nh-nl+1)*sizeof(double));
>>        if (!v) nrerror("allocation failure in dvector()");
>>        return v-nl;
>>}
>>
>>It's supposed to be a function which allocates a vector of doubles.

(It *is* such a function.)

>>My interpretation of nl and nh is: they're array indexes.

A better word here is `bounds':

>>If you want to allocate an array going from 5 to 10, you would say
>>p = dvector(5, 10).

The result of this call is not portable (dvector is portable only when
nl is nonpositive, nh is nonnegative, and nl is less than or equal to nh).

One way to think about this is to imagine that the malloc() call
allocates a `true' array (as opposed to reality, in which malloc
merely obtains a suitably malleable `blob of memory'):

	extern double array[nh - nl + 1];
	v = &array[0];

This partially-equivalent (but not legal C) assignment makes `v' point
to the first of nh-nl+1 doubles.  The C language then allows you to
point to and use any of those doubles; it also allows you to point to
the (single) nonexistent double at the end (&v[nh-nl+1]).  Thus,
v[0]..v[nh-nl] are all legal `double's, and v[nh-nl+1] is not a
double but is nonetheless addressable.  NO OTHER ELEMENTS ARE ADDRESSABLE.
You may not compute &v[-1] in portable code.

Thus, in the actual assignment, the same thing happens.  v points to
the first of nh-nl+1 `double's, v[0]..v[nh-nl] are all legal
`double's, and in addition you may compute &v[nh-nl+1].
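
To make the legal range concrete, here is a small sketch.  The names
`sum_range' and `n' are mine, not from Numerical Recipes; n stands for
the element count nh-nl+1.  The one-past-the-end address v + n is
computed and compared against, but never dereferenced:

```c
#include <assert.h>
#include <stddef.h>

/* Sum the n doubles v[0]..v[n-1].  The address v + n (one past the
 * last element) is legal to compute and compare, but not to read. */
double sum_range(const double *v, size_t n)
{
    const double *end;
    const double *p;
    double total = 0.0;

    assert(v != NULL);
    end = v + n;                  /* one past the end: addressable only */
    for (p = v; p != end; p++)
        total += *p;              /* p always lies in v[0]..v[n-1] here */
    return total;
}
```

A loop that walked downward and stopped at `p != v - 1' would, by
contrast, compute an address the language does not promise exists.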

So what does the `return' expression `v - nl' compute?

If nl is 0, this is just `v'.  If nl is negative, this adds -nl
`double's to v (skips forward -nl doubles---e.g., if nl is -3, this
computes &v[3]).  But if nl is positive, this tries to compute &v[<some
negative number>].  This is illegal, and the result is undefined
(perhaps the computer turns into a frog).  On many machines the result
is what you would expect (v simply points somewhere `below' the first
double, such that v[nl] is the first one), which is why Numerical
Recipes in C gets away with it.  Portable code should not count on
this.
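
One way to get the same convenience without the undefined pointer is
to keep the pointer malloc returned and subtract nl from the index
instead of from the pointer.  The following is only a sketch under
that assumption; `struct dvec', `dvec_alloc', `dvec_at', and
`dvec_free' are hypothetical names, not part of Numerical Recipes:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type: doubles indexed from nl through nh inclusive. */
struct dvec {
    double *base;   /* exactly the pointer malloc returned */
    int nl;
    int nh;
};

/* Returns 0 on success, -1 on bad bounds or allocation failure.
 * Assumes nh - nl does not overflow an int. */
int dvec_alloc(struct dvec *d, int nl, int nh)
{
    if (nh < nl)
        return -1;
    d->base = malloc(((size_t)(nh - nl) + 1) * sizeof *d->base);
    if (d->base == NULL)
        return -1;
    d->nl = nl;
    d->nh = nh;
    return 0;
}

/* Address of element i.  The offset i - nl is always in 0..nh-nl, so
 * no pointer outside the allocated block is ever formed. */
double *dvec_at(const struct dvec *d, int i)
{
    assert(i >= d->nl && i <= d->nh);
    return &d->base[i - d->nl];
}

void dvec_free(struct dvec *d)
{
    free(d->base);
    d->base = NULL;
}
```

With this, the 5-to-10 example becomes dvec_alloc(&d, 5, 10) followed
by *dvec_at(&d, 5) through *dvec_at(&d, 10), at the cost of a
subtraction on each access.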

In article <1991Apr16.033331.5408 at helios.physics.utoronto.ca>
neufeld at aurora.physics.utoronto.ca (Christopher Neufeld) writes:
>... When the compiler sees  p1[j] it evaluates it as *(p1+j).
>The addition of the integer 'j' means the same thing it meant
>above, it advances in memory by an amount   j * sizeof(*p1)

More simply, it advances by `j' objects, each of which is whatever *p1
is.  On a typical byte-addressed machine, `j objects' and `j * sizeof
*p1 bytes' give the same machine-internal-offset-value, but not all
machines are byte-addressed.
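
In C's own terms, where sizeof counts in units of char, the
relationship can be checked directly.  This is a sketch; the function
name `byte_distance' is made up, and the caller is assumed to keep
p + j inside the array (or one past its end):

```c
#include <stddef.h>

/* Distance in chars between p and p + j.  The addition is done in
 * units of double; only the subtraction is done in units of char. */
ptrdiff_t byte_distance(const double *p, ptrdiff_t j)
{
    const double *q = p + j;      /* advances by j objects */

    return (const char *)q - (const char *)p;
}
```

For an array `double a[8]', byte_distance(a, 3) is 3 * sizeof(double),
while (a + 3) - a is simply 3.  What the hardware does underneath to
represent either pointer is the machine's business.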

>   I should mention that there is one exception that I know of, and
>probably a few other people can provide. The rule is less simple for
>multi-dimensional arrays defined in the conventional manner:
>float arr[N1][N2][N3];
>Now, when the compiler sees  arr[i][j][k] it evaluates:
>*(arr + ((i * N2) + j) * N3 +k)
>However, if I have:
>float ***myarr;
>then when the compiler sees myarr[i][j][k] it evaluates:
>*(*(*(myarr+i)+j)+k)

Actually, in the `virtual machine', the operation of both is the same:
arr[i][j][k] means `add i to arr, indirect, add j to that, indirect,
add k to that, indirect'.  This works because `arr' is a collection%
of `array N2 of array N3 of float's; adding i moves forward by i such
objects.  If you have a byte-addressed machine---you probably do,
and you probably know about it if you do not---this is indeed the
same as moving forward i*N2*N3*sizeof(float) bytes.  If not, it is
something else.
-----
% In fact, `arr' is a collection of exactly N1 objects, and i must be
  in the half-open interval [0..N1).  If not, all bets are off (better
  find someone who knows how to turn frogs back into computers).
-----

This repeats for j and k, but the objects differ: the result of
`arr[i]' is a collection of `array N3 of float's, and the result
of `arr[i][j]' is a collection of `float's.  Indeed, arr[i] is
a collection of N2 such arrays, and arr[i][j] is a collection of
N3 floats.
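
The add-and-indirect steps can be written out one at a time.  In this
sketch the sizes N1, N2, N3 and the function names are chosen purely
for illustration:

```c
#include <stddef.h>

enum { N1 = 2, N2 = 3, N3 = 4 };   /* illustrative sizes only */

/* Fetch arr[i][j][k] by explicit `add, indirect' steps. */
float fetch(float (*arr)[N2][N3], int i, int j, int k)
{
    /* arr + i skips i objects of type `array N2 of array N3 of float';
     * indirecting gives that array, which decays to a pointer to its
     * first `array N3 of float'. */
    float (*plane)[N3] = *(arr + i);
    /* plane + j skips j arrays of N3 floats; indirecting gives an
     * array of N3 floats, which decays to a pointer to its first. */
    float *row = *(plane + j);

    return *(row + k);
}

/* Distance in chars covered by adding 1 to arr. */
size_t plane_stride(float (*arr)[N2][N3])
{
    return (size_t)((char *)(arr + 1) - (char *)arr);
}
```

Given `float a[N1][N2][N3]', fetch(a, i, j, k) yields the same value
as a[i][j][k], and plane_stride(a) is N2 * N3 * sizeof(float).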

myarr[i][j][k] means the same thing: `add i to myarr, indirect, add j
to that, indirect, add k to that, indirect'.  This works because
`myarr' points to the first of a collection of `pointer to pointer to
float's; adding i moves forward by i such objects.  If you have a
byte-addressed machine, this is the same as moving forward
i*sizeof(float **) bytes.  If not, it is something else.

This repeats for j and k.  In each case the number of objects is not
specified (myarr may point to the first of zero or more `float **'s;
myarr[i] may point to the first of zero or more `float *'s; and
myarr[i][j] may point to the first of zero or more `float's), but
whatever each actual number is, i, j, and k had better be nonnegative
and less than the corresponding count.
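
Someone has to set up all those pointers before myarr[i][j][k] means
anything.  One possible arrangement, sketched here with the made-up
names `alloc3' and `free3', uses three blocks: one of `float **'s, one
of `float *'s, and one of `float's.  Overflow in the size
multiplications is not checked:

```c
#include <stdlib.h>

/* Build an object usable as m[i][j][k] with 0 <= i < n1, 0 <= j < n2,
 * 0 <= k < n3.  Returns NULL on failure or if any count is zero. */
float ***alloc3(size_t n1, size_t n2, size_t n3)
{
    float ***m;
    float **rows;
    float *cells;
    size_t i, j;

    if (n1 == 0 || n2 == 0 || n3 == 0)
        return NULL;
    m = malloc(n1 * sizeof *m);
    rows = malloc(n1 * n2 * sizeof *rows);
    cells = malloc(n1 * n2 * n3 * sizeof *cells);
    if (m == NULL || rows == NULL || cells == NULL) {
        free(m);
        free(rows);
        free(cells);
        return NULL;
    }
    for (i = 0; i < n1; i++) {
        m[i] = rows + i * n2;                     /* first of n2 `float *'s */
        for (j = 0; j < n2; j++)
            m[i][j] = cells + (i * n2 + j) * n3;  /* first of n3 floats */
    }
    return m;
}

/* Release what alloc3 built; m[0][0] and m[0] are the block starts. */
void free3(float ***m)
{
    if (m != NULL) {
        free(m[0][0]);
        free(m[0]);
        free(m);
    }
}
```

Here each count is known to whoever called alloc3, but nothing in the
resulting `float ***' records it.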

>Note that this definition takes less CPU usually, and eats a bit more
>memory. It's also a bit easier to use in passing to functions because
>the function doesn't have to be told the dimensions 'N2' and 'N3', which
>it obviously needs for the first definition.

Right, although the `less CPU' is hard to define; `usually' may be
overstating the case.  A great deal depends on both the compiler and
the machine.
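
The difference in what a function must be told shows up in the
parameter types.  A sketch, with illustrative sizes and made-up
function names:

```c
enum { N2 = 2, N3 = 2 };   /* illustrative sizes only */

/* Array form: N2 and N3 are part of the parameter's type, so the
 * function is tied to them at compile time; only n1 is passed. */
float sum_array(float (*arr)[N2][N3], int n1)
{
    float total = 0.0f;
    int i, j, k;

    for (i = 0; i < n1; i++)
        for (j = 0; j < N2; j++)
            for (k = 0; k < N3; k++)
                total += arr[i][j][k];
    return total;
}

/* Pointer form: the type carries no sizes.  The counts are passed
 * here only so that the loops know where to stop. */
float sum_pointers(float ***m, int n1, int n2, int n3)
{
    float total = 0.0f;
    int i, j, k;

    for (i = 0; i < n1; i++)
        for (j = 0; j < n2; j++)
            for (k = 0; k < n3; k++)
                total += m[i][j][k];
    return total;
}
```

Note that sum_pointers needs the counts to stay in bounds, not to find
m[i][j][k]; a function that touched only one known element would need
none of them.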
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek at ee.lbl.gov