Efficient coding considered harmful?

Thu Nov 3 15:02:06 AEST 1988

In article <1718 at garth.UUCP> smryan at garth.UUCP (Steven Ryan) writes:
>>[the context is realloc()]
>>
>>Not so.  The standard technique is to increase the size by a MULTIPLICATIVE
>>factor, not an additive increment.  The best factor is data-dependent; I
>>tend to use 1.5 because that's what InterLisp used, 2 is pretty popular.
>>The total cost in this case remains O(N), N being the final size.
>
>However, this makes the space cost O(1.5**n), exponential.

No such thing.  The space cost is O(N).

What we're talking about here is a technique for maintaining dynamically
sized data structures.  Just to keep it simple:

typedef <whatever> element;

struct dynarray
    {
	unsigned limit;		/* how many cells available */
	unsigned count;		/* how many in use */
	element *table;		/* the memory block */
    };

int init_dyn_array(struct dynarray *p; unsigned initalloc)
    {
	element *table =
	    (struct element *)malloc(initalloc * sizeof *table);

	if (table == NULL) return -1;
	p->limit = initalloc, p->count = 0, p->table = table;
	return 0;
    }

element *dyn_array_elt(struct dynarray *p; int index)
    {
	return index < 0 || index >= p->count ? NULL : &(p->table[index]);
    }

int push_dyn_array(struct dynarray *p; element new_value)
    {
	if (p->count == p->limit) {
	    unsigned newlimit = p->limit*2;
	    element *table =
		(struct element *)realloc(p->table, newcount * sizeof *table);

	    if (table == NULL) return -1;
	    p->limit = newlimit, p->table = table;
	}
	p->table[p->count++] = element;
	return 0;
    }

These routines return -ve or NULL to indicate error.

It is easy to see that if you build a flexible array by doing
	struct dynarray flex;

	if (init_dyn_array(&flex, 1)) abort();
	...
	for (...) {
	    ...
	    if (push_dyn_array(&flex, x)) abort();
	    ...
	}
then the space required to hold N elements is at most 2*N*sizeof (element).
This is O(N), not O(1.5**N) as Steve Ryan claims.

If we assume that the cost of realloc(o,x) is at most O(x + size of o(x),
it is straightforward to show that the time cost is also O(N) (basically,
the latest doubling always dominates what has preceded it).