realloc

Fri Mar 31 16:52:30 AEST 1989

Several recent articles have made just about all the points
there are to make.  I apologize for having been imperfectly
informed of realloc's history; I had not realized that so many
de facto "standard" implementations did not provide the
additional functionality I described.

I need to respond to Gregg Wonderly, who wonders if I am

	one of those people that assumes (*((char *)NULL) == 0) too?
	This damn lazy programming and sorry excuses for not being
	defensive has got to stop.

and Phil Kos, who suggests that I

	be more careful in the future about which effects of
	library functions are required and which are arbitrary and unreliable.

and David Levine, who chastises me for

	depending on an undocumented feature.  You
	shouldn't be surprised when it breaks.

Rest assured that I am not "one of those people."

It happens that I learned about this odd behavior of realloc in
the first place when I was implementing a C run-time library, and
one of my users complained that my realloc didn't handle NULL.  A
similar exchange occurred, with the roles reversed.  (I hadn't
known about "extended" realloc at the time.)  I felt exactly the
same way -- that a programmer who needed this behavior was being
lazy -- and only changed my mind when I discovered that it was
both documented and very useful.

Unfortunately I can no longer determine which system's
documentation I originally read about it in.  It was probably 4.1x
or 4.2bsd (for x in [abc]).  I *never* depend on undocumented
behavior -- the fact that I once implemented a realloc that
handled NULL, and have since been relying on it in my code,
proves to me that I once saw it in documentation which I assumed
was definitive.  Apparently the secret was even better-kept than
I realized, since so many examples have been listed of systems
which neither provide nor document the extended behavior.

(Before you point out the folly of trusting bsd documentation,
let me point out that Berkeley used to add more features than it
documented, and that a Berkeley man page on a standard function
like realloc was likely to be copied directly from v7-based
antecedents with little change.  In fact, I thought I had
convinced myself that realloc(NULL, ...) dated back to v7, and
was therefore likely to be present in any v7-derived system, by
discovering that my pdp11 at home, which is v7-based without any
direct bsd influence, has a realloc that handles NULL.  I'll
defer to Henry Spencer's wider experience with v7, and now assume
that my machine in fact had its realloc "fixed" since v7.)

My reason for wanting a realloc that handles NULL is precisely
because of the clean, self-starting, idempotent algorithms it
permits.  Two good examples are a string function which handles
arbitrary-length arguments:

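	#include <ctype.h>

	char *strupper(str)		/* returns ptr to static data */
	char *str;			/* overwritten with each call */
	{
	static char *retbuf = NULL;
	static int retsize = 0;
	int len = strlen(str) + 1;	/* includes the trailing '\0' */
	int i;

	if(len > retsize)
		{
		/* on the first call retbuf is NULL; realloc must cope */
		char *new = realloc(retbuf, len);
		if(new == NULL)
			return NULL;	/* error handling problematical */
		retbuf = new;
		retsize = len;
		}

	/* copy str to retbuf, uppercasing; the '\0' is copied too */
	for(i = 0; i < len; i++)
		{
		int c = (unsigned char)str[i];
		retbuf[i] = islower(c) ? toupper(c) : c;
		}

	return retbuf;
	}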

or a function which stashes its argument in a data structure for
later use:

	int graph_title(gd, title)	/* returns 0, or -1 if out of memory */
	struct graph *gd;		/* graph descriptor */
	char *title;
	{
	char *new = realloc(gd->g_title, strlen(title) + 1);
	if(new == NULL)
		return -1;		/* old title, if any, is left intact */
	(void)strcpy(new, title);
	gd->g_title = new;
	return 0;
	}

Assuming realloc handles a NULL pointer argument, neither of
these subroutines requires any special-casing for the first call.
graph_title is nicely idempotent; it can be called multiple times
without ill effect.  (It assumes that the routine that allocated
graph descriptors initialized g_title to NULL.)

Certainly, if realloc were not guaranteed to handle NULL
pointers, I would provide a "wrapper" function around it which
did.  (Most of the time, I use a wrapper function anyway, to
centralize the error check.)  I'd rather not duplicate standard
functionality, though.  However, as has now been amply pointed
out, a realloc that handles NULL is anything but standard in the
pre-ANSI C world, and I am already adjusting my coding practices
to reflect this.  (I do like portable code; I'd rather not depend
on ANSI yet.)
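
A minimal sketch of such a wrapper follows.  The name xrealloc and
the policy of printing a message and exiting on failure are
assumptions made for illustration, not something taken from any
particular library.  It is written with ANSI prototypes so that it
compiles with current compilers; a pre-ANSI version would use
old-style definitions and char * in place of void *.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * xrealloc (hypothetical name): like realloc, except that
 *   - a NULL ptr is handled by calling malloc, so callers need no
 *     special case on the first call even if the library realloc
 *     does not accept NULL;
 *   - the out-of-memory check lives in one place.  Exiting here is
 *     an assumed policy; a real program might prefer to return NULL.
 */
void *xrealloc(void *ptr, size_t size)
{
	void *new_ptr;

	if (ptr == NULL)
		new_ptr = malloc(size);		/* nothing to resize yet */
	else
		new_ptr = realloc(ptr, size);

	if (new_ptr == NULL && size != 0) {
		fprintf(stderr, "xrealloc: out of memory (%lu bytes)\n",
			(unsigned long)size);
		exit(EXIT_FAILURE);
	}
	return new_ptr;
}
```

With this in place, a caller such as strupper above could write
retbuf = xrealloc(retbuf, len) and drop both the first-call worry
and the local error check.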

Finally, to assuage a few people's doubts that "well, I can see
how realloc(NULL, ...) might be useful, but having realloc(..., 0)
return NULL is GROSS," I'll point out that, for full
consistency and generality, both cases are equally necessary.
Suppose you have a pair of variables

	char *p = NULL;
	int size = 0;

defining a buffer which grows as necessary (using realloc, of
course, as in the first example above).  If the buffer can also
shrink, it seems sensible to have it return to its starting
condition if the size ever reaches 0: free all memory and
reset p to NULL.  If the size grows again, p will be correctly
reallocated to a "real" pointer again, anyway.  (Although hardly
an overriding concern, note that if malloc(0) or realloc(..., 0)
returns a non-NULL pointer, it will typically have some malloc
arena overhead behind it, consuming space, even though zero bytes
are available to the caller.)
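
For systems whose realloc does not behave this way, the same
grow-and-shrink cycle can be sketched with an explicit helper.  The
name buf_resize, its return convention, and the use of size_t rather
than int for the size are assumptions made for this illustration.  It
uses ANSI prototypes so that it compiles with current compilers.

```c
#include <stdlib.h>

/*
 * buf_resize (hypothetical helper): resize the buffer described by
 * *pp and *sizep.  A new size of 0 frees the buffer and restores the
 * starting state (NULL, 0); a NULL *pp is treated as an empty buffer.
 * Returns 0 on success, -1 on failure (buffer left unchanged).
 */
int buf_resize(char **pp, size_t *sizep, size_t newsize)
{
	char *new_p;

	if (newsize == 0) {
		if (*pp != NULL)	/* guard for libraries where free(NULL) is unsafe */
			free(*pp);
		*pp = NULL;
		*sizep = 0;
		return 0;
	}

	if (*pp == NULL)
		new_p = malloc(newsize);	/* starting state: allocate fresh */
	else
		new_p = realloc(*pp, newsize);
	if (new_p == NULL)
		return -1;

	*pp = new_p;
	*sizep = newsize;
	return 0;
}
```

Starting from p = NULL and size = 0, calling buf_resize(&p, &size, 16),
then buf_resize(&p, &size, 0), leaves p and size exactly as they
began, ready to grow again.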

                                            Steve Summit
                                            scs at adam.pika.mit.edu