strcmp

Chris Torek torek at elf.ee.lbl.gov
Thu Jun 20 05:33:22 AEST 1991


In article <14421 at dog.ee.lbl.gov> I wrote:
>	int strcmp(char *s1, char *s2) {

This should be

	int strcmp(const char *s1, const char *s2) {

Thanks to Matthew Farwell <dylan at ibmpcug.co.uk> for pointing this out.

>		return (*(unsigned char *)--s1 - *(unsigned char *)s2);
>... this assumes that the subtraction will not overflow.

The word `overflow' is wrong.  The subtraction cannot overflow.  There
are several cases:

	1. K&R C: `unsigned char's widen to `unsigned int's.
	   Then the subtraction is done in unsigned arithmetic,
	   which is modular arithmetic in 2^(number of bits).

	2. ANSI C:
	   2a:	sizeof(char) < sizeof(int).  Then `unsigned char's
		widen to signed `int's that are nonetheless
		nonnegative.  If the underlying system is 2's
		complement, 1's complement, or sign-magnitude, the
		subtraction will not overflow.  (If it is something
		else, I am not sure what happens.  With any luck,
		it does not overflow.)

	   2b:	sizeof(char) == sizeof(int).  Then `unsigned char's
		widen to `unsigned int's, and the subtraction is
		done in unsigned arithmetic, just as in K&R C.

(Note the complications in 2a, due to the so-called `value preserving'
rules, which are Wrong, but are now Engraved in Stone.  Oh well.)

The complication I was onsidering in <14421 at dog.ee.lbl.gov> occurs
in case 2b.  Suppose, for instance, that char and int are both 16
bits, and that we have two strings made up of characters (32761,0)
and (1,0) respectively.  Then the comparison will return

	(int)((unsigned char)32761 - (unsigned char)1)

(the `int' cast is provided by the `return').  This will be equal to

	(int)((unsigned)32761 - (unsigned)1)

or

	(int)((unsigned)32760)

or, in 2's complement, -8.  A return value of -8 says that s1 < s2, yet
32761 > 1.  This is what I called `overflow' above.  I am not sure *what*
to call it, but `overflow' is wrong.

Thanks to Lasse H. Ostergaard <lhoe at id.dth.dk> for noting this.
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek at ee.lbl.gov



More information about the Comp.lang.c mailing list