structure and array and string comparisons

Tim Maroney tim at unc.UUCP
Tue Mar 20 14:52:51 AEST 1984


>  If we do ANYTHING about this, we should do it right.  This means
>  that we can't embed "zero byte means end" into the language -- a
>  character "string" is simply an array of chars, and don't you
>  forget it.  In any case, we can't do array comparison, since an
>  array is by definition converted to a pointer to its first
>  element every time it's used -- there's no way to refer to the
>  whole array except in sizeof.  When I want to pass an array as an
>  argument, I typically enclose it in a struct.  The flexibility of
>  pointer==array==pointer is a big win but it has some minor losses
>  like this.

From the C Reference Manual, Sec. 2.5:  "The compiler places a null byte \0
at the end of each string so that programs which scan the string can find
its end."  Thus "zero byte means end" is already written into the language.
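
To make the point concrete, here is a minimal sketch in present-day C of what that
sentence implies.  The function names scan_length and literal_size are made up for
illustration; nothing here comes from the Reference Manual itself.

```c
#include <stddef.h>

/* Hypothetical helper: walk the string until the null byte, which is
   how a program "scanning the string" finds its end. */
size_t scan_length(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* The literal "abc" occupies four bytes: 'a', 'b', 'c', and the '\0'
   that the compiler places after them. */
size_t literal_size(void)
{
    return sizeof "abc";
}
```

scan_length counts only the characters before the terminator, while sizeof on the
literal also counts the terminator the compiler added.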

It is true that, as things stand, array-typed identifiers are automatically
converted to pointers in expressions.  The type of the identifier is still
"array", though.  (C Ref. Man., 14.3: "Every time an identifier of array
type appears in an expression, it is converted into a pointer to the first
member of the array.")  I can see no strong objection to making an exception
to this rule for the operands of new operators that we add to the language,
which was my first suggestion.
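
A small sketch of the current behavior, written in present-day C.  The function
names are hypothetical.  The pointer conversion is spelled out with explicit
pointer variables, since that is what the compiler does to each array operand
before an == or != is evaluated.

```c
#include <stddef.h>

/* sizeof is the one place where the identifier still denotes the whole
   array, so the element count can be recovered from it. */
size_t element_count(void)
{
    int a[3] = {1, 2, 3};
    return sizeof a / sizeof a[0];
}

/* Two arrays with identical contents.  In a comparison each identifier
   becomes a pointer to its first element, so what gets compared is the
   two addresses, and those differ.  Returns 1. */
int equal_contents_compare_unequal(void)
{
    int a[3] = {1, 2, 3};
    int b[3] = {1, 2, 3};
    int *pa = a;    /* the conversion, written out */
    int *pb = b;
    return pa != pb;
}
```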

It would be just as easy, but not backward compatible, to change the rules
and use the existing operators, saying that arrays are not converted to
pointers when they are the operands of the comparison operators; instead,
the compiler will produce code to compare the arrays elementwise.  This
avoids adding new elementwise comparison operators, but it means that old
code which compares array identifiers to something would have to be changed
to include type casts to turn the array into a pointer.

In either case, there are two possible ways to determine where the end of
an array is.  If the identifier is a fixed-size array, then you know how
many elements there are at compile time, and you can just SOB or AOBLSS or
whatever to do a fixed number of comparisons.  If it is an array that was
declared with a null constant expression (that is, with empty brackets), it
can be assumed that the array is null-terminated.  Thus to compare strings
you would store their addresses in variables declared as, for instance,
"char StringVar[]".  To compare complex numbers, you would declare the
complex variables like:

float complexVar[2];

or, more cleanly,

typedef float complex[2];
complex complexVar;

and the compiler would generate the array comparison code when you compared
two identifiers of type "complex".  This is a side effect of providing array
comparison that I hadn't realized at first; I should note that this only
works the way you want for equality comparison, not ordering.
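
As a rough sketch of what the compiler would have to generate in the two cases,
here are hand-written stand-ins in present-day C.  The functions complex_equal and
string_equal are hypothetical; the proposal is that the compiler would emit the
equivalent code inline for a comparison of two array operands.

```c
#include <string.h>

typedef float complex[2];   /* as above: real part, imaginary part */

/* Fixed-size case: the element count is known at compile time, so the
   comparison is a fixed number of element tests. */
int complex_equal(const float *a, const float *b)
{
    int i;
    for (i = 0; i < 2; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}

/* Unsized case: the array is assumed to be null-terminated, so the
   comparison runs until the terminator.  strcmp already does this. */
int string_equal(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

An elementwise "less than" built the same way would give a lexicographic order on
the two floats, which is fine for strings but is not a meaningful ordering of
complex numbers.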

The idea of structure comparison also points to adding array comparison.
It's certain that if you are doing a memberwise comparison of structures
with array-typed fields, you don't want to compare the addresses of the
arrays: they will always be different if the structs are not the same block
of storage (a case you should always check for first anyway, if it can
happen, because the check is so much faster when it succeeds).
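
A sketch of such a memberwise comparison, written by hand in present-day C.  The
struct record and the function record_equal are invented for illustration; the
array member is compared element by element, after the cheap same-storage check.

```c
#include <stddef.h>

/* Hypothetical structure with one scalar and one array-typed member. */
struct record {
    int  id;
    char name[8];
};

int record_equal(const struct record *s1, const struct record *s2)
{
    size_t i;

    /* Same block of storage: equal without looking at any member. */
    if (s1 == s2)
        return 1;

    if (s1->id != s2->id)
        return 0;

    /* Fixed-size array member: compare the elements, not the addresses
       (the addresses always differ for distinct structs). */
    for (i = 0; i < sizeof s1->name; i++)
        if (s1->name[i] != s2->name[i])
            return 0;

    return 1;
}
```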

>  I don't propose comparing unions, or following pointers.  The
>  comparison s1 == s2 should be equivalent, except for storage
>  reference pattern, to:
>
>     ( (s1.memb1 == s2.memb1) && (s1.memb2 == s2.memb2) && ...)

I agree with you, although it might be nice to be able to specify that a
pointer should be followed -- that comes perilously close to creeping
featurism, though, and I have already listed some problems with it in my
previous article, so I'll just shut up about it....

However, it is interesting to note that if the semantics of array comparison
were changed as I suggested, then you would get the proper comparison of
array-typed structure members.
--
Tim Maroney, The Censored Hacker
mcnc!unc!tim (USENET), tim.unc at csnet-relay (ARPA)

All opinions expressed herein are completely my own, so don't go assuming
that anyone else at UNC feels the same way.


