DEC VAX C (unsigned short compared to int)

Sun Mar 1 16:33:56 AEST 1987

Ted Marshall (ted at blia.BLI.COM) writes:
> We have found a difference between the code generated by DEC VAX C versions
> 1.5 and 2.1. ...
> 
> The difference occurs when you compare an unsigned short against a negative
> value without casting the short as a signed short. For example:
> 	unsigned short x;
> 	subroutine(&x);	/* subroutine writes back a value through the ptr */
> 	if (x < -10)	/* should read "if (((short) x) < -10)" */
> Version 1.5 generates a CVTWL and a CMPL for the if and thus the value in
> x is sign extended and everything works anyway. Version 2.1 generates a
> MOVZWL instead of the CVTWL and so it isn't sign extended and the condition
> is never true.

Then there was a bug in version 1.5, and version 2.1 is doing the right
thing.  And you were lucky that the constant was negative: the bug would
presumably still have affected the code had the constant been positive,
but it would have been a lot harder to detect, because the condition
would not have been degenerate!

The type of -10 is int and the type of x is, of course, unsigned short.
By the Oct. 1 ANSI draft, we have:

	3.3.8:	If both of the operands [of <] have arithmetic type,
		the usual arithmetic conversions are performed.

	3.2.1.5 [Usual arithmetic conversions]:
		[If neither operand has a floating type or type
		unsigned long int or long int], the integral promotions
		are performed.

	3.2.1.1: In all cases the value is converted to int if an int
		can represent all values of the original type.  [This
		and certain other conversions] are called the integral
		promotions.

	3.2.1.2: When an unsigned integer is promoted to a longer
		integer, its value is unchanged.

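All of which says that x is converted to int (on the VAX an int is 32
bits and so can represent every unsigned short value) and must not be
sign extended in the process.  That is exactly what the MOVZWL
generated by version 2.1 does.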
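
To make the two behaviours concrete, here is a minimal sketch.  The
function names are hypothetical (they are not from the original code),
and the comments assume a 16-bit short and a wider int, as on the VAX.
The first function compares the way the rules quoted above require;
the second mimics what version 1.5 effectively did.

```c
/* Hypothetical helpers for illustration; not from the original code. */

/* Compares as the language rules require: x is promoted to int with
   its value unchanged (zero extension), assuming int can represent
   every unsigned short value. */
int less_than(unsigned short x, int limit)
{
    return x < limit;
}

/* Compares after converting x to a signed short first, which is what
   the CVTWL from version 1.5 amounted to.  The result of (short)x for
   values above SHRT_MAX is implementation-defined. */
int less_than_as_signed(unsigned short x, int limit)
{
    return (short)x < limit;
}
```

Under those assumptions, less_than(0xFFF0, -10) yields 0, while
less_than_as_signed(0xFFF0, -10) yields 1 on a 2's complement machine.
The two also disagree for a positive limit such as 10, which is the
harder-to-detect case.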

In the days of K&R there were no unsigned shorts, but if you instead
consider the combination of an unsigned int with a long (assuming that
int is shorter than long), you get the same rule: no sign extension.
The text corresponding to the above, with that change, can be found in
sections 7.6, 6.6, and 6.5 of Appendix A to K&R.
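
The unsigned int versus long case can be sketched the same way.  The
function names here are hypothetical, and the outcome depends on
whether long can represent every unsigned int value on the machine at
hand, so the sketch reports that too rather than assuming it.

```c
#include <limits.h>

/* Hypothetical helper: nonzero if long can hold every unsigned int
   value, i.e. int is effectively shorter than long. */
int long_holds_all_uint(void)
{
    return LONG_MAX >= UINT_MAX;
}

/* Compares an unsigned int with a long.  If long can hold every
   unsigned int value, x is converted to long with its value unchanged
   (no sign extension), so a large x is never less than a negative
   limit.  Otherwise both operands are converted to unsigned long. */
int uint_less_than_long(unsigned int x, long limit)
{
    return x < limit;
}
```

Where int is shorter than long, uint_less_than_long(UINT_MAX - 15u, -10L)
yields 0, matching the unsigned short case above.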

> (BTW, the reason this was in the code was that the value could
> be interpreted as either signed or unsigned depending on other conditions.)

A safer way to do this, of course, is to declare the type of x as
"union {unsigned short un; short si;}".  This forces you to say
"x.si" or "x.un" each time you use it, and reduces the chance of
errors like these.

In addition, this improves the portability of the code.
The conversion (short)x may do other things besides interpreting
the bits in a different fashion; on a machine that, unlike the VAX,
isn't of the common 2's complement type, it WILL do other things.
Most likely you did just mean to reinterpret the bits, in which case
a union is what you really want.  Of course, you then have to be
careful to ASSIGN the value to x.si in a signed context or x.un in
an unsigned context, as well as reading from the right one.
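
A minimal sketch of the union approach follows.  The type name word16
and the two function names are hypothetical; only the member names un
and si come from the declaration suggested above.  Reading a member
other than the one last stored reinterprets the bits, so what you get
depends on the machine's representation of signed values.

```c
/* Hypothetical type name; members as in the suggested declaration. */
union word16 {
    unsigned short un;   /* use when the value is to be taken as unsigned */
    short si;            /* use when the value is to be taken as signed */
};

/* Signed interpretation: x.si is promoted to int with sign extension. */
int signed_less_than(union word16 x, int limit)
{
    return x.si < limit;
}

/* Unsigned interpretation: x.un is promoted to int with its value
   unchanged, assuming int can hold every unsigned short value. */
int unsigned_less_than(union word16 x, int limit)
{
    return x.un < limit;
}
```

With a 16-bit short on a 2's complement machine, storing 0xFFF0 in
x.un makes signed_less_than(x, -10) yield 1 and
unsigned_less_than(x, -10) yield 0, and each call site says which
interpretation it wants.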

Mark Brader, utzoo!sq!msb			C unions never strike!