assigning an integer to the negation of ...

Wed Dec 21 02:47:55 AEST 1988

In article <1911 at pembina.UUCP> lake at alberta.UUCP (Robert Lake) writes:
>		i = -(unsigned short)j;
[where i is an int and j is 1---j's type is irrelevant]

To figure out what one `should' get, follow the rules:

	op		value		type
	--		-----		----
1a.	j		1		lvalue, int
1b.	expansion	1		rvalue, int
2a.	(u_short)	1		temporary, u_short	[note 1]
2b.	expansion	1		rvalue, int or u_int	[note 2]
3.	-		-1|-(u_int)1	rvalue, int or u_int
4a.	i=		-1		temporary, int
4b.	expansion	-1		rvalue, int

	Notes:
	1: u_T is a shorthand for `unsigned T'
	2: u_int under `unsigned-preserving' rules; int or u_int
	   under dpANS `value-preserving' rules, depending on whether
	   sizeof(int) > sizeof(short); on Suns it would be int

So the correct answer is -1, not 65535, under either set of rules.
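
As a rough check on a present-day compiler, the sketch below walks the
same steps under the value-preserving rules.  The function names
promoted_type_name and negate_as_ushort are made up for this
illustration, and _Generic assumes a C11 compiler.  Where int is wider
than short, the expanded type should be int and the stored value -1.

```c
/* Hypothetical helper: names the type that a u_short operand expands to
 * on this compiler (step 2b in the table).  Assumes C11 for _Generic. */
const char *promoted_type_name(void)
{
    /* Unary plus applies the same expansion as unary minus. */
    return _Generic(+(unsigned short)1,
                    int: "int",
                    unsigned int: "unsigned int",
                    default: "other");
}

/* Hypothetical helper wrapping the expression from the article. */
int negate_as_ushort(int j)
{
    int i;

    i = -(unsigned short)j;   /* cast, expand, negate, then assign */
    return i;
}
```

Calling negate_as_ushort(1) and printing the result with %d is enough
to compare a given compiler against the table.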

>If I run this program on a VAX 11/780 using 4.3 BSD, I obtain -1 as the
>answer.  However, if I run this on a SUN using SUN OS 3.5, I obtain 65535
>as the answer.  Who is right?

4.3BSD got it right; SunOS got it wrong (in the name of optimisation :-) ).

So what is going on in the table above?

Every time C uses a value for some operation, the value should be an
`rvalue'.  What it starts with might instead be an lvalue or a
`temporary'.  (The result of an assignment, casts included, is a
`temporary'; I made up the notion just now, to provide a placeholder
for the expressions that are neither lvalues nor properly-expanded
rvalues.)  If the value is not already a properly expanded rvalue, it
is expanded, either according to unsigned-preserving rules (in the
table below) or value-preserving rules (which cannot be listed except
for specific compiler systems, since they depend on the number of bits
in each type).

	original	expansion
    (lvalue or temp)
	--------	---------
	signed char	int
	u_char		u_int
	short		int
	u_short		u_int
	int		int		(already proper)
	u_int		u_int		(already proper)
	long		long		(already proper)
	u_long		u_long		(already proper)

Each expansion does the `obvious' thing: if the expansion is from
signed, any new high-order bits in the expanded signed version come
about by sign extension; if from unsigned, new high-order bits are
zeroes.  (This holds true in both expansion systems.)
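
A minimal sketch of the two kinds of expansion, using 8-bit operands.
The names expand_signed and expand_unsigned are hypothetical; the
u_int return type follows the unsigned-preserving table, but the value
produced is the same under either rule set.

```c
/* Expansion from signed: new high-order bits copy the sign bit,
 * so the numeric value is unchanged. */
int expand_signed(signed char c)
{
    return c;
}

/* Expansion from unsigned: new high-order bits are zeroes,
 * so the numeric value is again unchanged. */
unsigned int expand_unsigned(unsigned char c)
{
    return c;
}
```

With these, expand_signed(-1) stays -1 (all bits set in the wider
type), while expand_unsigned(0xff) is 255 (only the low 8 bits set).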

So why does the Sun compiler produce 65535?

All the expansions above are expensive on some machines---including
680x0s, where it takes up to two instructions per expansion, and
possibly a temporary register.  If the compiler can prove to itself
that the expansion has no effect, it can suppress it.  For instance,
if the assignment were:

	u_short j;
	j = -(u_short)(1);

we would have the sequence (unsigned-preserving rules)

	1		(1, int, rvalue)
	(u_short)	(1, u_short, temp)
	expand		(1, u_int, rvalue)
	-		(0xffffffff, u_int, rvalue)
	j=		(0xffff, u_short, temp)

which puts 65535 in j.  The expansion had no effect on the answer:
without it, we have

	1		(1, int, rvalue)
	(u_short)	(1, u_short, temp => fake rvalue)
	-		(0xffff, u_short, fake rvalue)
	j=		(0xffff, u_short, temp)
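
The two sequences above can be modelled with fixed-width types.  This
is only a sketch: uint16_t and uint32_t are assumed stand-ins for
u_short and u_int on a machine with 16-bit shorts and 32-bit ints, and
the function names are invented for the illustration.

```c
#include <stdint.h>

/* With the expansion: widen to 32 bits, negate, truncate on assignment. */
uint16_t negate_expanded(uint16_t x)
{
    uint32_t wide = x;            /* expand: zero-extend to u_int */
    uint32_t neg = 0u - wide;     /* for x == 1 this is 0xffffffff */

    return (uint16_t)neg;         /* j= keeps only the low 16 bits */
}

/* Without the expansion: model a 16-bit negate by keeping only the
 * low 16 bits of the result. */
uint16_t negate_unexpanded(uint16_t x)
{
    return (uint16_t)(0u - x);    /* for x == 1 this is 0xffff */
}
```

Both functions return 0xffff for an argument of 1, and they agree for
every other 16-bit input too, which is why dropping the expansion is
safe when the destination is a u_short.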

The SunOS 3.5 compiler incorrectly deduces that the expansion has no
effect in the original expression as well (it forgets to look at the
LHS of the assignment, which there is an int rather than a u_short),
so it drops the expansion from the expression tree and gets the wrong
answer.
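
To see why the destination type matters, here is a sketch of both
orderings when the result lands in a 32-bit int.  It models the effect
only; it is not the code the SunOS compiler generates.  The fixed-width
types stand in for u_short, u_int and int on a 32-bit machine, and all
function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: read 32 bits as a two's-complement int without
 * relying on implementation-defined conversion. */
static int32_t to_int32(uint32_t v)
{
    return v <= INT32_MAX ? (int32_t)v : -(int32_t)(UINT32_MAX - v) - 1;
}

/* Correct order: expand to 32 bits, negate, then assign to the int. */
int32_t assign_expanded(uint16_t x)
{
    uint32_t wide = x;

    return to_int32(0u - wide);          /* x == 1: 0xffffffff, i.e. -1 */
}

/* Expansion dropped: negate in 16 bits, then zero-extend into the int. */
int32_t assign_unexpanded(uint16_t x)
{
    uint16_t neg = (uint16_t)(0u - x);   /* x == 1: 0xffff */

    return (int32_t)neg;                 /* high bits are zero: 65535 */
}
```

For an argument of 1 the first function gives -1 and the second gives
65535, matching the 4.3BSD and SunOS 3.5 results respectively.
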
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris at mimsy.umd.edu	Path:	uunet!mimsy!chris