Evaluation of if's

Thu Jun 6 02:41:04 AEST 1991

In article <1991Jun5.014758.10616 at wdl1.wdl.loral.com> bard at cutter.ssd.loral.com (J H Woodyatt) writes:
>In article <1991Jun4.233928.5185 at athena.mit.edu>, scs at adam.mit.edu (Steve Summit) writes:
>|> Of course it's undefined.  (It contains two side effects not
>|> separated by a sequence point.)
>
>Is this really undefined by ANSI? I suppose we need someone with the STANDARD to
>resolve this.

I *do* have a copy of the Standard.  In a (futile) attempt to
head this discussion off early, I posted my followup from memory
rather than waiting until I got home and could quote chapter and
verse out of my copy.  (Several people have since done so.)

I did mention, and you seem to have ignored, my justification
(which paraphrases the Standard): "...It contains two side
effects not separated by a sequence point."

>Which leads to an interesting question about portable programming. What fringy
>kinds of things in a standard can one expect compilers to sometimes overlook and
>thus be classifiable in the category of `Bad Programming Practice.'

This was just mentioned, but it's worth mentioning again:
depending on any unspecified or undefined aspects of a language
is certainly in the category of "bad programming practice."
(Actually, in my own work, I also avoid everything I feel is
"fringy," but I realize that's somewhat harder to quantify
objectively.  As an example, I never depended on free() leaving
the contents of freed memory undisturbed, even though old
versions of the malloc man page asserted that it did.)
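
A minimal sketch of that habit, using a hypothetical list type and
function name (struct node and free_list are illustrations, not
anything from this thread): the link is copied out of each node
before the node is freed, so nothing is ever read from freed memory.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical singly linked list node, for illustration only. */
struct node {
    int value;
    struct node *next;
};

/* Frees every node in the list and returns how many were freed. */
size_t free_list(struct node *head)
{
    size_t count = 0;
    struct node *next;

    while (head != NULL) {
        next = head->next;  /* save the link while the node is still valid */
        free(head);         /* after this, *head must not be examined */
        head = next;
        count++;
    }
    return count;
}
```

The tempting alternative, free(head) followed by head = head->next,
is exactly the dependence on undisturbed freed memory described
above.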

K&R is loaded with good advice, although they don't hit you over
the head with a brick by setting it off in a sidebar with a
header like "Brian & Dennis's Programming Tip #42."  Once again,
as they say, "if you don't know how [undefined things] are done
on various machines, that innocence may help to protect you."

In article <1991Jun5.123931.22105 at grebyn.com> ckp at grebyn.com writes:
>You see, I think the (i=1) should evaluate to the value 1, and the (i=2)
>should evaluate to the value 2, regardless of the order in which they're
>performed...  The variable i is not fetched during the entire
>statement, so the order in which the variable is stored should not
>matter.  Unless someone thinks the compiler should evaluate (i = 1) by
>storing 1 in i, then fetch that value back some time later for the
>comparison, but I seem to recall some ANSI rules about how many times a
>storage object may be touched between sequence points.

The rules about how many times a storage object may be "touched"
between sequence points (as I recall; my copy of the Standard is
still at home) cover: how often a programmer may attempt (in
source code) to modify an object, how often a programmer may
attempt to access a volatile object, and what code the compiler
may generate for volatile objects.  They do not make

	(i = 1) == (i = 2)

any less undefined.
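
If the values of both assignments really are wanted, a well-defined
version gives each assignment its own full expression.  A minimal
sketch (compare_assignments is a hypothetical name chosen for this
example):

```c
#include <assert.h>

/* Compares the values of two assignments to the same object, with
 * each assignment placed in its own full expression. */
int compare_assignments(int *ip)
{
    int first, second;

    first = (*ip = 1);   /* sequence point at the end of this statement */
    second = (*ip = 2);  /* and another one here */

    assert(*ip == 2);    /* the later store is the one that remains */
    return first == second;
}
```

Here the two stores are separated by a sequence point, so the
function returns 0 and leaves 2 in the object, on every conforming
compiler.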

In fact, it's not at all unreasonable for the compiler to
"evaluate (i = 1) by storing 1 in i, then fetch that value back
some time later" (this issue was discussed a few months ago,
either here or on comp.std.c).  Remember that the value of an
assignment expression is the value of the left operand after the
assignment, including the conversion to the type of the
left-hand side.  If we had the code

	char c1, c2;
	c1 = c2 = 5.;

I would certainly expect the compiler to emit code to convert the
value five into a char value, placing the result into c2, and
then fetch c2 and place it into c1, rather than doing the
conversion twice.  Let's try it:

	Script started on Wed Jun  5 12:12:26 1991
	adam> /lib/ccom
	f()
	{
	char c1, c2;
	c1 = c2 = 5.;
		cvtlb	$5,-2(fp)
		movb	-2(fp),-1(fp)
	}
		ret

This is on a VAX (which has nice readable instruction mnemonics);
I've elided the subroutine preamble from the output.  The
compiler didn't bother to emit a floating-point constant or a
run-time floating-point conversion, but rather a longword
(integral) 5 and a convert-longword-to-byte instruction.  As
expected, it then re-fetches c2 (stored at -2(fp)) before moving
the (byte) value to c1 (-1(fp)).
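
The same rule can be checked from portable C without reading
assembly.  A small sketch, assuming a hypothetical helper named
chained_assign and an argument whose integer part fits in an int:

```c
#include <assert.h>

/* Returns the value that reaches d in the chained assignment
 * d = i = x.  Assumes the integer part of x fits in an int. */
double chained_assign(double x)
{
    int i;
    double d;

    d = i = x;  /* i = x converts x to int; d receives that int value */

    assert(d == (double)i);
    return d;
}
```

chained_assign(3.7) returns 3.0, not 3.7: the value handed on to d
is the value of i after the store, already converted to int.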

There's nothing magic or devious going on here.  pcc is not an
aggressively-optimizing compiler; this fetch-after-store is in
fact a very natural way for a compiler to handle the value of an
assignment expression.  (A C interpreter I once wrote did the same
thing, although there's a note in the source code indicating that
it caused me problems when I was attempting to assign to device
registers with funny read/write semantics, particularly because
the interpreter always fetched the result of an assignment
whether it was needed or not.)
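
One way to sidestep that kind of trouble in C source is never to ask
for the value of an assignment to such a register.  A sketch only:
write_reg and shadow are hypothetical names, and a volatile unsigned
char stands in for whatever the real device register would be.

```c
/* Stores v to a device register and keeps a copy in *shadow.
 * The copy is taken from v, never from the register, so the code
 * does not depend on what a read of the register would return. */
void write_reg(volatile unsigned char *reg, unsigned char *shadow,
               unsigned char v)
{
    *reg = v;     /* a statement of its own; its value is not used */
    *shadow = v;  /* copy comes from v, not from *reg */
}
```

Writing *shadow = *reg = v instead would leave it to the compiler
whether the register is read back to obtain the value being copied.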

Of course, we really don't need to think in detail about how
assignments are performed to answer the original question.  Since
the expression in question modifies the same object twice between
sequence points, its behavior is undefined [note 1], and the
compiler is free to <insert favorite ludicrous prove-a-point
weird undefined compiler behavior>, no matter how "unreasonable."

                                            Steve Summit
                                            scs at adam.mit.edu

Note 1 (again from memory): the behavior of

	(i = 1) == (i = 2)

is undefined, though it is not required to elicit a diagnostic,
because it violates a "shall" statement outside of a constraint.

Anyone who desires references for any of the assertions made in
this article may send me mail; I'll respond to them when I have a
copy of the Standard handy.