0x47e+barney not considered C

D. Hugh Redelmeier hugh at dgp.toronto.edu
Sun Jul 3 09:38:19 AEST 1988


In article <120200001 at hcx2> tom at hcx2.UUCP points out that under the
draft ANSI standard for C, preprocessor numbers are too greedy.
He gave the example of

	0x47e+barney

which is parsed as a preprocessor number, and then rejected when it
cannot be converted to a legitimate C token.  He is correct, and I
agree with him in considering this a mistake.  I too submitted a
comment on this in the last public review period, and mine too seems
to have been ignored (I have not received the response document).

In article <10413 at ulysses.homer.nj.att.com>, Jerry Schwarz says that
Tom's article insults his construct, the pp-number, and that Tom's
fix is bad.  Furthermore, Jerry thinks the problem does not warrant a
fix.

I think that the pp-number construct did clean up a mess (which I
too had been pointing out for a while).  But it can and should be
fixed to not break formerly valid and perfectly reasonable C
programs.

In article <8194 at brl-smoke.ARPA>, Doug Gwyn asks:

| Why do you think it so important for "0x47e" to be considered a
| preprocessing number token?  Just what is it that needs "fixing"?
| Is it that "0x47e" is supposed to be split into preprocessing tokens
| "0" and "x47e" (the second of which may be subject to macro
| replacement!) and in translation phase 7 they are not said to be
| spliced back together into a single (regular) token, so that it is
| impossible for an integer constant "0x47e" to ever be seen after
| phase 6?  If so, that does seem to me to be a problem, but it has
| nothing to do with "+barney" or with the final "e" on the constant;
| it's a generic problem for all hex constants (and was certainly not
| the committee's intention, so fixing this would presumably be
| considered editorial).

For me, the problem is that the +barney is absorbed into the hex
constant.  The + clearly ought to be a separate token, and so should
the barney.

Here is what I submitted to the committee in the previous round:

Page 33, line 36, Section 3.1.8:
preprocessor number too greedy (consider 0xABCDE+1)

The current rules for parsing preprocessor numbers are too greedy.
They are willing to match + or - after an e or E.  If the e came
from a floating point number, that is fine, but if it came from a
hexadecimal number, it is not.  Consider the following examples:

	0xABCDE+1
	0xABCDE+cat

	0xABCDEF+1
	0xABCDEF+cat

The first two lines used to be legitimate C expressions.  Now each
is a single pp-number that cannot be turned into a valid C token.  The
last two lines were and remain legitimate expressions, because the
F after the E keeps the + from being absorbed.
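
The greediness can be sketched in C.  This is only an illustration,
assuming the draft rule as described above: a pp-number starts with a
digit (or a period followed by a digit) and continues through digits,
letters, underscores, periods, and a sign that directly follows e or E.
The name draft_pp_number_len is hypothetical and does not come from the
draft.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: length of the pp-number at the start of s under
   the greedy rule sketched above, or 0 if s does not start one. */
size_t draft_pp_number_len(const char *s)
{
    size_t i;

    assert(s != NULL);

    if (isdigit((unsigned char)s[0]))
        i = 1;                          /* digit */
    else if (s[0] == '.' && isdigit((unsigned char)s[1]))
        i = 2;                          /* . digit */
    else
        return 0;

    for (;;) {
        unsigned char c = (unsigned char)s[i];

        if ((c == 'e' || c == 'E') && (s[i + 1] == '+' || s[i + 1] == '-'))
            i += 2;                     /* e sign or E sign, even after 0x */
        else if (isalnum(c) || c == '_' || c == '.')
            i += 1;                     /* digit, nondigit, or period */
        else
            return i;
    }
}
```

With this sketch, draft_pp_number_len("0xABCDE+1") covers the whole
string, while draft_pp_number_len("0xABCDEF+1") stops before the +.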

Although I think that the whole concept is wrong, it can be patched
up to solve this problem.

Proposed grammar:

pp-number:
	integer-constant
	floating-constant
	pp-number digit
	pp-number nondigit
	pp-number .

I find this definition intuitively appealing: it reflects what is
really going on.  Others may prefer one that is simpler to implement:

Alternate grammar:

pp-number:
	pp-floating-constant
	pp-number digit
	pp-number nondigit
	pp-number .

pp-floating-constant:
	digit
	. digit
	pp-floating-constant .
	pp-floating-constant digit
	pp-floating-constant e sign
	pp-floating-constant E sign

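Note that in pathological cases, these differ.  Consider:

	1.1.e+5

Under the proposed grammar the pp-number ends at 1.1.e, leaving + and 5
as separate tokens; under the alternate grammar the whole string is a
single pp-number.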
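
The alternate grammar is simple enough to scan by hand.  The following
is a minimal sketch, not a definitive implementation: it takes the
longest pp-floating-constant first (digits, periods, and e or E only
when a sign follows), then extends it with digits, nondigits, and
periods.  The name alt_pp_number_len is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: length of the pp-number at the start of s under
   the alternate grammar, or 0 if s does not start one. */
size_t alt_pp_number_len(const char *s)
{
    size_t i;

    assert(s != NULL);

    /* pp-floating-constant begins with digit or . digit */
    if (isdigit((unsigned char)s[0]))
        i = 1;
    else if (s[0] == '.' && isdigit((unsigned char)s[1]))
        i = 2;
    else
        return 0;

    /* Extend the pp-floating-constant: a sign is accepted only here,
       and only directly after e or E. */
    for (;;) {
        unsigned char c = (unsigned char)s[i];

        if (isdigit(c) || c == '.')
            i += 1;
        else if ((c == 'e' || c == 'E') && (s[i + 1] == '+' || s[i + 1] == '-'))
            i += 2;
        else
            break;
    }

    /* Remainder of the pp-number: digits, nondigits, periods; no sign. */
    while (isalnum((unsigned char)s[i]) || s[i] == '_' || s[i] == '.')
        i += 1;

    return i;
}
```

In this sketch the x of a hex constant ends the pp-floating-constant,
so alt_pp_number_len("0xABCDE+1") stops before the +, while
alt_pp_number_len("1e+5") and alt_pp_number_len("1.1.e+5") each cover
the whole string.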

----------------------------------------------------------------

Further notes on Doug's comments:

| P.S.  I don't think the committee was "too tired of arguing to
| do anything about it".  More likely the review subgroup that
| tackled your comments didn't fully understand the problem.  If I've
| correctly summarized it in the previous paragraph, then try an
| argument along those lines in your re-response.

As I understand it, most committee members saw most comments for the
first time during the meeting (I am a member; I got only the early
comments in a mailing).  Since the meeting is a very busy period,
most comments could not have been read by very many committee
members, and certainly not read very carefully.

| P.P.S.  I was the only committee member who voted against sending
| out the revised draft for the third public review, on the grounds
| that there had been insufficient time allotted to study second-
| round comments before responses were required.  This may be an
| example of that.  I do think the committee did a remarkably good
| job under the [self-imposed] circumstances.

I think that you put it well, and very diplomatically (perhaps too
diplomatically).

Hugh Redelmeier
{utcsri, utzoo, yunexus, hcr}!redvax!hugh
In desperation: hugh at csri.toronto.edu
+1 416 482 8253



