fscanf bug in TC (BC) C++ 2.0

Thu Apr 18 12:15:15 AEST 1991

In article <26502 at hydra.gatech.EDU> jt2 at prism.gatech.EDU (TROSTEL,JOHN M) writes:
>>cn at allgfx.agi.oz (Con Neri) writes:
>>>We have been getting an error
>>>with a particular piece of code, namely
>>>	scanf: floating point formats not linked.
>>>	Abnormal Program termination.
>>>	Can some one shed some light on what this means? 
>i have found the same problem.  The way I worked around it was to declare
>a new float variable, say fl_var, and use it to read in my data.
>Anyone else figure this out more?  Anyone from Borland about to tell
>us how to fix this?

This issue faintly amazes me.  I can't believe that:

1. there are still people who have not heard about this problem,
2. Borland apparently still hasn't fixed it,
3. the problem exists in the first place, and
4. so many programs manage to elicit it.

(Neither 1 nor 4 is a flame; I'm just, as I say, faintly amazed.)

There's no need to speculate on this problem.  It is explained in
both the comp.lang.c and comp.sys.ibm.pc.misc FAQ lists.  Here's
the whole story (at least, as much of it as I know), for those
who care:

printf (like scanf, which is what the error message above names)
is actually a miniature interpreter, and only discovers at run
time which format specifiers appear in its format string.
Since the floating-point code for dealing with %e, %f, and %g is
substantial, and since many programs do not use floating point,
it's tempting to leave the floating-point code out for programs
which don't need it, especially on machines with limited address
spaces.  In fact, Ritchie's original PDP-11 C compiler did so,
albeit with considerably more success than does Turbo C.
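
To make the "miniature interpreter" point concrete, here is a
minimal sketch in C of a routine that can only learn which
conversions are wanted by walking the format string at run time.
The name format_needs_float is made up for illustration, and the
scanner is deliberately simplified (it skips flags, width,
precision, and length modifiers without validating them); it is
not taken from any real library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if fmt contains a %e, %f, or %g
 * conversion (either case), 0 otherwise. */
int format_needs_float(const char *fmt)
{
    assert(fmt != NULL);
    while (*fmt != '\0') {
        if (*fmt++ != '%')
            continue;           /* ordinary character, keep going */
        if (*fmt == '%') {      /* "%%" is a literal percent sign */
            fmt++;
            continue;
        }
        /* Skip flags, width, precision, and length modifiers. */
        while (*fmt != '\0' && strchr("-+ #0123456789.*hlL", *fmt) != NULL)
            fmt++;
        /* The next character is the conversion itself. */
        if (*fmt != '\0' && strchr("eEfgG", *fmt) != NULL)
            return 1;
    }
    return 0;
}
```

Since the argument is an ordinary string, it could just as well
have been read from a file or built up at run time, which is why
no compiler can answer this question in general.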

The basic idea is that there are two copies of the conceptual
equivalent of printf.obj (printf.o for us Unix fans) lying
around: one which handles %e, %f, and %g, and one which doesn't.
The compiler communicates with the linker somehow, informing it
whether the program is using floating point or not, and whether
the full-blown or truncated printf code should be linked.  (Note
that the algorithm is imperfect: if the program isn't using
floating point, %e, %f, and %g can't be needed, but they might
not be needed if the program is using floating point, either.
However, "program uses floating point" is in principle computable
at compile time, while "%e, %f, or %g might get passed to printf"
isn't.)
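
The two-copies scheme can be imitated in portable C, with a
function pointer standing in for the linker's choice.  This is
only a simulation under my own assumptions: the names
float_conv_fn, convert_float_stub, convert_float_full, and
format_one are invented here, and a real library makes the
choice at link time, not through a pointer.  It uses C99
snprintf so that it compiles on a current compiler.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical signature for the "format one double" routine. */
typedef int (*float_conv_fn)(char *out, size_t outsize, double value);

/* Stand-in for the truncated library: no floating-point formatting,
 * only a complaint. */
int convert_float_stub(char *out, size_t outsize, double value)
{
    (void)value;
    snprintf(out, outsize, "floating point formats not linked");
    return -1;
}

/* Stand-in for the full library: does the real conversion. */
int convert_float_full(char *out, size_t outsize, double value)
{
    snprintf(out, outsize, "%f", value);
    return 0;
}

/* The shared formatter core calls whichever routine was "linked". */
int format_one(float_conv_fn linked, char *out, size_t outsize, double value)
{
    assert(linked != NULL && out != NULL && outsize > 0);
    return linked(out, outsize, value);
}
```

Passing convert_float_stub plays the part of a program that the
compiler judged not to use floating point; passing
convert_float_full plays the part of one that it judged did.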

How does the compiler determine that a "program uses floating
point"?  Ritchie's compiler asserts that a program uses floating
point if a variable is declared as float or double (or, as I
recall, a pointer to same), and if that variable is then used.
(Even this heuristic isn't perfect; Doug Gwyn claims to have
augmented it to handle a few more, really obscure cases, but I
don't know the details -- perhaps they involve casts.)  Turbo C
apparently asserts that a program uses floating point only if the
program actually calls for floating point arithmetic.
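
Here is a sketch of the kind of code that, going by the
description above, would slip past a heuristic keyed on
floating-point arithmetic: it declares and uses a float, but
only hands it to sscanf and snprintf.  The function names are
invented for illustration, snprintf is C99 (used so the sketch
compiles on a current compiler), and I have not checked this
against any particular Turbo C release.  The second function
shows the shape of the workaround as I understand it, namely
some genuine floating-point arithmetic placed somewhere in the
program; whether that is sufficient for a given release is an
assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Reads a float from text and formats it back out, with no
 * floating-point arithmetic anywhere in between. */
int echo_float(const char *text, char *out, size_t outsize)
{
    float value;

    assert(text != NULL && out != NULL && outsize > 0);
    if (sscanf(text, "%f", &value) != 1)
        return -1;                          /* no number found */
    snprintf(out, outsize, "%.2f", value);  /* float promotes to double */
    return 0;
}

/* Hypothetical workaround: do some real floating-point arithmetic so
 * that an arithmetic-only heuristic has something to notice.  The
 * volatile qualifier keeps the multiplication from being folded away. */
double force_float_support(void)
{
    volatile double x = 1.0;
    return x * 2.0;
}
```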

I don't know why Turbo C uses such an obviously inadequate
heuristic, particularly when a simple, correct one exists (and
was used by a highly visible, 15-year-old compiler).  It may be
that the printf float/nofloat decision is driven by the (equally
broken) PC floating-point-via-emulator/coprocessor/both/neither
distinction, rather than by a "magic" extra undefined external
(which is how Ritchie's compiler did it, with the symbol __fltused).
I'm sure that the folks on comp.os.msdos.programmer (where this
discussion really belongs, and to where followups have been
redirected) could provide more information.

(I have heard that recent releases of Turbo C finally manage to
correct this problem.  I would appreciate any confirmation of
this rumor.)

The remaining puzzle is why so many programs are bitten by this
bug.  How many real programs read floating point values in and
printf them back out (or printf compile-time floating point
constants) without doing any arithmetic on them?  (This is not to
blame the victim, or to excuse Borland for having the bug.  The
test programs which are used to demonstrate the bug are always
stripped down, as bug-demonstrating test programs should be.  I
wonder why the longer programs from which the test programs were
stripped down managed not to do enough floating-point arithmetic
to trigger proper linking?  Perhaps the problem only comes up
when people write little test programs to play with printf
floating-point specifiers, and there are enough of those little
test programs to account for the frequency of the question.)

                                            Steve Summit
                                            scs at adam.mit.edu