Source for ASCII to FLOAT

Andrew Koenig ark at alice.UUCP
Sun Jul 30 01:08:02 AEST 1989


In article <9699 at alice.UUCP>, andrew at alice.UUCP (Andrew Hume) writes:

> it seems to me the discussion of an atof is going in the wrong
> direction. it is nice to insist that 1.0 and 0.1e1 evaluate
> to the same bit pattern but really what we want, at least what i want,
> is an accurate atof. if you consider the real number line, the set
> of possible values of double (or whatever precision) are just a bunch
> of points on this line. then atof should return the closest such point
> to the real represented by the string. (ties should
> probably be broken systematically.)

> 	in summary, get someone who really knows what is going on
> (not me certainly) to figure it out; buggering around with
> half-hearted hacks to preserve some precious bits of accuracy
> is simply a waste of time, however well intentioned.

Well, I sort of know what's going on, having written the
assembly-language atof() that's been in use on VAX machines
for the past ten years.

Ideally, atof would return the closest floating-point number
to the value given as input, with ties broken as specified by
the machine's arithmetic (IEEE says that ties are ordinarily
broken by rounding to even).  Unfortunately, an atof that good
requires unbounded precision arithmetic -- I just don't see
any other sensible way of doing it.

To convince yourself of this, pick an integer big enough to
completely fill the fraction of your floating-point format --
that is, one large enough that changing its least significant
bit (LSB) changes its value by 1.  Now add 0.5 to that integer.
You then have a number whose decimal representation is
of the form xxxxxx.5 that represents an exact tie.  If ties
are rounded down, or the integral part of the number is even
and ties are rounded to even, then this number itself will
be rounded down.  However, if instead of xxxxxx.5 you have
xxxxxx.5000...0001, it will have to be rounded up.  Thus in
general you need to know about ALL the digits of the number
to convert it exactly.
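
To make that concrete for an IEEE double (assuming a correctly
rounding strtod(), which is stricter than what IEEE actually
demands -- see below): 2^52 is the first point at which the LSB
is worth exactly 1, so appending ".5" produces an exact tie,
while one nonzero digit far to the right forces rounding up:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* 2^52 = 4503599627370496: for an IEEE double the LSB
           here is worth exactly 1, so ".5" is an exact tie. */
        double tie   = strtod("4503599627370496.5", NULL);
        double above = strtod("4503599627370496.50000000000000000001", NULL);

        printf("%.17g\n", tie);    /* 4503599627370496 -- tie rounds to even */
        printf("%.17g\n", above);  /* 4503599627370497 -- far-off 1 forces up */
        return 0;
    }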

For this reason, IEEE does not require exact conversion.
Instead, it requires an error of no more than 0.47*LSB
in the conversion.  If I remember correctly, the value 0.47
is a principal result of Jerome Coonen's PhD thesis: it
guarantees that you can write out a floating-point number and
read it back in again while preserving the precise bit pattern,
yet is still not too difficult to implement.
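
The round-trip property is easy to check, assuming IEEE doubles
and a C library that meets the accuracy bound; 17 significant
decimal digits are enough to distinguish any two doubles:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        double x = 0.1;              /* no exact binary representation */
        char buf[32];
        double y;

        sprintf(buf, "%.17g", x);    /* write out 17 significant digits */
        y = strtod(buf, NULL);       /* read it back in */

        /* Compare bit patterns, not values, to show nothing was lost. */
        printf("%s -> %s\n", buf,
               memcmp(&x, &y, sizeof x) == 0 ? "bit pattern preserved"
                                             : "bits lost");
        return 0;
    }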

In addition to this property, IEEE requires that input must be
converted precisely the same way at compile time and run time.
Also, the accuracy requirement implies that a decimal value
that can be exactly represented as a floating-point number
must be converted to that number.
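
For example, 3.375 is 2 + 1 + 1/4 + 1/8, a sum of powers of two,
so any conforming conversion has no room for error there.  A quick
sanity check, assuming a conforming strtod():

    #include <assert.h>
    #include <stdlib.h>

    int main(void)
    {
        /* Both values are exactly representable as doubles, so the
           accuracy bound forces an exact conversion. */
        assert(strtod("3.375", NULL) == 3.375);
        assert(strtod("12345", NULL) == 12345.0);
        return 0;
    }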

Beyond this, it is not hard to guarantee that equivalent
representations such as 1.0 and 0.1e1 will produce exactly
the same result.  I forget whether IEEE requires that or not.
One way to do it is to follow the strategy I did in my
VAX atof(): accumulate digits in a multiple-precision integer
until further digits would cause that integer to overflow.
Along the way, remember where the decimal point is.  Then
pick off the exponent, if any, offset it by the decimal point
position, and then (finally) scale the multiple-precision
integer appropriately.  I used 64 bits for the integer;
since a VAX double-precision fraction is 56 bits (including
a hidden bit) I could lose 7 bits in the scaling and still
be assured of getting it (nearly) right.  I did not actually
prove that I never lost more than 7 bits in scaling, but I did
try a number of edge cases and believe that the actual results
are at least a full decimal digit better than the IEEE requirement.
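
Here is a rough C sketch of that strategy -- illustration only,
not the VAX code.  sketch_atof handles just an unsigned
digits[.digits][Eexp] form, and its final scaling step is
deliberately naive:

    #include <ctype.h>
    #include <stdint.h>

    /* Sketch of the digit-accumulation strategy described above. */
    double sketch_atof(const char *s)
    {
        uint64_t frac = 0;      /* 64-bit stand-in for the big integer */
        int dexp = 0;           /* decimal exponent to apply at the end */
        int seen_point = 0;

        for (; isdigit((unsigned char)*s) || (*s == '.' && !seen_point); s++) {
            if (*s == '.') {
                seen_point = 1;
                continue;
            }
            if (frac > (UINT64_MAX - 9) / 10) {
                /* One more digit would overflow: drop it, but record
                   its weight if it sat left of the decimal point. */
                if (!seen_point)
                    dexp++;
            } else {
                frac = frac * 10 + (uint64_t)(*s - '0');
                if (seen_point)
                    dexp--;     /* remember where the point was */
            }
        }

        if (*s == 'e' || *s == 'E') {   /* pick off the exponent */
            int e = 0, neg = 0;
            s++;
            if (*s == '+' || *s == '-')
                neg = (*s++ == '-');
            while (isdigit((unsigned char)*s))
                e = e * 10 + (*s++ - '0');
            dexp += neg ? -e : e;       /* offset by point position */
        }

        /* Finally, scale.  This naive repeated multiply/divide is
           the step that costs accuracy; a careful implementation
           scales the wide integer itself so that at most a few
           guard bits are lost. */
        double result = (double)frac;
        while (dexp > 0) { result *= 10.0; dexp--; }
        while (dexp < 0) { result /= 10.0; dexp++; }
        return result;
    }

Fed "1.0", the scaler sees 10 with a decimal exponent of -1; fed
"0.1e1", it sees 1 with exponent 0.  Both scale exactly to the
same bits for a value this small.  The hard cases are long digit
strings and large exponents, where the naive loop above loses
precision and the real implementation must scale carefully.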

I completely agree with Andrew Hume that floating-point conversion
is not something that can be done well casually.
-- 
				--Andrew Koenig
				  ark at europa.att.com


