Floating point puzzle

Larry Riddle riddle at emory.uucp
Sun Aug 7 13:15:22 AEST 1988

The following is a very simple C program compiled and run on a Sun-4
with no special command line options.

	float x,y;
	x = 1.0/10.0;
	y = 1677721.0/16777216.0; 
	printf("x: %x",x);
	printf("y: %x",y);

Here is the output:

x: 3fb99999 0.10000000149011612 
y: 3fb99999 0.09999996423721313

Notice that x and y, which are declared as floats and thus have a
32 bit representation (which, according to the manual, obeys the IEEE
floating point arithmetic standard), print identically in hex but give
different values when printed as floats. I believe that the hex is a
straight translation of the internal bit representation. The divisions
that compute x and y are done in double precision (64 bits) and the
results are then converted to floats.

Can anyone enlighten me on why this output comes out this way?


According to what I have read about the IEEE standard, floats should
have 1 sign bit, a biased exponent of 8 bits, and a 23 bit normalized
mantissa. However, my experiments seem to imply that floats have an 11
bit biased exponent (offset by 1023) and only a 20 bit normalized
mantissa, exactly the same as doubles, except double has a 52 bit
mantissa. For example, the bit pattern 3fb99999 given above for 1/10
corresponds to

s exponent    mantissa
0 01111111011 10011001100110011001

The 11 bits of this exponent give 1019-1023 = -4, which coupled with
the mantissa (and its implied leading 1) gives the binary number

    0.000110011001100110011001

which is the (non-terminating) binary representation for 1/10.  Notice
also that this 32 bit representation has been chopped, rather than
rounded.

I don't understand this discrepancy either. Any suggestions?


Larry Riddle        |  gatech!emory!riddle               USENET
Emory University    |  riddle at emory                      CSNET,BITNET
Dept of Math and CS |  riddle.emory at csnet-relay          ARPANET 
Atlanta, Ga 30322   |  (404) 727-7922                    AT at T
