Aussie AI
Representing Zero
-
Book Excerpt from "Generative AI in C++"
-
by David Spuler, Ph.D.
The sign bit, exponent, and mantissa can represent a lot of numbers, but not zero. We cannot just set all the mantissa bits to zero, because that's not zero, which is rather strange.
There's an implicit extra “1” bit, so clearing all the mantissa bits doesn't give 0.0000, it gives 1.0000.
The mantissa always starts with a “1” and there's literally no way to represent 0.0000.
Also, the exponent can represent -127 to +128, but setting the exponent to 0 also isn't zero, because 2^0 is 1.
And 2^-127 is very small and does get us very close to zero, but it's also not zero.
With sudden horrifying insight, we realize:
There's no way to represent zero!
The solution is that the IEEE 754 standard designers decided to treat all bits zero as being really zero.
All bits zero in the exponent field is a stored value of 0, but after subtracting the 127 offset, it means an exponent of -127 (the smallest exponent).
So, if we clear all the exponent and mantissa bits to zeros, the number should be 1.0x2^-127, but we can all pretend it's actually zero.
Then we can do some pretend coding, ahem, I mean microcoding, so that all our Floating-Point Units (FPUs) pretend it's zero, too.
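To see these bit patterns for yourself, here is a minimal sketch that copies a float's raw bits into an integer. It assumes the usual 32-bit IEEE 754 `float`, and the helper names (`float_bits`, `exponent_field`, `mantissa_field`, `check_bit_patterns`) are made up for this illustration rather than taken from any library.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// Copy the raw bits of a float into an unsigned integer.
// std::memcpy avoids the undefined behavior of pointer-cast type punning.
std::uint32_t float_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Stored (biased) exponent: the 8 bits just below the sign bit.
std::uint32_t exponent_field(float f) {
    return (float_bits(f) >> 23) & 0xFFu;
}

// Stored mantissa: the low 23 bits (the implicit leading 1 is not stored).
std::uint32_t mantissa_field(float f) {
    return float_bits(f) & 0x7FFFFFu;
}

// Hypothetical self-check showing the patterns discussed in the text.
void check_bit_patterns() {
    // 1.0 has all mantissa bits clear; the stored exponent is 127, meaning 2^0.
    assert(mantissa_field(1.0f) == 0u);
    assert(exponent_field(1.0f) == 127u);
    assert(float_bits(1.0f) == 0x3F800000u);
    // Zero is the special case: every bit is clear.
    assert(float_bits(0.0f) == 0x00000000u);
}
```

Calling `check_bit_patterns()` should pass silently on a platform with IEEE 754 floats. In C++20, `std::bit_cast<std::uint32_t>(f)` could replace the `memcpy` idiom.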
Negative zero. Weirdly, there are two zeros: normal zero and negative zero. The IEEE 754 standard allows two different bit patterns to mean zero, depending on the sign bit. If we clear all the exponent and mantissa bits to zero, then a sign bit of “0” means zero, but a sign bit set to “1” means “negative zero”.
I'm not really sure what negative zero even means!
But sometimes when you work with floats, a 0.000 number will get printed with a “-” in front of it.
Maybe it's negative zero, or maybe it's a tiny negative number with hidden digits at the 15th decimal place.
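One way to tell the two cases apart is the standard `std::signbit` function combined with an ordinary comparison against zero. The sketch below uses `double` to match the 15-decimal-place example; the names `is_negative_zero`, `format3`, and `check_negative_zero` are hypothetical helpers for this illustration, and it assumes the compiler is not running in a fast-math mode that ignores signed zeros.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <string>

// True only for negative zero: it compares equal to zero, but the sign bit is set.
bool is_negative_zero(double x) {
    return x == 0.0 && std::signbit(x);
}

// Format with three decimal places, the way a typical printf call would.
std::string format3(double x) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.3f", x);
    return std::string(buf);
}

// Hypothetical self-check: both values print the same, but only one is negative zero.
void check_negative_zero() {
    assert(format3(-0.0) == "-0.000");      // genuine negative zero
    assert(format3(-1e-15) == "-0.000");    // tiny negative number, digits hidden
    assert(is_negative_zero(-0.0));
    assert(!is_negative_zero(-1e-15));
    assert(!is_negative_zero(0.0));
    assert(-0.0 == 0.0);                    // ordinary comparison treats them as equal
}
```

Printing with more digits (say `%.17g`) is another quick way to expose the hidden digits of a tiny negative number.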
Fortunately, most of the arithmetic operations treat negative zero the same as zero. The C++ compiler handles it automatically. Adding negative zero does nothing, and multiplying by negative zero also gives zero. But there's a gotcha if you're being tricky with the bits of a 32-bit floating-point number by pretending it's a 32-bit integer: testing for zero isn't one integer comparison, it's two!
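The two-comparison zero test can be sketched as follows, along with a common alternative that masks off the sign bit so a single comparison suffices. This assumes a 32-bit IEEE 754 `float`; the function names (`float_bits`, `is_zero_two_compares`, `is_zero_masked`) are illustrative choices, not part of any library.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// Copy the raw bits of a float into an unsigned integer (safe type punning).
std::uint32_t float_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Zero test on the integer bits: positive zero and negative zero
// have different bit patterns, so it takes two comparisons.
bool is_zero_two_compares(float f) {
    std::uint32_t u = float_bits(f);
    return u == 0x00000000u || u == 0x80000000u;
}

// Alternative: clear the sign bit first, then one comparison is enough.
bool is_zero_masked(float f) {
    return (float_bits(f) & 0x7FFFFFFFu) == 0u;
}
```

For example, both functions return true for `0.0f` and `-0.0f`, and false for `1.0f`. The masked version trades one comparison for one bitwise AND, which may or may not be faster on a given platform.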