Aussie AI

Negative Zero

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.

Floating-point representations have two zeros: positive zero (the usual “0.0f”) and negative zero (“-0.0f”). Note that there is no negative zero for integer types, only for floating-point types, because C++ integers use two's complement representation, which has a single zero.

Usually, you don't have to worry about negative zero float values, because the floating-point comparison operations treat zero and negative zero as equal. Negative zero is not less than positive zero; the two compare as equal. For example, the “==” and “!=” operators handle both zeros as the same value, and testing “f==0.0f” will succeed for both positive zero and negative zero.
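
For example, this small test program (a minimal sketch, using std::signbit from <cmath>) shows that the comparison operators see the two zeros as equal, even though the sign bit differs:

    #include <cmath>     // std::signbit
    #include <cstdio>

    int main()
    {
        float pz = 0.0f;
        float nz = -0.0f;
        std::printf("%d\n", (int)(pz == nz));        // 1: the two zeros compare equal
        std::printf("%d\n", (int)(nz < pz));         // 0: negative zero is not less-than
        std::printf("%d\n", (int)std::signbit(nz));  // 1: but the sign bit is set
        return 0;
    }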

Normal C++ operations on float types automatically handle negative zero for you; for example, “<” treats the two zeros as equal rather than less-than. This correctness comes at the cost of some inefficiency.

Detecting Negative Zero. Testing for negative zero is not easy. Unfortunately, you cannot use the std::fpclassify function, because it returns FP_ZERO for both positive and negative zero. Here are some fast macros for 32-bit floats that examine the bits by treating the float as an unsigned 32-bit integer:

    #define AUSSIE_FLOAT_TO_UINT(f)  (*(unsigned int*)&(f))
    #define AUSSIE_FLOAT_IS_POSITIVE_ZERO(f) \
        ((AUSSIE_FLOAT_TO_UINT(f)) == 0u)  // All bits 0
    #define AUSSIE_FLOAT_IS_NEGATIVE_ZERO(f) \
        ((AUSSIE_FLOAT_TO_UINT(f)) == (1u << 31))  // Only the sign bit set

Note that these macros only work for float variables, not constants, because the address-of “&” operator gets a compilation error for floating-point constants (e.g. 0.0f or -0.0f). Also, these only work for 32-bit float types, and comparable macros are needed for 64-bit double or 128-bit long double types.
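
As an aside, a more portable alternative (a minimal sketch; these helper names are illustrative, not from any engine) avoids the pointer cast, which technically violates the C++ strict aliasing rules, by using C++20 std::bit_cast or std::signbit:

    #include <bit>       // std::bit_cast (C++20)
    #include <cmath>     // std::signbit
    #include <cstdint>

    // True only for -0.0f: the bit pattern with just the sign bit set
    inline bool is_negative_zero(float f)
    {
        return std::bit_cast<std::uint32_t>(f) == 0x80000000u;
    }

    // Portable version: a zero value whose sign bit is set
    inline bool is_negative_zero_portable(float f)
    {
        return f == 0.0f && std::signbit(f);
    }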

Pitfall: Bitwise tricks on negative zero. There are some pitfalls with negative zero if you bypass the normal floating-point operations and do bitwise operations on the underlying representations (as I just did above!).

For example, if you're doing bitwise tests on a float, you may still need to test for two values of zero, such as using one or both of the above zero testing macros.

For magnitude comparisons of float types via their underlying bits, there's also a problem. Whereas positive zero is all-bits-zero and will equal integer zero or unsigned integer zero, negative zero has the uppermost bit set (the sign bit), so its bit pattern is a negative signed integer or a very large unsigned integer. Hence, negative zero will sort as less than positive zero if using signed integer tests, or will sort as greater than the bit pattern of every positive float if using unsigned integer tests.
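
A minimal sketch of this sorting anomaly, assuming C++20 std::bit_cast (the variable names are illustrative only):

    #include <bit>
    #include <cstdint>
    #include <cstdio>

    int main()
    {
        std::uint32_t pz  = std::bit_cast<std::uint32_t>(0.0f);   // 0x00000000
        std::uint32_t nz  = std::bit_cast<std::uint32_t>(-0.0f);  // 0x80000000
        std::uint32_t one = std::bit_cast<std::uint32_t>(1.0f);   // 0x3F800000
        std::printf("%d\n", (int)((std::int32_t)nz < (std::int32_t)pz));  // 1: signed view sorts -0.0f below +0.0f
        std::printf("%d\n", (int)(nz > one));                             // 1: unsigned view sorts -0.0f above 1.0f
        return 0;
    }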

The problem with negative zero also means that bitwise equality comparisons can fail. You cannot simply compare the underlying integers for equality, nor can you use byte-wise testing. For example, using memcmp for equality testing of a float vector will occasionally fail where a positive zero is compared against a negative zero, leading to insidious bugs.
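
To illustrate, here is a sketch contrasting element-wise equality (which treats the two zeros as equal) with byte-wise equality (which does not); the function names are illustrative only:

    #include <cstddef>
    #include <cstring>

    // Element-wise equality: +0.0f and -0.0f compare as equal
    bool float_vector_equal(const float* a, const float* b, std::size_t n)
    {
        for (std::size_t i = 0; i < n; i++) {
            if (a[i] != b[i]) return false;
        }
        return true;
    }

    // Byte-wise equality: +0.0f and -0.0f have different bit patterns, so this reports them as unequal
    bool float_vector_bitwise_equal(const float* a, const float* b, std::size_t n)
    {
        return std::memcmp(a, b, n * sizeof(float)) == 0;
    }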

Optimization by Suppressing Negative Zero. Since negative zero introduces an inefficiency into basic float operations (e.g. “==” or “!=” with 0.0), can we block it for a speedup? Are there any CPU or compiler settings that let us ignore negative zero?

The FTZ and DAZ modes are mainly for subnormal numbers, not negative zero. I'm not aware of any hardware CPU modes specifically for disallowing negative zeros, and I wonder whether they would actually be a de-optimization anyway, by forcing the FPU to explicitly check for negative zeros. Apparently, FTZ might help avoid negative zero in computations, but I'm not sure it covers 100% of cases. There is a GCC flag “-fno-signed-zeros” (also enabled by “-ffast-math”) which allows the compiler to assume in software that the sign of a zero does not matter.
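
For example, assuming GCC or Clang and a hypothetical source file name, these flags are passed on the compile command line (a sketch only; note that “-ffast-math” also enables several other, unrelated relaxations):

    g++ -O3 -fno-signed-zeros -c inference.cpp   # only relax signed-zero semantics
    g++ -O3 -ffast-math -c inference.cpp         # broader; implies -fno-signed-zeros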

AI Engines and Negative Zero. Can we speed up the floating-point computations of our AI engine by blocking all floating-point negative zeros? Then the FPU or GPU can assume there's only one type of zero, and run faster.

One situation with a lot of zeros is the “sparsity” that results from unstructured pruning (i.e. magnitude pruning or movement pruning), although hopefully the pruned weights are set to the normal zero rather than negative zero. However, if a pruned model is doing any “zero skipping” routine that tests “f==0.0f” on a weight, there's an unnecessary hidden test for negative zero. We could either run in a negative-zero-disabled mode, or use our own bitwise test for floating-point zero as all-bits-zero (i.e. using the unsigned integer trick).
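
A minimal sketch of such a zero-skipping dot product using the all-bits-zero integer test (assuming zero weights have already been normalized to +0.0f; the function name is illustrative only):

    #include <bit>        // std::bit_cast (C++20)
    #include <cstddef>
    #include <cstdint>

    // Dot product that skips zero weights via a single integer test (matches +0.0f only)
    float dot_product_zero_skip(const float* weights, const float* inputs, std::size_t n)
    {
        float sum = 0.0f;
        for (std::size_t i = 0; i < n; i++) {
            if (std::bit_cast<std::uint32_t>(weights[i]) == 0u) continue;  // All-bits-zero: skip
            sum += weights[i] * inputs[i];
        }
        return sum;
    }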

Another point about negative zero in AI engines is that weights are static, so you can certainly pre-process the model file to ensure that none of the weights are negative zero. Then, during runtime inference, you can assume that no weight is a negative zero. Alternatively, the training method can avoid negative zeros in its computations by running in such a mode, or it could explicitly check for negative zero as a final safety step.
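
A minimal sketch of such a pre-processing pass over an in-memory weight array (loading and saving the model file are assumed and not shown):

    #include <cmath>      // std::signbit
    #include <cstddef>

    // Replace any -0.0f weights with +0.0f; all other values are left unchanged
    void normalize_negative_zeros(float* weights, std::size_t n)
    {
        for (std::size_t i = 0; i < n; i++) {
            if (weights[i] == 0.0f && std::signbit(weights[i])) {
                weights[i] = 0.0f;  // All-bits-zero positive zero
            }
        }
    }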

What about zero values at runtime? The main float computation is the vector of logits, which represents probabilities of words (conceptually). Can we guarantee that it never contains a negative zero, and thereby speed up analysis? Can we ensure that vector dot products never compute negative zero? Or is there a way to have the RELU activation function fix any negative zero values that are computed? These optimizations are examined in later chapters.
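
On the last point, a common way of writing RELU already returns the positive-zero literal for a -0.0f input, which hints at one approach (a sketch, not the engine's actual code):

    // Basic RELU: for any non-positive input, including -0.0f, the 0.0f literal is returned
    inline float relu(float x)
    {
        return x > 0.0f ? x : 0.0f;
    }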

 
