Aussie AI

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.

Underflow and Overflow

Underflow occurs when a floating-point number becomes so small in magnitude that it can only be represented as zero. The underflowing value can be a very tiny positive or negative number. Note that a negative number with a huge magnitude (near negative infinity) isn't underflow; that's actually negative overflow. Underflow refers to tiny fractions.

Generally, underflow isn't a problem for AI, because a number that low isn't going to affect the results. Similarly, I don't think an AI engine needs to worry much about subnormal/denormalized tiny numbers either. If the probability of a word appearing is 2^-126 (or as low as 2^-149 for denormalized numbers), well, it might as well be zero anyway.
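As a minimal illustration (assuming standard IEEE 754 FP32 behavior without flush-to-zero), this sketch keeps halving the smallest normalized float until it passes through the subnormal range and finally underflows to exactly zero:

    #include <cfloat>
    #include <cstdio>

    int main()
    {
        float x = FLT_MIN;   // Smallest normalized float, 2^-126 (about 1.18e-38)
        int halvings = 0;
        while (x > 0.0f) {   // Keep halving until it underflows to exactly zero
            x *= 0.5f;
            ++halvings;
        }
        // On a typical build this takes 24 halvings: 23 steps down through
        // the subnormal range, then one final step that rounds to 0.0f.
        printf("Underflowed to zero after %d halvings below FLT_MIN\n", halvings);
        return 0;
    }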

If we're using Bfloat16 for 16-bit processing, it still has an 8-bit exponent, so the lowest normal value is the same number (2^-126). If we've quantized the network to FP16 (also 16-bit, but with a 5-bit exponent), then the lowest normal value we can represent is 2^-14 (with subnormals reaching down to 2^-24), which is still a tiny probability.
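To see these limits concretely, here is a small sketch using std::numeric_limits for FP32; the FP16 and Bfloat16 figures are given as comments, since the corresponding std::float16_t and std::bfloat16_t types are optional C++23 additions in <stdfloat> that your compiler may not provide:

    #include <cstdio>
    #include <limits>

    int main()
    {
        // FP32: 8-bit exponent, smallest normal 2^-126, smallest subnormal 2^-149
        printf("FP32 min normal   : %g\n", std::numeric_limits<float>::min());         // ~1.18e-38
        printf("FP32 min subnormal: %g\n", std::numeric_limits<float>::denorm_min());  // ~1.4e-45

        // Bfloat16: also an 8-bit exponent, so its smallest normal is the same 2^-126,
        // just with far fewer mantissa bits (smallest subnormal 2^-133).
        // FP16: 5-bit exponent, smallest normal 2^-14 (~6.1e-5),
        // smallest subnormal 2^-24 (~6.0e-8).
        return 0;
    }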

Generally speaking, AI engines don't tend to worry about underflow in floating-point, and “pruning” of low weight values is actually a common optimization. If a floating-point calculation underflows, it should just go harmlessly to zero. More concerning would be integer underflow, which is a different issue: large negative values wrapping around to large positives. Floating-point underflow is better behaved.
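To contrast the two behaviors, here is a small sketch (the int8_t wrapping shown is well-defined two's-complement conversion in C++20, and the usual behavior on earlier compilers):

    #include <cstdint>
    #include <cstdio>

    int main()
    {
        // Floating-point underflow: the result harmlessly becomes zero.
        float tiny = 1e-30f;
        float f = tiny * tiny;       // True value 1e-60 is below the FP32 range
        printf("float underflow: %g\n", f);       // Prints 0

        // Integer "underflow": an 8-bit value pushed below INT8_MIN
        // wraps around to a large positive value instead.
        int8_t acc = INT8_MIN;                    // -128
        acc = static_cast<int8_t>(acc - 1);       // Wraps to +127
        printf("int8 wraparound: %d\n", acc);
        return 0;
    }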

Overflow is when a number gets so large in magnitude that it cannot be represented in floating-point. Note that there are two types of overflow: positive overflow and negative overflow.

The exponent is the problem for overflow. When a number's magnitude exceeds the highest exponent power, it's either a very large positive or a very large-magnitude negative number. For an 8-bit exponent, that means 2^+127 (because +128 is reserved for the special Inf/NaN numbers). For a 5-bit exponent in FP16, the ceiling is only about 2^+15 (with +16 again reserved for Inf/NaN), giving a maximum value of 65,504, which is, coincidentally, also a reasonable salary to request at your next performance review.
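Here is a minimal sketch, assuming standard IEEE 754 float arithmetic, showing both directions of overflow and how to detect them with std::isinf:

    #include <cmath>
    #include <cstdio>
    #include <limits>

    int main()
    {
        float big = std::numeric_limits<float>::max();   // ~3.4e38 (exponent 2^+127)
        float pos = big * 2.0f;                          // Positive overflow -> +Inf
        float neg = -big * 2.0f;                         // Negative overflow -> -Inf
        printf("positive overflow: %g (isinf=%d)\n", pos, (int)std::isinf(pos));
        printf("negative overflow: %g (isinf=%d)\n", neg, (int)std::isinf(neg));

        // FP16's 5-bit exponent tops out at a maximum value of 65504, so the
        // same doubling trick overflows to Inf much sooner in half precision.
        return 0;
    }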

Overflow can be a problem for AI engines, but usually only in the low-bit quantized models, rather than in the usual FP32 calculations. We don't realistically want a probability or weight anywhere near the huge numbers (positive or negative), but arithmetic computations can sometimes go too high. One of the advantages of “normalization layers” is that they reduce the chance of this occurring, although they're mainly used for other reasons related to accuracy. When overflow occurs, the result could become a special floating-point number (NaN or Inf), or an integer value might wrap around to negative (e.g. with integer-only-arithmetic quantization).
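One simple defensive pattern (just an illustration, not code from any particular engine) is to scan a tensor for Inf or NaN values after a risky computation, using std::isfinite:

    #include <cmath>
    #include <cstdio>
    #include <vector>

    // Returns true if every value is an ordinary finite number
    // (i.e. no Inf from overflow and no NaN from an invalid operation).
    static bool all_finite(const std::vector<float>& v)
    {
        for (float x : v) {
            if (!std::isfinite(x)) return false;
        }
        return true;
    }

    int main()
    {
        float big = 3.0e38f;    // Near the top of the FP32 range
        std::vector<float> activations = { 0.1f, big * 10.0f, -2.0f };   // Middle value overflows to +Inf
        if (!all_finite(activations)) {
            printf("Warning: overflow or NaN detected in activations\n");
        }
        return 0;
    }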

 


Up: Table of Contents

Buy: Generative AI in C++: Coding Transformers and LLMs

Generative AI in C++
The new AI programming book by Aussie AI co-founders:
  • AI coding in C++
  • Transformer engine speedups
  • LLM models
  • Phone and desktop AI
  • Code examples
  • Research citations

Get your copy from Amazon: Generative AI in C++