Aussie AI
Integer-Only-Arithmetic Quantization
Book Excerpt from "Generative AI in C++"
by David Spuler, Ph.D.
Integer-only quantization is integer quantization where only integer arithmetic is performed, so that multiplications operate on integers rather than floating-point values. Not every integer quantization algorithm works this way. Several types of integer quantization store weights as quantized integers, but then de-quantize them back to floating-point at various points (in some algorithms, even for the weight multiplication itself). Methods that strictly restrict arithmetic to avoid floating-point operations are more precisely named “integer-only-arithmetic quantization algorithms”.
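The distinction can be sketched in C++ as follows. This is a minimal illustration, not code from the book: the function names (dot_dequantized, dot_int8, requantize) and the fixed-point rescaling scheme (an integer multiplier plus a right shift, both assumed to be precomputed offline from the quantization scales) are assumptions chosen for the example.

```cpp
#include <cstddef>
#include <cstdint>

// Mixed approach: weights are stored as int8, but each weight is
// de-quantized to float before the multiply, so the arithmetic is
// still floating-point.
float dot_dequantized(const std::int8_t* w, const float* x,
                      std::size_t n, float w_scale) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        sum += (static_cast<float>(w[i]) * w_scale) * x[i];
    }
    return sum;
}

// Integer-only approach: int8 weights times int8 activations,
// accumulated in int32. No floating-point operation is executed.
std::int32_t dot_int8(const std::int8_t* w, const std::int8_t* x,
                      std::size_t n) {
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        acc += static_cast<std::int32_t>(w[i]) * static_cast<std::int32_t>(x[i]);
    }
    return acc;
}

// Rescale the int32 accumulator back to int8 using integers only.
// Assumption: multiplier / 2^shift approximates the real-valued rescale
// factor and was computed ahead of time; shift must be at least 1.
std::int8_t requantize(std::int32_t acc, std::int32_t multiplier, int shift) {
    const std::int64_t prod = static_cast<std::int64_t>(acc) * multiplier;
    const std::int64_t half = std::int64_t{1} << (shift - 1);
    // Work on the magnitude so negative values are never right-shifted.
    const std::int64_t mag = prod < 0 ? -prod : prod;
    std::int64_t scaled = (mag + half) >> shift;  // round half away from zero
    if (prod < 0) scaled = -scaled;
    if (scaled > 127) scaled = 127;               // saturate to the int8 range
    if (scaled < -128) scaled = -128;
    return static_cast<std::int8_t>(scaled);
}
```

In the first function the quantized storage saves memory, but the multiply itself is floating-point. In the second and third, a multiplier of 16384 with a shift of 15 stands in for a rescale factor of 0.5, and every operation on the inference path is an integer operation.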
Even these integer-only quantization algorithms may still perform floating-point computations in some components of the Transformer. Methods that also fully quantize the non-linear components, such as Softmax and normalization, to integers are called “end-to-end integer Transformers.”
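As one illustration of what quantizing a non-linear component involves, a normalization step needs a square root, which can be computed without floating-point by Newton's iteration on integers. The sketch below is an assumption made for this example, not the method of any particular end-to-end integer Transformer; the names isqrt and int_rms are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Floor of the square root of n, using Newton's iteration on integers only.
std::uint32_t isqrt(std::uint64_t n) {
    if (n < 2) return static_cast<std::uint32_t>(n);
    std::uint64_t x = n / 2 + 1;          // initial guess, always >= sqrt(n)
    std::uint64_t y = (x + n / x) / 2;
    while (y < x) {                       // iterate until the estimate stops shrinking
        x = y;
        y = (x + n / x) / 2;
    }
    return static_cast<std::uint32_t>(x);
}

// Integer root-mean-square of a vector of quantized activations:
// floor(sqrt(floor(sum(x[i]^2) / n))). Requires n > 0.
std::uint32_t int_rms(const std::int32_t* x, std::size_t n) {
    std::uint64_t sum_sq = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const std::int64_t v = x[i];
        sum_sq += static_cast<std::uint64_t>(v * v);
    }
    return isqrt(sum_sq / n);
}
```

Both divisions truncate, so the result is a floor approximation of the true RMS; a real implementation would also have to track the quantization scale of the inputs and outputs, which is omitted here.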