Aussie AI

Integer Division for Quantization?

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.


What about using integer division instead of multiplication in quantization? After all, multiplication by a small weight like 0.003 could instead be a division by 333, since 1/333 is approximately 0.003. Is this an avenue for optimization? It seems unlikely, since division is usually much slower than multiplication, often by an order of magnitude.
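As a minimal sketch of the two alternatives, the following C++ compares scaling by a floating-point weight of 0.003 with integer division by 333. The function names `scale_by_multiply` and `scale_by_divide` are hypothetical, chosen only for illustration, and the sketch says nothing about relative speed on any particular CPU.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: scale an integer value by a small weight (default 0.003)
// using a floating-point multiplication.
inline float scale_by_multiply(int32_t x, float weight = 0.003f) {
    return static_cast<float>(x) * weight;
}

// Hypothetical helper: approximate the same scaling with integer division
// by the reciprocal of the weight (default 333, since 1/333 is about 0.003).
// C++ integer division truncates toward zero, so the fractional part is lost.
inline int32_t scale_by_divide(int32_t x, int32_t divisor = 333) {
    assert(divisor != 0);  // division by zero is undefined behavior
    return x / divisor;
}
```

Note that the two versions are not exactly equivalent: the division truncates to an integer, and 1/333 is only an approximation of 0.003.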

Integer division can sometimes be done efficiently with bitshift operations. Division by a power of two can be replaced by a right bitshift, which is effectively the same idea as the left bitshift quantization above. Dyadic numbers are another interesting idea: a dyadic number is a fraction whose denominator is a power of two, so its implementation involves division by a power of two, usually performed via a right bitshift.
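The bitshift idea can be sketched as follows, assuming unsigned operands so that the right shift is an exact truncating division. The names `div_pow2` and `mul_dyadic` are hypothetical, and the dyadic constant 197/65536 (about 0.003) is just an illustrative choice.

```cpp
#include <cassert>
#include <cstdint>

// Divide an unsigned value by 2^shift using a right bitshift.
// For unsigned integers this equals truncating integer division.
inline uint32_t div_pow2(uint32_t x, unsigned shift) {
    assert(shift < 32);  // shifting by the full bit width or more is undefined
    return x >> shift;
}

// Multiply x by the dyadic number numerator / 2^shift:
// one integer multiplication followed by one right bitshift.
// The product is held in 64 bits so it cannot overflow.
inline uint64_t mul_dyadic(uint32_t x, uint32_t numerator, unsigned shift) {
    assert(shift < 64);
    uint64_t product = static_cast<uint64_t>(x) * numerator;
    return product >> shift;
}
```

For example, `mul_dyadic(1000, 197, 16)` computes 1000 * 197 / 65536 and truncates to 3, which approximates 1000 * 0.003. Signed values need more care, because a right shift of a negative number rounds toward negative infinity rather than toward zero.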

Note that division is often used in scaling operations, particularly in de-quantization. However, in such cases it isn't the bottleneck, as scaling or de-quantization is performed an order of magnitude fewer times than the multiplications in the main computation.
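To illustrate why the scaling division is rarely the bottleneck, here is a hedged sketch of a quantized dot product that performs one integer multiply-accumulate per element but only a single division at the end. The function name `dot_then_dequantize` and the convention of dividing by a `scale` factor are assumptions for illustration, not a specific engine's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: integer dot product of two 8-bit quantized vectors,
// followed by a single de-quantization division.
inline float dot_then_dequantize(const std::vector<int8_t>& a,
                                 const std::vector<int8_t>& b,
                                 float scale) {
    assert(a.size() == b.size());
    assert(scale != 0.0f);
    int32_t acc = 0;
    // N integer multiply-accumulate operations (the hot loop).
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
    }
    // One division per dot product, not one per element.
    return static_cast<float>(acc) / scale;
}
```

For vectors of length N, this does N multiplications but only one division, so the cost of the division is amortized over the whole loop.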

Research papers on division:

  1. LibDivide, 2023, https://libdivide.com/ and https://github.com/ridiculousfish/libdivide
  2. Ridiculous Fish, May 12th, 2021, Benchmarking division and libdivide on Apple M1 and Intel AVX512, https://ridiculousfish.com/blog/posts/benchmarking-libdivide-m1-avx512.html

For more division quantization research papers, see https://www.aussieai.com/research/quantization#division.

