Aussie AI

5-Bit Quantization (INT5)

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.

Research papers on 5-bit quantization:

  1. E Kloberdanz, W Le, Sep 2023, MixQuant: Mixed Precision Quantization with a Bit-width Optimization Search, arXiv preprint arXiv:2309.17341, https://arxiv.org/pdf/2309.17341.pdf (Various tests of quantization from 2 bits to 8 bits.)
  2. NM Ho, DT Nguyen, JL Gustafson, WF Wong, 2023, Bedot: Bit Efficient Dot Product for Deep Generative Models, CoNGA 2023: Next Generation Arithmetic, pp. 19–37, https://link.springer.com/chapter/10.1007/978-3-031-32180-1_2, PDF: https://www.comp.nus.edu.sg/~wongwf/papers/CONGA23-Bedot.pdf (2–3 bits for weights and 2–5 bits for activations.)
  3. B Gouin-Ferland, R Coffee, AC Therrien, 2022, Data reduction through optimized scalar quantization for more compact neural networks, Frontiers in Physics, https://www.frontiersin.org/articles/10.3389/fphy.2022.957128/full (Examined quantization with 3-bit to 7-bit weights.)
  4. Markus Nagel, Mart van Baalen, Tijmen Blankevoort, Max Welling, 2019, Data-free quantization through weight equalization and bias correction, PDF: https://openaccess.thecvf.com/content_ICCV_2019/papers/Nagel_Data-Free_Quantization_Through_Weight_Equalization_and_Bias_Correction_ICCV_2019_paper.pdf (Evaluates INT5, INT6, INT8, INT10, INT12, and INT16.)

See more papers on 5-bit quantization (INT5) at: https://www.aussieai.com/research/quantization#int5
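For orientation, here is a minimal C++ sketch of what 5-bit quantization can look like in code. It assumes a simple symmetric per-tensor scheme, which is only one of several possible approaches and is not taken from any of the papers above. The helper names (int5_scale, int5_quantize, int5_dequantize) are hypothetical, and each 5-bit level is stored in its own int8_t rather than bit-packed.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration only (assumed symmetric scheme).
// A signed 5-bit integer can hold -16..15; this sketch uses -15..15 so the
// range is symmetric around zero.
constexpr int kInt5Max = 15;
constexpr int kInt5Min = -15;

// Choose a per-tensor scale so the largest magnitude maps to +/-15.
// Falls back to 1.0 for an all-zero (or empty) tensor to avoid dividing by zero.
float int5_scale(const std::vector<float>& weights) {
    float max_abs = 0.0f;
    for (float x : weights) {
        max_abs = std::max(max_abs, std::fabs(x));
    }
    return (max_abs > 0.0f) ? max_abs / kInt5Max : 1.0f;
}

// Quantize one float to a 5-bit level: round to nearest, then clamp.
// The level is stored unpacked in an int8_t (3 bits of each byte unused).
int8_t int5_quantize(float x, float scale) {
    assert(scale > 0.0f);
    int q = static_cast<int>(std::lround(x / scale));
    q = std::max(kInt5Min, std::min(kInt5Max, q));
    return static_cast<int8_t>(q);
}

// Map a 5-bit level back to an approximate float value.
float int5_dequantize(int8_t q, float scale) {
    return static_cast<float>(q) * scale;
}
```

With weights {1.5, -3.0, 0.75}, the scale is about 0.2, so 3.0 and -3.0 map to levels 15 and -15, and 0.75 maps to level 4 (dequantized to about 0.8). A practical engine would also pack the 5-bit levels into bytes to realize the memory savings, which this sketch omits.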

 
