Aussie AI

Stochastic Quantization

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.

Stochastic quantization is a research area that intentionally injects randomness or statistical variation into the quantization algorithm, which can improve model accuracy. This idea can be used in conjunction with Post-Training Quantization (PTQ) or with Quantization-Aware Training (QAT); a small illustrative sketch follows.
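
The most common form of stochastic quantization is stochastic rounding: a value is rounded up or down to an adjacent quantization level with probability proportional to its fractional distance, so the quantized result is unbiased on average. Below is a minimal C++ sketch of this idea for symmetric int8 quantization; the function name, scale factor, and example weight are illustrative assumptions, not code from any particular engine.

    #include <cstdint>
    #include <cstdio>
    #include <cmath>
    #include <random>
    #include <algorithm>

    // Quantize one float to int8 with stochastic rounding:
    // round up or down with probability equal to the fractional part.
    int8_t quantize_stochastic(float x, float scale, std::mt19937& rng)
    {
        float scaled = x / scale;          // map onto the int8 grid
        float lower = std::floor(scaled);  // lower quantization level
        float frac = scaled - lower;       // fractional distance to that level
        std::uniform_real_distribution<float> uni(0.0f, 1.0f);
        float q = (uni(rng) < frac) ? (lower + 1.0f) : lower;  // probabilistic rounding
        q = std::clamp(q, -128.0f, 127.0f);  // saturate to the signed 8-bit range
        return static_cast<int8_t>(q);
    }

    int main()
    {
        std::mt19937 rng(42);       // seeded RNG for reproducibility
        const float scale = 0.05f;  // hypothetical scale factor (e.g., max |weight| / 127)
        const float w = 0.1234f;    // example weight; 0.1234 / 0.05 = 2.468
        // Repeated quantization yields 2 or 3; 3 appears roughly 46.8% of the time,
        // so the expected quantized value matches the original (unbiased).
        for (int i = 0; i < 5; ++i)
            printf("%d\n", (int)quantize_stochastic(w, scale, rng));
        return 0;
    }

Compared with always rounding to the nearest level, this probabilistic rounding preserves the mean of the original values across many weights, which is one intuition for why injecting randomness into quantization can help accuracy.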

Research papers on stochastic quantization:

  1. Angela Fan, Pierre Stock, Benjamin Graham, Edouard Grave, Rémi Gribonval, Hervé Jégou, and Armand Joulin, 2020, Training with quantization noise for extreme model compression, arXiv preprint arXiv:2004.07320, https://arxiv.org/abs/2004.07320
  2. Jianfei Chen, Yu Gai, Zhewei Yao, Michael W. Mahoney, and Joseph E. Gonzalez, 2020, A statistical framework for low-bitwidth training of deep neural networks, arXiv preprint arXiv:2010.14298, https://arxiv.org/abs/2010.14298
  3. J. Zhang, 2023, Quantization for High-dimensional Data and Neural Networks: Theory and Algorithms, Ph.D. Thesis, University of California, San Diego, https://escholarship.org/content/qt9bd2k7gf/qt9bd2k7gf.pdf (See Chapter 5 of the thesis for stochastic quantization algorithms.)

See more updated research paper citations on stochastic quantization in the Aussie AI literature review at https://www.aussieai.com/research/quantization#stochastic.

 
