2-Bit Quantization (INT2)
Book Excerpt from "Generative AI in C++"
by David Spuler, Ph.D.
This section covers non-ternary 2-bit quantization, which uses four distinct weight values. In practice, 2-bit quantization is still regarded as having problems with model accuracy, whereas 4-bit integer quantization is considered a more reasonable speed-versus-accuracy tradeoff. On the other hand, this caution may be unwarranted: Liu et al. (2022) tested many models at 2, 3, and 4 bits (see Table 1 in their paper), and the extra accuracy of 4 bits over 2 bits was usually only a couple of percentage points, for double the space.
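As a concrete illustration, here is a minimal C++ sketch of one way such a scheme can work: each weight is mapped to the nearest of four uniform levels, and four 2-bit codes are packed into each byte. The level spacing, per-tensor scale choice, and function names below are illustrative assumptions for this sketch, not the method of any particular paper in the list that follows.

    // Minimal sketch of non-ternary 2-bit (INT2) weight quantization.
    // Assumptions: four uniform levels {-1.5s, -0.5s, +0.5s, +1.5s} for a
    // per-tensor scale s, with four 2-bit codes packed into each byte.
    #include <algorithm>
    #include <cmath>
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Map one float weight to a 2-bit code in 0..3.
    static uint8_t quantize_int2(float w, float scale) {
        int code = (int)std::lround(w / scale + 1.5f); // nearest level
        return (uint8_t)std::clamp(code, 0, 3);
    }

    // Recover the approximate float weight from a 2-bit code.
    static float dequantize_int2(uint8_t code, float scale) {
        return ((float)code - 1.5f) * scale;
    }

    // Pack n weights into ceil(n/4) bytes, four 2-bit codes per byte.
    static void pack_int2(const float* w, uint8_t* out, int n, float scale) {
        for (int i = 0; i < n; i += 4) {
            uint8_t b = 0;
            for (int j = 0; j < 4 && i + j < n; ++j)
                b |= quantize_int2(w[i + j], scale) << (2 * j);
            out[i / 4] = b;
        }
    }

    int main() {
        std::vector<float> w = { 0.9f, -0.4f, 0.1f, -1.2f, 0.6f };
        // Simple scale choice: the largest |weight| maps to the outer level.
        float maxabs = 0.0f;
        for (float x : w) maxabs = std::max(maxabs, std::fabs(x));
        float scale = maxabs / 1.5f; // assumes at least one nonzero weight
        std::vector<uint8_t> packed((w.size() + 3) / 4);
        pack_int2(w.data(), packed.data(), (int)w.size(), scale);
        for (size_t i = 0; i < w.size(); ++i) {
            uint8_t code = (packed[i / 4] >> (2 * (i % 4))) & 0x3;
            printf("w=%+.2f code=%u dequant=%+.2f\n",
                   w[i], code, dequantize_int2(code, scale));
        }
        return 0;
    }

With four weights packed per byte, a 2-bit model needs one quarter of the memory of INT8 and one sixteenth of FP32, which is the space saving driving the research below.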
Research papers on 2-bit quantization:
- Jungwook Choi, Pierce I-Jen Chuang, Zhuo Wang, Swagath Venkataramani, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan, July 2018, Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN), https://arxiv.org/abs/1807.06964
- Jungwook Choi, Swagath Venkataramani, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan, Zhuo Wang, Pierce Chuang, 2019, Accurate and Efficient 2-bit Quantized Neural Networks, Proceedings of Machine Learning and Systems 1 (MLSys 2019), https://proceedings.mlsys.org/paper/2019/file/006f52e9102a8d3be2fe5614f42ba989-Paper.pdf
- S. Zhou, Y. Wu, Z. Ni, X. Zhou, H. Wen, and Y. Zou, 2016, DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients, arXiv:1606.06160, https://arxiv.org/abs/1606.06160 (Binary weights, 2-bit activations.)
- Z. Cai, X. He, J. Sun, and N. Vasconcelos, 2017, Deep learning with low precision by half-wave Gaussian quantization, Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 5918-5926, https://arxiv.org/abs/1702.00953 (Binary weights, 2-bit activations.)
- Han-Byul Kim, Eunhyeok Park, and Sungjoo Yoo, 2022, BASQ: Branch-wise Activation-clipping Search Quantization for Sub-4-bit Neural Networks, In European Conference on Computer Vision, Cham: Springer Nature Switzerland, 17-33. https://link.springer.com/chapter/10.1007/978-3-031-19775-8_2 (Evaluates quantization precision from 2 bits to 4 bits.)
- Zechun Liu, Kwang-Ting Cheng, Dong Huang, Eric Xing, Zhiqiang Shen, Apr 2022, Nonuniform-to-uniform quantization: Towards accurate quantization via generalized straight-through estimation, In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 4942-4952, https://arxiv.org/abs/2111.14826, Code: https://github.com/liuzechun/Nonuniform-to-Uniform-Quantization (Contains an extensive review of models with 2-bit weights and 2-bit activations, and also 3-bit and 4-bit configurations.)
- E. Kloberdanz, W. Le, Sep 2023, MixQuant: Mixed Precision Quantization with a Bit-width Optimization Search, arXiv preprint arXiv:2309.17341, https://arxiv.org/pdf/2309.17341.pdf (Various tests of quantization from 2 bits to 8 bits.)
- Xiaofan Lin, Cong Zhao, and Wei Pan, 2017, Towards accurate binary convolutional neural network, Advances in Neural Information Processing Systems, 30, https://arxiv.org/abs/1711.11294 (Its 2-bit quantization approach is effectively a double-binarized method.)
- N. M. Ho, D. T. Nguyen, J. L. Gustafson, W. F. Wong, 2023, Bedot: Bit Efficient Dot Product for Deep Generative Models, CoNGA 2023: Next Generation Arithmetic, pp. 19-37, https://link.springer.com/chapter/10.1007/978-3-031-32180-1_2, PDF: https://www.comp.nus.edu.sg/~wongwf/papers/CONGA23-Bedot.pdf (Uses 2-3 bits for weights and 2-5 bits for activations.)
- Li, Y., Gong, R., Tan, X., Yang, Y., Hu, P., Zhang, Q., Yu, F., Wang, W., and Gu, S., 2021, BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction, ArXiv, abs/2102.05426. https://arxiv.org/abs/2102.05426 Code: https://github.com/yhhhli/BRECQ (Tests 2 to 4 bits for weights, and mixed-precision quantization.)
- Yuji Chai, John Gkountouras, Glenn G. Ko, David Brooks, Gu-Yeon Wei, June 2023, INT2.1: Towards Fine-Tunable Quantized Large Language Models with Error Correction through Low-Rank Adaptation, arXiv preprint arXiv:2306.08162, https://arxiv.org/abs/2306.08162
See more papers on 2-bit quantization (INT2) at: https://www.aussieai.com/research/quantization#int3