Aussie AI

Division Optimization

Last Updated 22 May, 2025

by David Spuler, Ph.D.

Division is an expensive operation and has been largely avoided in neural networks, with preference given to multiplication, addition, and bitwise operators. However, there is some research on division arithmetic. Related research areas include:

Division algorithms: Faster ways to implement division, now mainly of interest to hardware designers.
Approximate division algorithms: see below.
Power-of-two quantization: Bitshifting can optimize division, as it can for multiplication. A right bitshift replaces integer division by a power-of-two divisor: the result is exact for unsigned integers, whereas for negative signed values a plain arithmetic shift rounds toward negative infinity rather than toward zero. This is relevant to logarithmic quantization.
Integer division: For some thoughts on the use of general integer division of weights in quantization, see division quantization.
Advanced number system division: See dyadic numbers and dyadic quantization for an obscure number system involving power-of-two division.
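As a minimal sketch of the right-bitshift optimization mentioned in the list above, the following C++ functions divide by a power of two using shifts. The function names are hypothetical and chosen only for illustration. The signed version assumes that right-shifting a negative value is an arithmetic shift, which is guaranteed from C++20 and is the usual behavior on mainstream compilers before that.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned division by 2^k: a right shift gives exactly x / 2^k.
// Requires k < 32.
constexpr std::uint32_t div_pow2_unsigned(std::uint32_t x, unsigned k) {
    return x >> k;
}

// Signed division by 2^k that truncates toward zero, matching C++ '/'.
// A plain arithmetic shift rounds toward negative infinity, so negative
// inputs get a bias of (2^k - 1) added before shifting.
// Requires k < 31; assumes '>>' on negative values is an arithmetic shift.
constexpr std::int32_t div_pow2_signed(std::int32_t x, unsigned k) {
    const std::int32_t bias = (x < 0) ? ((std::int32_t{1} << k) - 1) : 0;
    return (x + bias) >> k;
}

// Small self-check comparing the shift versions against ordinary division.
inline void demo_div_pow2() {
    assert(div_pow2_unsigned(100u, 3) == 100u / 8u);
    assert(div_pow2_signed(-7, 1) == -7 / 2);
    assert(div_pow2_signed(7, 1) == 7 / 2);
}
```

Optimizing compilers typically perform this transformation automatically when the divisor is a compile-time constant, so the explicit shift matters most when the exponent k is only known at run time, as with power-of-two quantization scales.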

Division Algorithms and Approximate Division

Research papers and resources on fast and approximate division algorithms:

LibDivide, https://libdivide.com/ and https://github.com/ridiculousfish/libdivide
Benchmarking division and libdivide on Apple M1 and Intel AVX512, May 12th, 2021, https://ridiculousfish.com/blog/posts/benchmarking-libdivide-m1-avx512.html
S. Hashemi, R. Bahar, and S. Reda, A low-power dynamic divider for approximate applications, Proceedings of the 53rd Annual Design Automation Conference, ACM (2016), p. 105, https://ieeexplore.ieee.org/document/7544348
Suganthi Venkatachalam; Elizabeth Adams; Seok-Bum Ko, May 2019, Design of approximate restoring dividers, 2019 IEEE International Symposium on Circuits and Systems (ISCAS), https://ieeexplore.ieee.org/document/8702363
Chitlu Subhasri, Bhaskara Rao Jammu, L. Guna Sekhar Sai Harsha, Nalini Bodasingi, Visweswara Rao Samoju, Hardware‐efficient approximate logarithmic division with improved accuracy, International Journal of Circuit Theory and Applications, 2021, Wiley Online Library, https://onlinelibrary.wiley.com/doi/abs/10.1002/cta.2900, https://doi.org/10.1002/cta.2900
Mohsen Imani; Ricardo Garcia; Andrew Huang; Tajana Rosing, May 2019, CADE: Configurable approximate divider for energy efficiency, 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE) https://ieeexplore.ieee.org/document/8715112
Hiromasa Nakayama, June 2006, Algorithm computing the local b function by an approximate division algorithm in D, https://www.sciencedirect.com/science/article/pii/S0747717108001375, https://arxiv.org/abs/math/0606437
Jackson Melchert; Setareh Behroozi; Jingjie Li; Younghyun Kim, 2019, SAADI-EC: A quality-configurable approximate divider for energy efficiency, IEEE Transactions on Very Large Scale Integration (VLSI) Systems (Volume 27, Issue 11, November 2019, pp.2680-2692), https://ieeexplore.ieee.org/document/8766885
Reza Zendegani; Mehdi Kamal; Arash Fayyazi; Ali Afzali-Kusha; Saeed Safari; Massoud Pedram, 2016, SEERAD: A high speed yet energy-efficient rounding-based approximate divider, 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), 14-18 March 2016, https://ieeexplore.ieee.org/document/7459545
X Li, B Liu, RH Yang, V Courville, C Xing, VP Nia, 2023, DenseShift: Towards Accurate and Efficient Low-Bit Power-of-Two Quantization, Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), https://openaccess.thecvf.com/content/ICCV2023/papers/Li_DenseShift_Towards_Accurate_and_Efficient_Low-Bit_Power-of-Two_Quantization_ICCV_2023_paper.pdf (Shows how division by a power-of-two, which is a bitshift in integers, can be done using integer addition on the sign and exponent bits of a floating point number.)
David Spuler, March 2024, Chapter 53. Arithmetic Optimization Research, Generative AI in C++: Coding Transformers and LLMs, https://www.amazon.com/dp/B0CXJKCWX9
David Monniaux, Alice Pain, 18 Jul 2022, Formally verified 32- and 64-bit integer division using double-precision floating-point arithmetic, https://arxiv.org/abs/2207.08420
Michael Lunglmayr, 9 Sep 2022 (v2), Efficient Non-sequential Division for FPGAs, https://arxiv.org/abs/2105.05747
X. Fang, Y. Wang, L. Chen and F. An, "A Reconfigurable Floating-Point Division and Square Root Architecture for High-Precision Softmax," Jan 2025, IEEE Transactions on Circuits and Systems I: Regular Papers, doi: 10.1109/TCSI.2024.3524307. https://ieeexplore.ieee.org/abstract/document/10830536/
Zahra Ebrahimi, Muhammad Zaid, Mark Wijtvliet, Akash Kumar, 28 Jun 2022, RAPID: AppRoximAte Pipelined Soft Multipliers and Dividers for High-Throughput and Energy-Efficiency, https://arxiv.org/abs/2206.13970
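
The note on the DenseShift entry above describes dividing a floating-point number by a power of two using integer arithmetic on its exponent bits. The following is a rough C++ sketch of that general idea, not the paper's own implementation. The function name is hypothetical, and the code assumes IEEE 754 32-bit floats, a normal (finite, nonzero, non-denormal) input, and a result that remains in the normal range.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: divide x by 2^k with a single integer subtraction
// on the float's bit pattern. The exponent field starts at bit 23, so
// subtracting (k << 23) lowers the biased exponent by k and leaves the
// sign and mantissa bits untouched.
// Assumptions: IEEE 754 binary32, x is a normal finite nonzero value,
// and the exponent does not underflow into the denormal range.
inline float div_pow2_float(float x, unsigned k) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch expects a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);          // reinterpret float as integer
    bits -= static_cast<std::uint32_t>(k) << 23;  // exponent -= k
    float result;
    std::memcpy(&result, &bits, sizeof result);   // reinterpret back to float
    return result;
}

// Small self-check against ordinary floating-point division.
inline void demo_div_pow2_float() {
    assert(div_pow2_float(8.0f, 2) == 8.0f / 4.0f);
    assert(div_pow2_float(-12.5f, 1) == -12.5f / 2.0f);
}
```

Zero, infinities, NaN, and denormal values are not handled here; a production version would need to check for them or fall back to ordinary division.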
