Aussie AI
Multiplication Arithmetic Optimization
Last Updated 12 September, 2024
by David Spuler, Ph.D.
Multiplication is the foremost bottleneck in the training and inference of neural networks and Transformer architectures. Most models rely on matrix multiplications (whether expressed as tensor operations or convolutions), which decompose into vector dot products, which in turn are sequences of "multiply-and-add" operations (called "multiply-accumulate" or MAC). The multiplication is the more expensive half of each MAC, as the sketch below shows.
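For concreteness, the inner loop that dominates inference cost is a dot product built from MAC steps. Here is a minimal illustrative sketch in C++ (the function name and types are ours, not from any particular library):

    #include <cstddef>

    // The innermost loop of a matrix multiplication: a vector dot product
    // built from multiply-accumulate (MAC) steps. Each iteration performs
    // one multiply and one add; the multiply dominates the cost.
    float dot_product(const float* a, const float* b, size_t n) {
        float acc = 0.0f;
        for (size_t i = 0; i < n; ++i) {
            acc += a[i] * b[i];  // one MAC operation
        }
        return acc;
    }

Everything on this page is, one way or another, about making that multiply cheaper or avoiding it entirely.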
There have been various ideas over the years of AI research as to how to optimize multiplications, including:
- Hardware-accelerated multiplication (lately, this is a GPU's bread-and-butter)
- Faster multiplication arithmetic algorithms
- Approximate multiplication arithmetic algorithms
- Integer multiplication instead of floating-point (see quantization)
- Faster matrix multiplication algorithms
- Avoiding or reducing multiplications (e.g. zero-multiplication models, pruning, zero skipping, sparsity, etc.)
- Advanced math numerical systems
For other arithmetic optimizations, see the related topics listed under "More AI Research" at the end of this article.
Algorithms for Faster Multiplication
Although modern programmers take multiplying two integers for granted, there are complicated algorithms at work behind the scenes (i.e., in the chips). Early algorithms include Karatsuba multiplication (1962), Toom-Cook multiplication, the Schönhage-Strassen algorithm, and contributions by Knuth. The improvement and parallelization of such algorithms are fundamental to GPU and hardware accelerator design. Using these algorithms in software acceleration of model inference seems unlikely to beat hardware acceleration.
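To illustrate the core idea, here is one level of Karatsuba's trick applied to 64-bit operands in C++: the 128-bit product is assembled from three half-width multiplications instead of the schoolbook method's four (an illustrative sketch assuming the GCC/Clang __int128 extension):

    #include <cstdint>

    // One level of Karatsuba multiplication: split each 64-bit operand into
    // 32-bit halves and build the exact 128-bit product from three
    // multiplications (high, low, and middle) instead of schoolbook's four.
    unsigned __int128 karatsuba64(uint64_t x, uint64_t y) {
        uint64_t x1 = x >> 32, x0 = x & 0xFFFFFFFFull;
        uint64_t y1 = y >> 32, y0 = y & 0xFFFFFFFFull;
        uint64_t z2 = x1 * y1;  // product of the high halves
        uint64_t z0 = x0 * y0;  // product of the low halves
        // (x1+x0)*(y1+y0) can exceed 64 bits, so widen before multiplying.
        unsigned __int128 mid =
            (unsigned __int128)(x1 + x0) * (y1 + y0) - z2 - z0;
        return ((unsigned __int128)z2 << 64) + (mid << 32) + z0;
    }

Applied recursively to n-digit numbers, this three-for-four trade gives Karatsuba its O(n^1.585) complexity, beating the schoolbook O(n^2).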
- Erica Klarreich, Multiplication Hits the Speed Limit, Communications of the ACM, January 2020, Vol. 63, No. 1, pp. 11-13, DOI: 10.1145/3371387, https://cacm.acm.org/magazines/2020/1/241707-multiplication-hits-the-speed-limit/abstract
- A. Karatsuba, Yu. Ofman, "Multiplication of many-digital numbers by automatic computers", Dokl. Akad. Nauk SSSR, 145:2 (1962), 293–294, https://www.mathnet.ru/php/archive.phtml?wshow=paper&jrnid=dan&paperid=26729&option_lang=eng
- Stephen A. Cook, "On the Minimum Computation Time of Functions", PhD thesis, Harvard University, Cambridge, Mass., 1966, https://community.ams.org/journals/tran/1969-142-00/S0002-9947-1969-0249212-8/S0002-9947-1969-0249212-8.pdf
- David Harvey, Joris van der Hoeven, "Integer multiplication in time O(n log n)", Annals of Mathematics, Volume 193 (2021), Issue 2, pp. 563-617, https://annals.math.princeton.edu/2021/193-2/p04
- Shmuel Winograd, "Arithmetic Complexity of Computations", CBMS-NSF Regional Conference Series in Applied Mathematics, 1980, https://epubs.siam.org/doi/book/10.1137/1.9781611970364
- Shri Prakash Dwivedi, "An Efficient Multiplication Algorithm Using Nikhilam Method", Fifth International Conference on Advances in Recent Technologies in Communication and Computing (ARTCom 2013), 2013, p.223–228, https://digital-library.theiet.org/content/conferences/10.1049/cp.2013.2209
- Harpreet Singh Dhillon and Abhijit Mitra, "A Reduced-Bit Multiplication Algorithm for Digital Arithmetic", International Journal of Electronics, 2008, https://www.researchgate.net/profile/Harpreet-Dhillon-2/publication/254962192_A_Reduced-Bit_Multiplication_Algorithm_for_Digital_Arithmetic/links/560a8d2008ae576ce6400210/A-Reduced-Bit-Multiplication-Algorithm-for-Digital-Arithmetic.pdf
- Alexander Heinecke, Greg Henry, Maxwell Hutchinson, and Hans Pabst. LIBXSMM: Accelerating Small Matrix Multiplications by Runtime Code Generation. In SC16: International Conference for High Performance Computing, Networking, Storage and Analysis, pages 981–991. IEEE, 2016, https://ieeexplore.ieee.org/document/7877162
- G Alsuhli, V Sakellariou, H Saleh, M Al-Qutayri, Number Systems for Deep Neural Network Architectures: A Survey, 2023, https://arxiv.org/abs/2307.05035
- Jingxuan Yang, Xiaoqin Wang, and Yiying Jian, 17 February 2024, CANET: Quantized Neural Network Inference With 8-bit Carry-Aware Accumulator, https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10445180
- Bo Liu, Grace Li Zhang, Xunzhao Yin, Ulf Schlichtmann, Bing Li, 25 Feb 2024, EncodingNet: A Novel Encoding-based MAC Design for Efficient Neural Network Acceleration, https://arxiv.org/abs/2402.18595 (Hardware design for MAC computations.)
- Darshan C. Ganji, Saad Ashfaq, Ehsan Saboori, Sudhakar Sah, Saptarshi Mitra, MohammadHossein AskariHemmat, Alexander Hoffman, Ahmed Hassanien, Mathieu Léonardon, 18 Apr 2023, DeepGEMM: Accelerated Ultra Low-Precision Inference on CPU Architectures using Lookup Tables, https://arxiv.org/abs/2304.09049
- David Spuler, March 2024, Chapter 53. Arithmetic Optimization Research, Generative AI in C++: Coding Transformers and LLMs, https://www.amazon.com/dp/B0CXJKCWX9
- R. Agrawal, N. S. Abhijith, U. Anil Kumar, S. Veeramachaneni and S. E. Ahmed, 2024, Energy-Efficient Ternary Multiplier, 2024 IEEE 6th International Conference on AI Circuits and Systems (AICAS), Abu Dhabi, United Arab Emirates, 2024, pp. 382-387, doi: 10.1109/AICAS59952.2024.10595938, https://ieeexplore.ieee.org/abstract/document/10595938
Approximate Multiplication Arithmetic
In addition to faster integer multiplication algorithms, there are various ways to approximate multiplication arithmetic using "inexact logic". This means approximating the low-level numeric multiplication of integer or floating-point weights, not to be confused with approximating matrix multiplication in higher-level algorithms. The aim is speed, at the cost of some error in each multiplication. Research exists in this area for both integer and floating-point multiplication.
A simple example of approximating integer multiplication is the use of power-of-two weights and bitshift algorithms (or multiple bitshift-adds), as discussed in the section on weight bitshifting. Some of the more theoretical algorithms for approximating integer multiplication arithmetic are shown below.
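As a minimal sketch of the bitshift idea, assume integer activations and weights already quantized to signed powers of two, with the exponent and sign stored in place of each weight (the function and parameter names here are illustrative):

    #include <cstdint>

    // Multiply an integer activation by a weight quantized to +/- 2^e:
    // the multiplication reduces to a bitshift plus an optional negation.
    // Assumes the exponent e and the sign were precomputed at quantization
    // time. (C++20 defines signed shifts; earlier standards leave left
    // shifts of negative values undefined, so treat this as a sketch.)
    int64_t shift_multiply(int32_t activation, int e, bool negative) {
        int64_t r = (e >= 0) ? ((int64_t)activation << e)
                             : ((int64_t)activation >> -e);  // e < 0 divides by 2^-e
        return negative ? -r : r;
    }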
These methods are interesting, but using them in software inference algorithms seems unlikely to beat hardware acceleration of exact multiplication. Indeed, Kim et al. (2019) note that they benchmarked their new algorithm against a non-GPU version of the models, because it was slower than hardware-accelerated models. Using approximate multiplication algorithms to further speed up multiplication inside hardware accelerators, accepting some error, has been partially explored and seems an area with room for future improvements.
Some of the general papers on approximate multiplication (and other types of approximate arithmetic) include:
- M. A. Hanif, A. Marchisio et al., “X-DNNs: Systematic cross-layer approximations for energy-efficient deep neural networks,” Journal of Low Power Electronics, vol. 14, no. 4, pp. 520–534, Dec. 2018. https://www.semanticscholar.org/paper/X-DNNs:-Systematic-Cross-Layer-Approximations-for-Hanif-Marchisio/5ddaf1aff7d5a4a3484963849828c8d2d1315bc3
- M. Shafique, R. Hafiz, S. Rehman, W. El-Harouni, and J. Henkel, Cross-layer approximate computing: From logic to architectures, Proceedings of the 53rd Annual Design Automation Conference, ACM (2016), p. 99., https://ieeexplore.ieee.org/document/7544342
- S. Mittal, A survey of techniques for approximate computing, ACM Computing Surveys (CSUR) 48, 62 (2016), https://dl.acm.org/doi/10.1145/2893356
- P. Kulkarni, P. Gupta, and M. Ercegovac, Trading accuracy for power with an underdesigned multiplier architecture, 2011 24th International Conference on VLSI Design (VLSI Design), IEEE (2011), pp. 346–351, https://ieeexplore.ieee.org/document/5718826
- V. Gupta, D. Mohapatra, S. P. Park, A. Raghunathan, and K. Roy, Impact: imprecise adders for low-power approximate computing, Proceedings of the 17th IEEE/ACM International Symposium on Low-Power Electronics and Design, IEEE Press (2011), pp. 409–414, https://ieeexplore.ieee.org/document/5993675
- M. T. Teimoori, M. A. Hanif, A. Ejlali, and M. Shafique, AdAM: Adaptive approximation management for the non-volatile memory hierarchies, Design, Automation and Test in Europe Conference and Exhibition (DATE), 2018, IEEE (2018), pp. 785–790, https://ieeexplore.ieee.org/document/8342113
- F. Sampaio, M. Shafique, B. Zatt, S. Bampi, and J. Henkel, Approximation-aware Multi-Level Cells STT-RAM cache architecture, 2015 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES), October (2015), pp. 79–88, https://ieeexplore.ieee.org/abstract/document/7324548
- Open-Source Library of Low-Power Approximate Computing Modules: https://ces.itec.kit.edu/lpACLib.php Code: https://sourceforge.net/projects/lpaclib/
- S. S. Sarwar, S. Venkataramani et al., “Energy-efficient neural computing with approximate multipliers,” J. Emerg. Technol. Comput. Syst., vol. 14, no. 2, pp. 16:1–16:23, Jul. 2018, https://dl.acm.org/doi/10.1145/3097264
- Q. Zhang, T. Wang, Y. Tian, F. Yuan, and Q. Xu, “Approxann: An approximate computing framework for artificial neural network,” in DATE’15, March 2015, pp. 701–706, https://ieeexplore.ieee.org/document/7092478
- M. A. Hanif, R. Hafiz, and M. Shafique, Error resilience analysis for systematically employing approximate computing in convolutional neural networks, Design, Automation and Test in Europe Conference and Exhibition (DATE), 2018, IEEE (2018), pp. 913–916, https://ieeexplore.ieee.org/document/8342139
- K. Bhardwaj, P. S. Mane, J. Henkel, “Power- and Area-Efficient Approximate Wallace Tree Multiplier for Error-Resilience Systems”, ISQED, 2014, https://ieeexplore.ieee.org/document/6783335
- K. Y. Kyaw, W.-L. Goh, K.-S. Yeo, “Low-power high-speed multiplier for error-tolerant application”, IEEE International Conference of Electron Devices and Solid-State Circuits (EDSSC), 2010, https://ieeexplore.ieee.org/document/5713751
- M. B. Sullivan, E. E. Swartzlander, “Truncated error correction for flexible approximate multiplication”, ASILOMAR, pp. 355–359, 2012, https://ieeexplore.ieee.org/document/6489023
- W. El-Harouni, S. Rehman, B. S. Prabakaran, A. Kumar, R. Hafiz, and M. Shafique, Embracing approximate computing for energy-efficient motion estimation in high efficiency video coding, 2017 Design, Automation and Test in Europe Conference and Exhibition (DATE), IEEE (2017), pp. 1384–1389, https://ieeexplore.ieee.org/document/7927209
- M. Brandalero, A. C. S. Beck, L. Carro, and M. Shafique, Approximate on-the-fly coarse-grained reconfigurable acceleration for general-purpose applications, 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC), IEEE (2018), pp. 1–6, https://ieeexplore.ieee.org/document/8465930
- J. Zhang, K. Rangineni, Z. Ghodsi, and S. Garg, ThUnderVolt: Enabling aggressive voltage underscaling and timing error resilience for energy efficient deep neural network accelerators. arXiv preprint arXiv:1802.03806 (2018), https://arxiv.org/abs/1802.03806
- J. S. Miguel, J. Albericio, N. E. Jerger, and A. Jaleel, The bunker cache for spatio-value approximation, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), IEEE (2016), pp. 1–12, https://ieeexplore.ieee.org/document/7783746
- V. Mrazek, S. S. Sarwar, L. Sekanina, Z. Vasicek, and K. Roy, Design of power-efficient approximate multipliers for approximate artificial neural networks, 2016 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), November (2016), pp. 1–7, https://ieeexplore.ieee.org/document/7827658
- S. Kim, P. Howe, T. Moreau, A. Alaghi, L. Ceze, and V. Sathe, MATIC: Learning Around Errors for Efficient Low-Voltage Neural Network Accelerators, Design, Automation and Test in Europe Conference and Exhibition (DATE), 2018, IEEE (2018), pp. 1–6, https://arxiv.org/abs/1706.04332
- S. De, J. Huisken, and H. Corporaal, “Designing energy efficient approximate multipliers for neural acceleration,” in 2018 21st Euromicro Conference on Digital System Design (DSD). IEEE, 2018, pp. 288–295, https://ieeexplore.ieee.org/document/8491830
- V. Mrazek, R. Hrbacek, Z. Vasicek, and L. Sekanina, EvoApprox8b: Library of approximate adders and multipliers for circuit design and benchmarking of approximation methods, Proceedings of the Conference on Design, Automation and Test in Europe, European Design and Automation Association (2017), pp. 258–261, https://ieeexplore.ieee.org/document/7926993
- V. Mrazek, Z. Vasicek, and L. Sekanina, “Design of quality-configurable approximate multipliers suitable for dynamic environment,” in AHS’18, 2018, pp. 264–271, https://ieeexplore.ieee.org/document/8541479
- X. He, L. Ke, W. Lu, G. Yan, and X. Zhang, Axtrain: Hardware-oriented neural network training for approximate inference. arXiv preprint arXiv:1805.08309 (2018), https://arxiv.org/abs/1805.08309v1
- S. Rehman, W. El-Harouni, M. Shafique, A. Kumar, and J. Henkel, Architectural-space exploration of approximate multipliers, Proceedings of the 35th International Conference on Computer-Aided Design, ICCAD’16, ACM, New York, NY, USA (2016), pp. 80:1–80:8, https://ieeexplore.ieee.org/abstract/document/7827657, PDF: https://esim-project.eu/files/user/akumar/pdf/ICCAD_2017_ApproxMult.pdf
- S. Misailovic, M. Carbin, S. Achour, Z. Qi, M. C. Rinard, “Chisel: reliability- and accuracy-aware optimization of approximate computational kernels”, OOPSLA, 309-328, 2014, https://dspace.mit.edu/handle/1721.1/91290
- Y. Wang, Y. Qin, D. Deng, J. Wei, Y. Zhou, Y. Fan, T. Chen, H. Sun, L. Liu, S. Wei et al., “A 28nm 27.5 TOPS/W approximate-computing-based transformer processor with asymptotic sparsity speculating and out-of-order computing,” in 2022 IEEE International Solid-State Circuits Conference (ISSCC), vol. 65. IEEE, 2022, pp. 1–3, https://ieeexplore.ieee.org/document/9731686
- G Alsuhli, V Sakellariou, H Saleh, M Al-Qutayri, Number Systems for Deep Neural Network Architectures: A Survey, 2023, https://arxiv.org/abs/2307.05035 (Various number systems have approximate multiplication properties.)
- Zixuan Ou, Bing Yu, Wenbin Ye, "An Efficient Algorithm-Hardware Co-Design for Radar-Based Fall Detection With Multi-Branch Convolutions", IEEE Transactions on Circuits and Systems I: Regular Papers, vol.70, no.4, pp.1613-1624, 2023. https://ieeexplore.ieee.org/document/10005051
- Mahdi Taheri, Mohammad Hasan Ahmadilivani, Maksim Jenihhin, Masoud Daneshtalab, Jaan Raik, "APPRAISER: DNN Fault Resilience Analysis Employing Approximation Errors", 2023 26th International Symposium on Design and Diagnostics of Electronic Circuits and Systems (DDECS), pp.124-127, 2023. https://ieeexplore.ieee.org/document/10139468, https://arxiv.org/abs/2305.19733
- Efstratios Zacharelos, Italo Nunziata, Gerardo Saggese, Antonio G.M. Strollo, Ettore Napoli, "Approximate Recursive Multipliers Using Low Power Building Blocks", IEEE Transactions on Emerging Topics in Computing, vol.10, no.3, pp.1315-1330, 2022. https://ieeexplore.ieee.org/document/9812478
- Kah Phooi Seng, Li-Minn Ang, "Embedded Intelligence: State-of-the-Art and Research Challenges", IEEE Access, vol.10, pp.59236-59258, 2022. https://ieeexplore.ieee.org/document/9775683, PDF: https://research.usc.edu.au/esploro/outputs/99640278002621
- Antonio Giuseppe Maria Strollo, Ettore Napoli, Davide De Caro, Nicola Petra, Gerardo Saggese, Gennaro Di Meo, "Approximate Multipliers Using Static Segmentation: Error Analysis and Improvements", IEEE Transactions on Circuits and Systems I: Regular Papers, vol.69, no.6, pp.2449-2462, 2022. https://ieeexplore.ieee.org/document/9726786
- Haroon Waris, Chenghua Wang, Chenyu Xu, Weiqiang Liu, "AxRMs: Approximate Recursive Multipliers Using High-Performance Building Blocks", IEEE Transactions on Emerging Topics in Computing, vol.10, no.2, pp.1229-1235, 2022. https://ieeexplore.ieee.org/document/9483645
- Ying Wu, Chuangtao Chen, Weihua Xiao, Xuan Wang, Chenyi Wen, Jie Han, Xunzhao Yin, Weikang Qian, Cheng Zhuo, "A Survey on Approximate Multiplier Designs for Energy Efficiency: From Algorithms to Circuits", ACM Transactions on Design Automation of Electronic Systems, 2023. https://doi.org/10.1145/3610291, https://arxiv.org/abs/2301.12181 (Extensive survey of many approximate multiplication algorithms.)
- Sudeh Shirkavand Saleh Abad, Mohammad Hossein Moaiyeri, "Hardware-accuracy trade-offs for error-resilient applications using an ultra-efficient hybrid approximate multiplier", The Journal of Supercomputing, 2022. https://doi.org/10.1007/s11227-022-04789-6
- Giorgos Armeniakos, Georgios Zervakis, Dimitrios Soudris, Jörg Henkel, "Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey", ACM Computing Surveys, vol.55, no.4, pp.1, 2023. https://doi.org/10.1145/3527156, https://arxiv.org/abs/2203.08737 (Survey of many approximate techniques in AI including logarithmic and non-logarithmic approximate multiplication.)
- X. Jiao, V. Akhlaghi, Yu Jiang, and R. K. Gupta. 2018. Energy-efficient neural networks using approximate computation reuse. Proc. of the 2018 Design, Automation and Test in Europe Conference and Exhibition, (DATE) (2018), 1223–1228. https://ieeexplore.ieee.org/document/8342202, PDF: http://wingtecher.com/themes/WingTecherResearch/assets/papers/date18-energy-efficient.pdf (Caches and reuses approximate computations by using Bloom filters, a data structure similar to hashing.)
- Ao Ren, Ji Li, Zhe Li, Caiwen Ding, Xuehai Qian, Qinru Qiu, Bo Yuan, Yanzhi Wang, 2017, "SC-DCNN: Highly-scalable deep convolutional neural network using stochastic computing", ACM SIGPLAN Notices, vol. 52, no. 4, pp. 405-418, 2017. https://arxiv.org/abs/1611.05939 (Stochastic method with multiplication and addition approximations via AND gates and multiplexers.)
- A Roy, K Roy, 2023, HADES: Hardware/Algorithm Co-design in DNN accelerators using Energy-efficient Approximate Alphabet Set Multipliers, arXiv preprint arXiv:2302.01990, https://arxiv.org/abs/2302.01990
- Dimitrios Danopoulos, Georgios Zervakis, Dimitrios Soudris, Jörg Henkel, 12 Feb 2024, TransAxx: Efficient Transformers with Approximate Computing, https://arxiv.org/abs/2402.07545 (Using approximations in Vision Transformer architectures.)
- Salar Shakibhamedan, Amin Aminifar, Nima TaheriNejad, Axel Jantsch, 2024, EASE: Energy Optimization through Adaptation — A Review of Runtime Energy-Aware Approximate Deep Learning Algorithms, https://eclectx.org/Publications/2024_M13.pdf (Survey paper on techniques for adaptive inference with a focus on approximations of inference, including loop performance, stochastic algorithms, approximate arithmetic, quantization, pruning and low-rank.)
- Elias Trommer. 2022. agn-approx Software Repository. https://github.com/etrommer/agn-approx
- O Spantidi, I Anagnostopoulos, 2023, The Perfect Match: Selecting Approximate Multipliers for Energy-Efficient Neural Network Inference, https://ieeexplore.ieee.org/abstract/document/10147918/
- M. Imani, M. Masich, D. Peroni, P. Wang, and T. Rosing, 2018, “Canna: Neural network acceleration using configurable approximation on gpgpu,” in 2018 23rd Asia and South Pacific Design Automation Conference (ASP-DAC). IEEE, 2018, pp. 682–689. https://ieeexplore.ieee.org/document/8297401
- S. S. Sarwar, S. Venkataramani, A. Raghunathan, and K. Roy, “Multiplier-less artificial neurons exploiting error resiliency for energy-efficient neural computing,” in Proc. Des. Autom. Test Eur. Conf. Exhib., 2016, pp. 145–150. https://arxiv.org/abs/1602.08557
- Khosravi, S., Kamran, A, 2024, Iterative construction of energy and quality-efficient approximate multipliers utilizing lower bit-length counterparts. J Supercomput 80, 19210–19247 (2024). https://doi.org/10.1007/s11227-024-06212-8 https://link.springer.com/article/10.1007/s11227-024-06212-8
- Lingyun Yao, Martin Trapp, Jelin Leslin, Gaurav Singh, Peng Zhang, Karthekeyan Periasamy, Martin Andraud, 22 May 2024, On Hardware-efficient Inference in Probabilistic Circuits, https://arxiv.org/abs/2405.13639
- David Spuler, March 2024, Chapter 51. Zero-Multiplication Models, Generative AI in C++: Coding Transformers and LLMs, https://www.amazon.com/dp/B0CXJKCWX9
- H. Saadat, H. Bokhari and S. Parameswaran, 2018, Minimally Biased Multipliers for Approximate Integer and Floating-Point Multiplication, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 37, no. 11, pp. 2623-2635, Nov. 2018, doi: 10.1109/TCAD.2018.2857262, https://ieeexplore.ieee.org/document/8493590
Logarithmic Approximate Multiplication
The most common method of approximating multiplication is to add the logarithms of the two numbers, handled more generally than via simple bitshifting. This approach is similar to logarithmic quantization (power-of-two quantization). The papers listed below specifically use logarithmic approximation methods.
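As a concrete illustration, here is a software sketch of Mitchell's classic method (1962) for unsigned integers: each operand 2^k * (1+f) has its log2 approximated as k + f, the two logs are added, and the antilog is approximated the same way in reverse. This is an illustrative fixed-point version, not production code:

    #include <cstdint>

    // Mitchell's approximate multiplication: write each operand as
    // 2^k * (1+f) with f in [0,1), approximate log2(x) by k + f, add the
    // two logs, then approximate the antilog as 2^k * (1+f) again.
    // Worst-case relative error is about 11%.
    uint64_t mitchell_mul(uint32_t a, uint32_t b) {
        if (a == 0 || b == 0) return 0;
        int ka = 31 - __builtin_clz(a);   // position of the leading 1 bit
        int kb = 31 - __builtin_clz(b);
        // Fractional parts as 32-bit fixed point: f = (x - 2^k) / 2^k.
        uint64_t fa = ((uint64_t)(a - (1u << ka)) << 32) >> ka;
        uint64_t fb = ((uint64_t)(b - (1u << kb)) << 32) >> kb;
        int k = ka + kb;
        uint64_t f = fa + fb;             // add the approximate logarithms
        if (f >> 32) {                    // fraction reached 1.0:
            k += 1;                       // carry into the integer part
            f -= (1ull << 32);
        }
        // Approximate antilog: 2^(k+f) ~= 2^k * (1 + f).
        uint64_t frac = (k >= 32) ? (f << (k - 32)) : (f >> (32 - k));
        return (1ull << k) + frac;
    }

For example, mitchell_mul(10000, 3000) returns 28,278,784, about 5.7% below the exact 30,000,000; the result is exact whenever either operand is a power of two.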
- P. Gysel, J. Pimentel et al., “Ristretto: A framework for empirical study of resource-efficient inference in convolutional neural networks,” IEEE Trans. Neural Netw. Learn. Syst., 2018, https://ieeexplore.ieee.org/abstract/document/8318896
- Min Soo Kim; Alberto A. Del Barrio; Leonardo Tavares Oliveira; Román Hermida; Nader Bagherzadeh, "Efficient Mitchell’s Approximate Log Multipliers for Convolutional Neural Networks", IEEE Transactions on Computers, Volume 68 Issue 5, p.660-675, November 2018, https://ieeexplore.ieee.org/abstract/document/8532287
- T. Hokchhay, S. Hashemi, R. I. Bahar, and S. Reda, “Hardware-software codesign of accurate, multiplier-free deep neural networks,” in Proc. 54th Annu. Design Autom. Conf. (DAC), 2017, pp. 1–6., https://arxiv.org/abs/1705.04288
- M. S. Ansari, B. F. Cockburn, and J. Han, “An improved logarithmic multiplier for energy-efficient neural computing,” IEEE Transactions on Computers, 2020, https://ieeexplore.ieee.org/document/9086744
- J. N. Mitchell, "Computer multiplication and division using binary logarithms," IEEE Trans. Electron. Comput., vol. EC-11, no. 4, pp. 512–517, Aug. 1962, https://ieeexplore.ieee.org/document/5219391
- Z. Babic, A. Avramovic, and P. Bulic, "An iterative logarithmic multiplier," Microprocess. Microsyst., vol. 35, no. 1, pp. 23–33, Feb. 2011, https://dl.acm.org/doi/10.1016/j.micpro.2010.07.001
- U. Lotric and P. Bulic, "Applicability of approximate multipliers in hardware neural networks," Neurocomput., vol. 96, pp. 57–65, Nov. 2012, https://dl.acm.org/doi/10.1016/j.neucom.2011.09.039
- Z. Du, K. Palem, A. Lingamneni, O. Temam, Y. Chen, and C. Wu, "Leveraging the error resilience of machine-learning applications for designing highly energy efficient accelerators," in Proc. 19th Asia South Pacific Des. Autom. Conf., 2014, pp. 201–206, https://pages.saclay.inria.fr/olivier.temam/files/eval/DLCPTW2014.pdf
- M. S. Kim, A. A. D. Barrio, R. Hermida, and N. Bagherzadeh, "Low-power implementation of Mitchell’s approximate logarithmic multiplication for convolutional neural networks," in Proc. 23rd Asia South Pacific Des. Autom. Conf., 2018, pp. 617–622, https://ieeexplore.ieee.org/document/8297391 (Approximate logarithm approach using the Logarithm Number System.)
- S. S. Sarwar, S. Venkataramani, A. Raghunathan, and K. Roy, "Multiplier-less artificial neurons exploiting error resiliency for energy-efficient neural computing," in Proc. Des. Autom. Test Eur. Conf. Exhib., 2016, pp. 145–150, https://arxiv.org/abs/1602.08557
- Rezaei S, Omidi R and Azarpeyvand A. (2022), Logarithm-approximate floating-point multiplier, Microelectronics Journal, 127:C, Volume 127, September 2022, 105521, https://doi.org/10.1016/j.mejo.2022.105521
- M. Skrbek, Fast neural network implementation, Neural Network World 5 (1999), 375–391, https://www.researchgate.net/publication/265303033_Fast_neural_network_implementation (Uses shift-add methods.)
- T. Mogami, Deep neural network training without multiplications, In Beyond BackPropagation WS at 34th Conference on Neural Information Processing Systems, 2020, https://arxiv.org/abs/2012.03458 (This multiplication of floating-point numbers with integer addition is effectively using Mitchell's approximate multiplication.)
- G Alsuhli, V Sakellariou, H Saleh, M Al-Qutayri, Number Systems for Deep Neural Network Architectures: A Survey, 2023, https://arxiv.org/abs/2307.05035
- Y Wu, C Chen, W Xiao, X Wang, C Wen, J Han, A Survey on Approximate Multiplier Designs for Energy Efficiency: From Algorithms to Circuits, arXiv preprint, 2023, https://arxiv.org/abs/2301.12181
- Durgesh Nandan; Jitendra Kanungo; Anurag Mahajan, 2017, An efficient VLSI architecture for iterative logarithmic multiplier, 2017 4th International Conference on Signal Processing and Integrated Networks (SPIN), February 2017, https://ieeexplore.ieee.org/document/8049986 (Uses LNS and Mitchell's approximate multiplication algorithm.)
- Uroš Lotrič, Ratko Pilipović, Patricio Bulić, "A Hybrid Radix-4 and Approximate Logarithmic Multiplier for Energy Efficient Image Processing", Electronics, vol.10, no.10, pp.1175, 2021. https://doi.org/10.3390/electronics10101175
- J Cai, 2022, Log-or-Trig: Towards efficient learning in deep neural networks Thesis, Graduate School of Engineering, Tokyo University of Agriculture and Technology, https://tuat.repo.nii.ac.jp/?action=repository_action_common_download&item_id=1994&item_no=1&attribute_id=16&file_no=3, PDF: https://tuat.repo.nii.ac.jp/index.php?action=pages_view_main&active_action=repository_action_common_download&item_id=1994&item_no=1&attribute_id=16&file_no=1&page_id=13&block_id=39 (Examines logarithmic LNS multiplication and also trigonometric methods.)
- Mark Arnold, 2023, Machine Learning using Logarithmic Arithmetic with Preconditioned Input to Mitchell's Method, 2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS), https://ieeexplore.ieee.org/abstract/document/10168554/
Non-Logarithmic Approximate Multiplication
There are various alternatives to logarithmic arithmetic for approximating multiplication, including truncated, segmented, and rounding-based multiplier designs.
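One recurring non-logarithmic idea, used by dynamic-segment designs such as DRUM (Hashemi et al., 2015, listed below) and by static-segmentation multipliers, is to multiply only a short window of each operand's most significant bits and then shift the short product back into place. Here is a simplified software sketch of that idea (the published designs are hardware circuits; DRUM also forces the lowest retained bit to 1 to reduce bias, which is omitted here):

    #include <cstdint>

    // Dynamic-segment approximate multiply: keep only the m most significant
    // bits of each operand (starting at its leading 1 bit), multiply the two
    // short segments, and shift the result back up. A software sketch of the
    // segmentation idea; real designs add unbiasing tweaks in hardware.
    uint64_t segment_mul(uint32_t a, uint32_t b, int m) {
        if (a == 0 || b == 0) return 0;
        int ka = 31 - __builtin_clz(a);        // leading-1 position in a
        int kb = 31 - __builtin_clz(b);
        int sa = (ka >= m) ? ka - m + 1 : 0;   // low bits truncated from a
        int sb = (kb >= m) ? kb - m + 1 : 0;
        uint64_t prod = (uint64_t)(a >> sa) * (b >> sb);  // short multiply
        return prod << (sa + sb);              // rescale to full magnitude
    }

With m = 8, a full 32x32-bit multiply shrinks to an 8x8-bit one, trading a bounded relative error for a much smaller multiplier circuit.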
- V. Mrazek, S. S. Sarwar, L. Sekanina, Z. Vasicek, and K. Roy, “Design of power-efficient approximate multipliers for approximate artificial neural networks,” in Proc. IEEE/ACM Int. Conf. Comput.-Aided Des., 2016, pp. 1–7, https://ieeexplore.ieee.org/document/7827658
- Low-power and high-speed shift-based multiplier for error tolerant applications, Sami Malek, Sarah Abdallah, Ali Chehab, Imad H. Elhajj, Ayman Kayssi, Microprocessors and Microsystems, Volume 52, July 2017, Pages 566-574, https://doi.org/10.1016/j.micpro.2017.07.002
- S. Hashemi, R. I. Bahar, and S. Reda, “Drum: A dynamic range unbiased multiplier for approximate applications,” in Proc. IEEE/ACM Int. Conf. Comput.-Aided Des., 2015, pp. 418–425, https://ieeexplore.ieee.org/document/7372600
- A. Momeni, J. Han, P. Montuschi, and F. Lombardi, “Design and analysis of approximate compressors for multiplication,” IEEE Trans. Comput., vol. 64, no. 4, pp. 984–994, Apr. 2015, https://ieeexplore.ieee.org/document/6748013
- K. Y. Kyaw, W. L. Goh, and K. S. Yeo, “Low-power high-speed multiplier for error-tolerant application,” in Proc. IEEE Int. Conf. Electron Dev. Solid-State Circuits, 2010, pp. 1–4, https://ieeexplore.ieee.org/document/5713751
- C. Liu, J. Han, and F. Lombardi, “A low-power, high-performance approximate multiplier with configurable partial error recovery,” in Proc. Des. Autom. Test Eur. Conf. Exhib., 2014, pp. 1–4, https://ieeexplore.ieee.org/document/6800309
- P. Kulkarni, P. Gupta, and M. Ercegovac, “Trading accuracy for power with an underdesigned multiplier architecture,” in Proc. 24th Int. Conf. VLSI Des., 2011, pp. 346–351, https://ieeexplore.ieee.org/document/5718826
- S. Narayanamoorthy, H. A. Moghaddam, Z. Liu, T. Park, and N. S. Kim, “Energy-efficient approximate multiplication for digital signal processing and classification applications,” IEEE Trans. Very Large Scale Integr. Syst., vol. 23, no. 6, pp. 1180–1184, Jun. 2015, https://ieeexplore.ieee.org/document/6858039
- R. Zendegani, M. Kamal, M. Bahadori, A. Afzali-Kusha, and M. Pedram, “Roba multiplier: A rounding-based approximate multiplier for high-speed yet energy-efficient digital signal processing,” IEEE Trans. Very Large Scale Integr. Syst., vol. 25, no. 2, pp. 393–401, Feb. 2017, https://ieeexplore.ieee.org/document/7517375
- H.R. Mahdiani, A. Ahmadi, S.M. Fakhraie, C. Lucas, “Bio-Inspired Imprecise Computational Blocks for Efficient VLSI Implementation of Soft-Computing Applications,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 57, no. 4, pp. 850-862, April 2010, https://dl.acm.org/doi/10.1109/TCSI.2009.2027626
- G Alsuhli, V Sakellariou, H Saleh, M Al-Qutayri, Number Systems for Deep Neural Network Architectures: A Survey, 2023, https://arxiv.org/abs/2307.05035
- Y Wu, C Chen, W Xiao, X Wang, C Wen, J Han, A Survey on Approximate Multiplier Designs for Energy Efficiency: From Algorithms to Circuits, arXiv preprint, 2023, https://arxiv.org/abs/2301.12181
More AI Research
Read more about:
- Zero-Multiplication Models
- Approximation Algorithms
- Advanced AI Mathematics
- Matrix Algebra
- Logarithmic Models
- Inference Optimizations
- Code Optimizations