Aussie AI

Approximate Multiplication

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.


In addition to faster integer multiplication algorithms, there are various ways to approximate multiplication arithmetic using “inexact logic.” This means approximating the low-level numeric multiplication of integer or floating-point weights, not to be confused with approximating matrix multiplication in higher-level algorithms. The aim is speed, in exchange for some error in each individual multiplication. Research exists in this area for both integer and floating-point multiplication.
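
To make the idea of “inexact logic” concrete, consider a truncated multiplier, which discards low-order bits of each operand and multiplies the shortened values (see the Sullivan and Swartzlander paper in the list below). The C++ sketch below is only a software model of that trade-off, with an illustrative function name; the real benefit is a smaller multiplier circuit in hardware, not software speed.

    #include <cstdint>
    #include <cstdio>

    // Software model of a truncated multiplier: drop the low-order bits
    // of each operand, multiply the shorter values, then rescale.
    // This demonstrates the accuracy trade-off only; the speed and
    // energy gains come from smaller circuits in hardware.
    uint32_t truncated_multiply(uint32_t a, uint32_t b, int dropped_bits) {
        uint32_t ta = a >> dropped_bits;   // truncated operands
        uint32_t tb = b >> dropped_bits;
        return (ta * tb) << (2 * dropped_bits);  // rescale the result
    }

    int main() {
        // Exact result is 1,000,000; this prints 984064 (about 1.6% low).
        printf("%u\n", truncated_multiply(1000, 1000, 4));
        return 0;
    }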

A simple example of approximating integer multiplication is the use of power-of-two weights and bitshift algorithms (or multiple bitshift-adds), as discussed in the section on weight bitshifting. Some of the more theoretical algorithms for approximating integer multiplication arithmetic are shown below.
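
As a minimal sketch of that bitshifting idea, the C++ code below rounds a weight to its nearest power of two and replaces the multiplication with a single shift; in a real engine the exponent would be precomputed once for each quantized weight. The helper names are illustrative, and __builtin_clz assumes a GCC/Clang-style compiler.

    #include <cstdint>

    // Exponent of the power of two nearest to w (w must be nonzero),
    // rounding at the arithmetic midpoint 1.5 * 2^k.
    static inline int nearest_pow2_exponent(uint32_t w) {
        int k = 31 - __builtin_clz(w);            // floor(log2(w))
        if (k > 0 && w > (3u << (k - 1))) ++k;    // closer to 2^(k+1)
        return k;
    }

    // Approximate a * w as a single left shift by the rounded exponent.
    static inline uint32_t approx_mul_shift(uint32_t a, uint32_t w) {
        return (w == 0) ? 0 : (a << nearest_pow2_exponent(w));
    }

A bitshift-add variant approximates the weight with two power-of-two terms, 2^k ± 2^j, trading an extra shift-and-add for lower error.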

These methods are interesting, but using them in software inference algorithms is unlikely to beat hardware acceleration of exact multiplication. In fact, Kim et al. (2019) note that they tested their new algorithm against non-GPU versions of the models, because it was slower than hardware-accelerated models. The possible use of these approximate multiplication algorithms to further speed up multiplication inside hardware accelerators, accepting some error, has been somewhat explored, but remains an area with room for future improvement.

Nevertheless, there has been an explosion of papers on approximate multiplication algorithms and their use in model inference and training. For analysis of low-level approximate multiplication algorithms and their theory, including logarithmic and non-logarithmic approximate multiplication, see advanced AI mathematics. Also related are the logarithmic number system (LNS) and other less common number systems such as dyadic numbers, the residue number system (RNS), and the posit number system (PNS); see advanced number systems. See also additive neural networks and multiplier-free inference.
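
To make the logarithmic family concrete, here is a minimal C++ sketch of Mitchell's classic logarithmic approximate multiplier for unsigned 16-bit integers. It approximates log2(x) as k + f, where x = 2^k * (1 + f), adds the two approximate logarithms, and takes the antilogarithm; the result always underestimates, with a worst-case error of roughly 11%. The fixed-point layout and function name are illustrative, and __builtin_clz assumes GCC/Clang.

    #include <cstdint>

    // Mitchell's logarithmic approximate multiplication (sketch).
    // Fractional parts of the logarithms use 16-bit fixed point.
    uint32_t mitchell_multiply(uint16_t a, uint16_t b) {
        if (a == 0 || b == 0) return 0;
        int ka = 31 - __builtin_clz(a);   // floor(log2(a))
        int kb = 31 - __builtin_clz(b);
        // Fractional parts f = (x - 2^k) / 2^k, in 16 fractional bits.
        uint32_t fa = ((uint32_t)(a - (1u << ka)) << 16) >> ka;
        uint32_t fb = ((uint32_t)(b - (1u << kb)) << 16) >> kb;
        int k = ka + kb;
        uint32_t f = fa + fb;             // add the approximate logs
        if (f >= (1u << 16)) {            // fractional sum reached 1.0
            ++k;
            f -= (1u << 16);
        }
        // Antilogarithm: 2^k * (1 + f).
        return (uint32_t)((((uint64_t)(1u << 16) + f) << k) >> 16);
    }

For example, mitchell_multiply(3, 3) returns 8 rather than 9, the well-known worst case where both fractional parts are 0.5.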

Papers focused on approximate multiplication algorithms:

  1. M. A. Hanif, A. Marchisio et al., 2018, X-DNNs: Systematic cross-layer approximations for energy-efficient deep neural networks, Journal of Low Power Electronics, vol. 14, no. 4, pp. 520–534, Dec. 2018. https://www.semanticscholar.org/paper/X-DNNs:-Systematic-Cross-Layer-Approximations-for-Hanif-Marchisio/5ddaf1aff7d5a4a3484963849828c8d2d1315bc3
  2. M. Shafique, R. Hafiz, S. Rehman, W. El-Harouni, and J. Henkel, 2016, Cross-layer approximate computing: From logic to architectures, Proceedings of the 53rd Annual Design Automation Conference, ACM (2016), p. 99, https://ieeexplore.ieee.org/document/7544342
  3. S. Mittal, 2016, A survey of techniques for approximate computing, ACM Computing Surveys (CSUR) 48, 62 (2016), https://dl.acm.org/doi/10.1145/2893356
  4. P. Kulkarni, P. Gupta, and M. Ercegovac, 2011, Trading accuracy for power with an underdesigned multiplier architecture, 2011 24th International Conference on VLSI Design (VLSI Design), IEEE (2011), pp. 346–351, https://ieeexplore.ieee.org/document/5718826
  5. V. Gupta, D. Mohapatra, S. P. Park, A. Raghunathan, and K. Roy, 2011, Impact: imprecise adders for low-power approximate computing, Proceedings of the 17th IEEE/ACM International Symposium on Low-Power Electronics and Design, IEEE Press (2011), pp. 409–414, https://ieeexplore.ieee.org/document/5993675
  6. M. T. Teimoori, M. A. Hanif, A. Ejlali, and M. Shafique, 2018, AdAM: Adaptive approximation management for the non-volatile memory hierarchies, Design, Automation and Test in Europe Conference and Exhibition (DATE), 2018, IEEE (2018), pp. 785–790, https://ieeexplore.ieee.org/document/8342113
  7. F. Sampaio, M. Shafique, B. Zatt, S. Bampi, and J. Henkel, 2015, Approximation-aware Multi-Level Cells STT-RAM cache architecture, 2015 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES), October (2015), pp. 79–88, https://ieeexplore.ieee.org/abstract/document/7324548
  8. Muhammad Shafique, Rehan Hafiz, Semeen Rehman, Walaa El-Harouni, Jörg Henkel, 2016, A Low Latency Generic Accuracy Configurable Adder, in 53rd ACM/EDAC/IEEE Design Automation Conference & Exhibition (DAC), 2016. (Open-Source Library of Low-Power Approximate Computing Modules, 2023), Code: https://ces.itec.kit.edu/lpACLib.php Code: https://sourceforge.net/projects/lpaclib/
  9. S. S. Sarwar, S. Venkataramani et al., 2018, Energy-efficient neural computing with approximate multipliers, J. Emerg. Technol. Comput. Syst., vol. 14, no. 2, pp. 16:1–16:23, Jul. 2018, https://dl.acm.org/doi/10.1145/3097264
  10. Q. Zhang, T. Wang, Y. Tian, F. Yuan, and Q. Xu, 2015, ApproxANN: An approximate computing framework for artificial neural network, in DATE’15, March 2015, pp. 701–706, https://ieeexplore.ieee.org/document/7092478
  11. M. A. Hanif, R. Hafiz, and M. Shafique, 2018, Error resilience analysis for systematically employing approximate computing in convolutional neural networks, Design, Automation and Test in Europe Conference and Exhibition (DATE), 2018, IEEE (2018), pp. 913–916, https://ieeexplore.ieee.org/document/8342139
  12. K. Bhardwaj, P. S. Mane, J. Henkel, 2014, Power- and Area-Efficient Approximate Wallace Tree Multiplier for Error-Resilience Systems, ISQED, 2014, https://ieeexplore.ieee.org/document/6783335
  13. K. Y. Kyaw, W.-L. Goh, K.-S. Yeo, 2010, Low-power high-speed multiplier for error-tolerant application, IEEE International Conference of Electron Devices and Solid-State Circuits (EDSSC), 2010, https://ieeexplore.ieee.org/document/5713751
  14. M. B. Sullivan, E. E. Swartzlander, 2012, Truncated error correction for flexible approximate multiplication, ASILOMAR, pp. 355–359, 2012, https://ieeexplore.ieee.org/document/6489023
  15. W. El-Harouni, S. Rehman, B. S. Prabakaran, A. Kumar, R. Hafiz, and M. Shafique, 2017, Embracing approximate computing for energy-efficient motion estimation in high efficiency video coding, 2017 Design, Automation and Test in Europe Conference and Exhibition (DATE), IEEE (2017), pp. 1384–1389, https://ieeexplore.ieee.org/document/7927209
  16. M. Brandalero, A. C. S. Beck, L. Carro, and M. Shafique, 2018, Approximate on-the-fly coarse-grained reconfigurable acceleration for general-purpose applications, 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC), IEEE (2018), pp. 1–6, https://ieeexplore.ieee.org/document/8465930
  17. J. Zhang, K. Rangineni, Z. Ghodsi, and S. Garg, 2018, ThUnderVolt: Enabling aggressive voltage underscaling and timing error resilience for energy efficient deep neural network accelerators, arXiv preprint arXiv:1802.03806 (2018), https://arxiv.org/abs/1802.03806
  18. J. S. Miguel, J. Albericio, N. E. Jerger, and A. Jaleel, 2016, The bunker cache for spatio-value approximation, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), IEEE (2016), pp. 1–12, https://ieeexplore.ieee.org/document/7783746
  19. V. Mrazek, S. S. Sarwar, L. Sekanina, Z. Vasicek, and K. Roy, 2016, Design of power-efficient approximate multipliers for approximate artificial neural networks, 2016 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), November (2016), pp. 1–7, https://ieeexplore.ieee.org/document/7827658
  20. S. Kim, P. Howe, T. Moreau, A. Alaghi, L. Ceze, and V. Sathe, 2018, MATIC: Learning Around Errors for Efficient Low-Voltage Neural Network Accelerators, Design, Automation and Test in Europe Conference and Exhibition (DATE), 2018, IEEE (2018), pp. 1–6, https://arxiv.org/abs/1706.04332
  21. S. De, J. Huisken, and H. Corporaal, 2018, Designing energy efficient approximate multipliers for neural acceleration, in 2018 21st Euromicro Conference on Digital System Design (DSD), IEEE, 2018, pp. 288–295, https://ieeexplore.ieee.org/document/8491830
  22. V. Mrazek, R. Hrbacek, Z. Vasicek, and L. Sekanina, 2017, EvoApprox8b: Library of approximate adders and multipliers for circuit design and benchmarking of approximation methods, Proceedings of the Conference on Design, Automation and Test in Europe, European Design and Automation Association (2017), pp. 258–261, https://ieeexplore.ieee.org/document/7926993
  23. V. Mrazek, Z. Vasicek, and L. Sekanina, 2018, Design of quality-configurable approximate multipliers suitable for dynamic environment, in AHS’18, 2018, pp. 264–271, https://ieeexplore.ieee.org/document/8541479
  24. X. He, L. Ke, W. Lu, G. Yan, and X. Zhang, 2018, AxTrain: Hardware-oriented neural network training for approximate inference, arXiv preprint arXiv:1805.08309 (2018), https://arxiv.org/abs/1805.08309v1
  25. S. Rehman, W. El-Harouni, M. Shafique, A. Kumar, and J. Henkel, 2016, Architectural-space exploration of approximate multipliers, Proceedings of the 35th International Conference on Computer-Aided Design, ICCAD’16, ACM, New York, NY, USA (2016), pp. 80:1–80:8, https://ieeexplore.ieee.org/abstract/document/7827657, PDF: https://esim-project.eu/files/user/akumar/pdf/ICCAD_2017_ApproxMult.pdf
  26. S. Misailovic, M. Carbin, S. Achour, Z. Qi, M. C. Rinard, 2014, Chisel: reliability- and accuracy-aware optimization of approximate computational kernels, OOPSLA, pp. 309–328, 2014, https://dspace.mit.edu/handle/1721.1/91290
  27. Y. Wang, Y. Qin, D. Deng, J. Wei, Y. Zhou, Y. Fan, T. Chen, H. Sun, L. Liu, S. Wei et al., 2022, A 28nm 27.5 TOPS/W approximate-computing-based transformer processor with asymptotic sparsity speculating and out-of-order computing, in 2022 IEEE International Solid-State Circuits Conference (ISSCC), vol. 65, IEEE, 2022, pp. 1–3, https://ieeexplore.ieee.org/document/9731686
  28. G. Alsuhli, V. Sakellariou, H. Saleh, M. Al-Qutayri, 2023, Number Systems for Deep Neural Network Architectures: A Survey, arXiv preprint, 2023, https://arxiv.org/abs/2307.05035 (Various number systems have approximate multiplication properties.)
  29. Zixuan Ou, Bing Yu, Wenbin Ye, 2023, An Efficient Algorithm-Hardware Co-Design for Radar-Based Fall Detection With Multi-Branch Convolutions, IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 70, no. 4, pp. 1613–1624, 2023, https://ieeexplore.ieee.org/document/10005051
  30. Mahdi Taheri, Mohammad Hasan Ahmadilivani, Maksim Jenihhin, Masoud Daneshtalab, Jaan Raik, 2023, APPRAISER: DNN Fault Resilience Analysis Employing Approximation Errors, 2023 26th International Symposium on Design and Diagnostics of Electronic Circuits and Systems (DDECS), pp. 124–127, 2023, https://ieeexplore.ieee.org/document/10139468, https://arxiv.org/abs/2305.19733
  31. Efstratios Zacharelos, Italo Nunziata, Gerardo Saggese, Antonio G.M. Strollo, Ettore Napoli, 2022, Approximate Recursive Multipliers Using Low Power Building Blocks, IEEE Transactions on Emerging Topics in Computing, vol. 10, no. 3, pp. 1315–1330, 2022, https://ieeexplore.ieee.org/document/9812478
  32. Kah Phooi Seng, Li-Minn Ang, 2022, Embedded Intelligence: State-of-the-Art and Research Challenges, IEEE Access, vol. 10, pp. 59236–59258, 2022, https://ieeexplore.ieee.org/document/9775683, PDF: https://research.usc.edu.au/esploro/outputs/99640278002621
  33. Antonio Giuseppe Maria Strollo, Ettore Napoli, Davide De Caro, Nicola Petra, Gerardo Saggese, Gennaro Di Meo, 2022, Approximate Multipliers Using Static Segmentation: Error Analysis and Improvements, IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 69, no. 6, pp. 2449–2462, 2022, https://ieeexplore.ieee.org/document/9726786
  34. Haroon Waris, Chenghua Wang, Chenyu Xu, Weiqiang Liu, 2022, AxRMs: Approximate Recursive Multipliers Using High-Performance Building Blocks, IEEE Transactions on Emerging Topics in Computing, vol. 10, no. 2, pp. 1229–1235, 2022, https://ieeexplore.ieee.org/document/9483645
  35. Ying Wu, Chuangtao Chen, Weihua Xiao, Xuan Wang, Chenyi Wen, Jie Han, Xunzhao Yin, Weikang Qian, Cheng Zhuo, 2023, A Survey on Approximate Multiplier Designs for Energy Efficiency: From Algorithms to Circuits, ACM Transactions on Design Automation of Electronic Systems, 2023, https://doi.org/10.1145/3610291, https://arxiv.org/abs/2301.12181 (Extensive survey of many approximate multiplication algorithms.)
  36. Sudeh Shirkavand Saleh Abad, Mohammad Hossein Moaiyeri, 2022, Hardware-accuracy trade-offs for error-resilient applications using an ultra-efficient hybrid approximate multiplier, The Journal of Supercomputing, 2022, https://doi.org/10.1007/s11227-022-04789-6
  37. Giorgos Armeniakos, Georgios Zervakis, Dimitrios Soudris, Jörg Henkel, 2023, Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey, ACM Computing Surveys, vol. 55, no. 4, 2023, https://doi.org/10.1145/3527156, https://arxiv.org/abs/2203.08737 (Survey of many approximate techniques in AI including logarithmic and non-logarithmic approximate multiplication.)
  38. X. Jiao, V. Akhlaghi, Yu Jiang, and R. K. Gupta, 2018, Energy-efficient neural networks using approximate computation reuse, Proc. of the 2018 Design, Automation and Test in Europe Conference and Exhibition (DATE) (2018), pp. 1223–1228, https://ieeexplore.ieee.org/document/8342202, PDF: http://wingtecher.com/themes/WingTecherResearch/assets/papers/date18-energy-efficient.pdf (Caches and reuses approximate computations by using Bloom filters, a data structure similar to hashing.)
  39. Ao Ren, Ji Li, Zhe Li, Caiwen Ding, Xuehai Qian, Qinru Qiu, Bo Yuan, Yanzhi Wang, 2017, SC-DCNN: Highly-scalable deep convolutional neural network using stochastic computing, ACM SIGPLAN Notices, vol. 52, no. 4, pp. 405–418, 2017, https://arxiv.org/abs/1611.05939 (Stochastic method with multiplication and addition approximations via AND gates and multiplexers.)

For more research papers on approximate multiplication, see https://www.aussieai.com/research/multiplication#approximate-multiplication.

 
