Aussie AI
Adder Neural Networks
-
Book Excerpt from "Generative AI in C++"
-
by David Spuler, Ph.D.
If multiplication is so bad, can't we just use addition? Yes, we sure can. Cue the “adder” neural networks.
But note that this terminology is not the same thing as “additive models” (or “additive neural networks”), a term often used in the literature to mean something other than arithmetic addition. Generalized Additive Neural Networks (GANNs) are also a distinct concept.
Can we replace the multiplication operation with addition generically, without quantization? That is, can we just change the matrix multiplication C++ code from “*” to “+” and be done? Unsurprisingly, building a “dot product-like operation” out of addition and subtraction is not a new idea. The earliest replacement of multiplication with addition seems to be Ritter and Sussner (1996), and many other papers on “adder” models have followed.
Research papers on adder networks:
- G. Ritter and P. Sussner, 1996, An introduction to morphological neural networks, Proceedings of 13th International Conference on Pattern Recognition (ICPR), vol. 4, pp. 709–717, 1996, https://ieeexplore.ieee.org/abstract/document/547657 (Earliest multiplication-free neural network? Uses add and max.)
- Hongyi Pan, Diaa Badawi, Xi Zhang & Ahmet Enis Cetin, 2019, Additive neural network for forest fire detection, 18 November 2019, https://link.springer.com/article/10.1007/s11760-019-01600-7 PDF: https://repository.bilkent.edu.tr/bitstreams/e1a00ff4-b85d-4cc0-b058-f885785d8eae/download (AddNet uses a multiplication-free operator to create a dot product-like operator based on addition of absolute values and sign bit tests. The neural network must be trained with this non-multiplication operator.)
- Chen H, Wang Y, Xu C, Shi B, Xu C, Tian Q, Xu C., 2020, Addernet: Do we really need multiplications in deep learning?, In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 1468–1477, https://arxiv.org/abs/1912.13200, https://ieeexplore.ieee.org/document/9156624 (Code on GitHub at https://github.com/huaweinoah/AdderNet) (Uses an additive metric, the L1 distance between the weight vector and the input feature, to construct an additive network.)
- Xu, Y.; Xu, C.; Chen, X.; Zhang, W.; Xu, C.; and Wang, Y. 2020. Kernel Based Progressive Distillation for Adder Neural Networks, In NeurIPS. https://proceedings.neurips.cc/paper/2020/hash/912d2b1c7b2826caf99687388d2e8f7c-Abstract.html, PDF: https://proceedings.neurips.cc/paper/2020/file/912d2b1c7b2826caf99687388d2e8f7c-Paper.pdf (Uses the additive L1 distance between vectors, like AdderNet.)
- H. Shu, J. Wang, H. Chen, L. Li, Y. Yang, and Y. Wang, 2021, Adder attention for vision transformer, In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan, editors, NeurIPS - Advances in Neural Information Processing Systems, volume 34, pages 19899–19909, 2021, https://openreview.net/forum?id=5Ld5bRB9jzY, PDF: https://proceedings.neurips.cc/paper/2021/file/a57e8915461b83adefb011530b711704-Paper.pdf, Supplementary PDF: https://openreview.net/attachment?id=5Ld5bRB9jzY&name=supplementary_material
- Yunhe Wang, Mingqiang Huang, Kai Han, Hanting Chen, Wei Zhang, Chunjing Xu, and Dacheng Tao, 2021, Addernet and its minimalist hardware design for energy-efficient artificial intelligence, arXiv preprint arXiv:2101.10015, 2021, https://arxiv.org/abs/2101.10015
- Wenshuo Li; Xinghao Chen; Jinyu Bai; Xuefei Ning; Yunhe Wang, 2022, Searching for energy-efficient hybrid adder-convolution neural networks, IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 19-20 June 2022, https://ieeexplore.ieee.org/document/9857279, PDF: https://openaccess.thecvf.com/content/CVPR2022W/NAS/papers/Li_Searching_for_Energy-Efficient_Hybrid_Adder-Convolution_Neural_Networks_CVPRW_2022_paper.pdf
- Xinghao Chen, Chang Xu, Minjing Dong, Chunjing Xu, and Yunhe Wang, 2021, An empirical study of adder neural networks for object detection, In NeurIPS, 2021, https://arxiv.org/abs/2112.13608
- Dehua Song, Yunhe Wang, Hanting Chen, Chang Xu, Chunjing Xu, and DaCheng Tao. 2021, Addersr: Towards energy efficient image super-resolution, In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15648–15657, 2021, https://arxiv.org/abs/2009.08891
- C Liu, C Zhao, H Wu, X Han, S Li, 2022, Addlight: An energy-saving adder neural network for cucumber disease classification, Agriculture, 2022, 12(4), 452, https://doi.org/10.3390/agriculture12040452, https://www.mdpi.com/2077-0472/12/4/452
- Hanting Chen, Yunhe Wang, Chang Xu, Chao Xu, Chunjing Xu, Tong Zhang, 2021, Universal Adder Neural Networks, May 2021, https://arxiv.org/abs/2105.14202
- GuoRong Cai, ShengMing Yang, Jing Du, ZongYue Wang, Bin Huang, Yin Guan, SongJian Su, JinHe Su & SongZhi Su, 2021, Convolution without multiplication: A general speed up strategy for CNNs, Science China Technological Sciences, volume 64, pages 2627–2639 (2021), https://link.springer.com/article/10.1007/s11431-021-1936-2
- Lulan Shen; Maryam Ziaeefard; Brett Meyer; Warren Gross; James J. Clark, 2022, Conjugate Adder Net (CAddNet) - A Space-Efficient Approximate CNN, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), https://ieeexplore.ieee.org/abstract/document/9857393, PDF: https://openaccess.thecvf.com/content/CVPR2022W/ECV/papers/Shen_Conjugate_Adder_Net_CAddNet_-_A_Space-Efficient_Approximate_CNN_CVPRW_2022_paper.pdf
- A. Afrasiyabi, O. Yildiz, B. Nasir, F. T. Yarman-Vural, and A. E. Çetin. 2017, Energy saving additive neural network, CoRR, abs/1702.02676, 2017, https://arxiv.org/abs/1702.02676 (Uses sum of absolute values instead of multiplication.)
- Martin Hardieck; Tobias Habermann; Fabian Wagner; Michael Mecik; Martin Kumm; Peter Zipf, 2023, More AddNet: A deeper insight into DNNs using FPGA-optimized multipliers, 2023 IEEE International Symposium on Circuits and Systems (ISCAS), https://ieeexplore.ieee.org/abstract/document/10181827/
- Y Zhang, B Sun, W Jiang, Y Ha, M Hu, 2022, WSQ-AdderNet: Efficient Weight Standardization based Quantized AdderNet FPGA Accelerator Design with High-Density INT8 DSP-LUT Co-Packing Optimization, 2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD), https://ieeexplore.ieee.org/document/10069557
For more research papers on adder networks, see https://www.aussieai.com/research/zero-multiplication#add.