Aussie AI
Look-up Tables
Last Updated 20 September, 2024
by David Spuler, Ph.D.
Look-up tables (LUTs) are a simple, well-known data structure for optimizing code, and they have been used to optimize neural networks in various ways. Some examples include:
- Zero-multiplier networks: look-up tables have been used instead of multiplication to create Zero-Multiplication Models (see the first sketch after this list).
- Approximation algorithms: table lookups of precomputed values also speed up various non-linear operations; see Approximation Optimizations (and the second sketch after this list).
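To make the zero-multiplication idea concrete, here is a minimal C++ sketch (not taken from any of the papers below) that replaces every 8-bit integer multiply in a dot product with a read from a precomputed 256x256 product table. Practical LUT kernels such as LUT-GEMM or T-MAC use far more compact per-weight tables, but the principle is the same: pay for the multiplications once, up front.

```cpp
#include <cstdint>
#include <cstdio>

// All 65,536 possible products of two unsigned 8-bit values,
// computed once so the inner loop never multiplies.
static int32_t g_product[256][256];

void init_product_table() {
    for (int a = 0; a < 256; ++a)
        for (int b = 0; b < 256; ++b)
            g_product[a][b] = a * b;
}

// Multiplication-free dot product: only additions and table reads.
int32_t dot_lut(const uint8_t* x, const uint8_t* w, int n) {
    int32_t sum = 0;
    for (int i = 0; i < n; ++i)
        sum += g_product[x[i]][w[i]];
    return sum;
}

int main() {
    init_product_table();
    uint8_t x[4] = {1, 2, 3, 4};
    uint8_t w[4] = {5, 6, 7, 8};
    printf("dot = %d\n", dot_lut(x, w, 4));  // 1*5 + 2*6 + 3*7 + 4*8 = 70
}
```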
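For the approximation case, here is a similar minimal sketch that precomputes a sigmoid activation into a table; the table size (1024 entries) and the clamped input range [-8, +8] are illustrative assumptions, not values from any particular paper. The expensive exponential calls all happen once at startup, and each later activation costs one clamp, one scale, and one array read.

```cpp
#include <cmath>
#include <cstdio>

constexpr int kTableSize = 1024;
constexpr float kMinX = -8.0f;
constexpr float kMaxX = 8.0f;
static float g_sigmoid_table[kTableSize];

// Precompute once at startup; the expensive exp calls happen here.
void init_sigmoid_table() {
    for (int i = 0; i < kTableSize; ++i) {
        float x = kMinX + (kMaxX - kMinX) * i / (kTableSize - 1);
        g_sigmoid_table[i] = 1.0f / (1.0f + std::exp(-x));
    }
}

// Fast approximate sigmoid: clamp, scale to an index, one array read.
float sigmoid_lut(float x) {
    if (x <= kMinX) return g_sigmoid_table[0];
    if (x >= kMaxX) return g_sigmoid_table[kTableSize - 1];
    int idx = (int)((x - kMinX) * (kTableSize - 1) / (kMaxX - kMinX));
    return g_sigmoid_table[idx];
}

int main() {
    init_sigmoid_table();
    printf("sigmoid(1.5) ~ %f (exact %f)\n",
           sigmoid_lut(1.5f), 1.0 / (1.0 + std::exp(-1.5)));
}
```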
Some of the more advanced data structures that extend look-up tables include hash tables; see Hashing under More AI Research below.
Research Papers on Lookup Tables (LUTs)
Research papers include:
- A. Zhou, A. Yao, Y. Guo, L. Xu, and Y. Chen, 2017, Incremental network quantization: Towards lossless CNNs with low-precision weights, arXiv preprint arXiv:1702.03044, https://arxiv.org/abs/1702.03044 (Bitshifting.)
- S. Fanning, 2018, Fixed Point Multiplication-Free Implementation of Deep Neural Networks for Embedded Systems, Master's Thesis, School of Electrical and Electronic Engineering, University College Dublin, https://seanfanning.eu/posts/projects/low-bitwidth-neural-networks/Thesis_SeanFanning_13360951.pdf
- Mohammad Samragh Razlighi, Mohsen Imani, Farinaz Koushanfar, Tajana Rosing, LookNN: Neural network with no multiplication, Design, Automation & Test in Europe Conference & Exhibition (DATE), 27-31 March 2017, https://ieeexplore.ieee.org/document/7927280 (Lookup-table based multiplication.)
- Covell M, Marwood D, Baluja S, Johnston N., Table-based neural units: Fully quantizing networks for multiply-free inference, 2019, arXiv preprint arXiv:1906.04798, http://arxiv.org/abs/1906.04798
- E Yvinec, A Dapogny, K Bailly, Sep 2023, Network Memory Footprint Compression Through Jointly Learnable Codebooks and Mappings, arXiv preprint arXiv:2309.17361, https://arxiv.org/abs/2309.17361 (Uses "codebooks", i.e. look-up tables, to reduce memory usage.)
- Song Han, Jeff Pool, John Tran, and William Dally, 2015, Learning both weights and connections for efficient neural network, Advances in neural information processing systems, 28, 2015, https://arxiv.org/abs/1506.02626
- Gunho Park, Baeseong Park, Minsub Kim, Sungjae Lee, Jeonghoon Kim, Beomseok Kwon, Se Jung Kwon, Byeongwook Kim, Youngjoo Lee, Dongsoo Lee, Apr 2023, LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models https://arxiv.org/abs/2206.09557
- Joonsang Yu, Junki Park, Seongmin Park, Minsoo Kim, Sihwa Lee, Dong Hyun Lee, Jungwook Choi, Dec 2021, NN-LUT: Neural Approximation of Non-Linear Operations for Efficient Transformer Inference, https://arxiv.org/pdf/2112.02191
- Tae Jun Ham, Sung Jun Jung, Seonghak Kim, Young H Oh, Yeonhong Park, Yoonho Song, Jung-Hun Park, Sanghee Lee, Kyoung Park, Jae W Lee, et al. A^3: Accelerating attention mechanisms in neural networks with approximation. In 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA), pages 328–341. IEEE, 2020. https://arxiv.org/abs/2002.10941
- Neelesh Gupta, Narayanan Kannan, Pengmiao Zhang, Viktor Prasanna, 8 Apr 2024, TabConv: Low-Computation CNN Inference via Table Lookups, https://arxiv.org/abs/2404.05872
- Darshan C. Ganji, Saad Ashfaq, Ehsan Saboori, Sudhakar Sah, Saptarshi Mitra, MohammadHossein AskariHemmat, Alexander Hoffman, Ahmed Hassanien, Mathieu Léonardon, 18 Apr 2023, DeepGEMM: Accelerated Ultra Low-Precision Inference on CPU Architectures using Lookup Tables, https://arxiv.org/abs/2304.09049
- Xiaohu Tang, Yang Wang, Ting Cao, Li Lyna Zhang, Qi Chen, Deng Cai, Yunxin Liu, Mao Yang, 6 Sep 2023 (v2), LUT-NN: Empower Efficient Neural Network Inference with Centroid Learning and Table Lookup, https://arxiv.org/abs/2302.03213
- Grigor Gatchev, Valentin Mollov, 4 Apr 2021, Faster Convolution Inference Through Using Pre-Calculated Lookup Tables, https://arxiv.org/abs/2104.01681
- NM Ho, DT Nguyen, JL Gustafson, WF Wong, 2023, Bedot: Bit Efficient Dot Product for Deep Generative Models, CoNGA 2023: Next Generation Arithmetic, pp. 19–37, https://link.springer.com/chapter/10.1007/978-3-031-32180-1_2 https://www.comp.nus.edu.sg/~wongwf/papers/CONGA23-Bedot.pdf
- David Spuler, March 2024, Chapter 35. Lookup Tables & Precomputation, Generative AI in C++: Coding Transformers and LLMs, https://www.amazon.com/dp/B0CXJKCWX9
- H. Bagherinezhad, M. Rastegari, and A. Farhadi, 2016, LCNN: Lookup-based convolutional neural network, arXiv preprint arXiv:1611.06473, https://arxiv.org/abs/1611.06473
- Tianyi Zhang, Jonah Wonkyu Yi, Bowen Yao, Zhaozhuo Xu, Anshumali Shrivastava, 2 Mar 2024, NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention, https://arxiv.org/abs/2403.01273 Code: https://github.com/tonyzhang617/nomad-dist (Converts 4-bit vector dot products to using SIMD registers as lookup tables on CPUs.)
- Wang, J., Chen, K., Chen, G., Shou, L., McAuley, J.: Skipbert: Efficient inference with shallow layer skipping. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 7287–7301 (2022) https://aclanthology.org/2022.acl-long.503/ (Skips early layers of a model via precomputed lookup tables based on detecting known token n-grams in the prompt.)
- Jianyu Wei, Shijie Cao, Ting Cao, Lingxiao Ma, Lei Wang, Yanyong Zhang, Mao Yang, 25 Jun 2024, T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge, https://arxiv.org/abs/2407.00088 Code: https://github.com/microsoft/T-MAC (Table lookup for low-bit quantization on CPUs.)
- Han Guo, William Brandon, Radostin Cholakov, Jonathan Ragan-Kelley, Eric P. Xing, Yoon Kim, 15 Jul 2024, Fast Matrix Multiplications for Lookup Table-Quantized LLMs, https://arxiv.org/abs/2407.10960
- Beom Jin Kang, Hae In Lee, Seok Kyu Yoon, Young Chan Kim, Sang Beom Jeong, Seong Jun O, Hyun Kim, October 2024, A survey of FPGA and ASIC designs for transformer inference acceleration and optimization, Journal of Systems Architecture, Volume 155, 103247, https://www.sciencedirect.com/science/article/abs/pii/S138376212400184X
- Zhiwen Mo, Lei Wang, Jianyu Wei, Zhichen Zeng, Shijie Cao, Lingxiao Ma, Naifeng Jing, Ting Cao, Jilong Xue, Fan Yang, Mao Yang, 12 Aug 2024, LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration, https://arxiv.org/abs/2408.06003 (Lookup tables for mixed-precision MatMul/GEMM kernels using low-bit quantization mixed with full precision.)
- Fangzhou He, Ke Ding, Dingjiang Yan, Jie Li, Jiajun Wang, Mingzhe Chen, 2024, A Novel Quantization and Model Compression Approach for Hardware Accelerators in Edge Computing, https://cdn.techscience.cn/files/cmc/2024/TSP_CMC-80-2/TSP_CMC_53632/TSP_CMC_53632.pdf (Power-of-two quantization with bitshifting further accelerated with LUTs.)
- Zachary Susskind, 2024, Weightless Neural Networks for Fast, Low-Energy Inference, Ph.D. Dissertation, The University of Texas at Austin, https://zsknd.com/dissertation_final.pdf
- J. Li, C. Chen, Z. Cheng and Z. Xiong, 2024, Toward DNN of LUTs: Learning Efficient Image Restoration with Multiple Look-Up Tables, IEEE Transactions on Pattern Analysis and Machine Intelligence, doi: 10.1109/TPAMI.2024.3401048, https://ieeexplore.ieee.org/abstract/document/10530442 Code: https://github.com/ddlee-cn/MuLUT
- David Spuler, March 2024, Lookup Table Precomputation, in Generative AI in C++, https://www.aussieai.com/book/ch13-lookup-table-precomputation
- SZ Lin, YC Chen, YH Chang, TW Kuo, HP Li, 2024, LUTIN: Efficient Neural Network Inference with Table Lookup, ISLPED ’24, August 5-7, 2024, Newport Beach, CA, USA, https://dl.acm.org/doi/pdf/10.1145/3665314.3670804
- Davis Blalock, John Guttag, 21 Jun 2021, Multiplying Matrices Without Multiplying, https://arxiv.org/abs/2106.10860
- Q. Deng, Y. Zhang, M. Zhang and J. Yang, "LAcc: Exploiting Lookup Table-based Fast and Accurate Vector Multiplication in DRAM-based CNN Accelerator," 2019 56th ACM/IEEE Design Automation Conference (DAC), Las Vegas, NV, USA, 2019, pp. 1-6. https://ieeexplore.ieee.org/document/8806810 PDF: https://dl.acm.org/doi/pdf/10.1145/3316781.3317845
- Yongkweon Jeon, Baeseong Park, Se Jung Kwon, Byeongwook Kim, Jeongin Yun, Dongsoo Lee, 31 Aug 2020 (v2), BiQGEMM: Matrix Multiplication with Lookup Table For Binary-Coding-based Quantized DNNs, https://arxiv.org/abs/2005.09904
- Duy-Thanh Nguyen, Abhiroop Bhattacharjee, Abhishek Moitra, Priyadarshini Panda, 9 Feb 2023, DeepCAM: A Fully CAM-based Inference Accelerator with Variable Hash Lengths for Energy-efficient Deep Neural Networks, https://arxiv.org/abs/2302.04712
- Jie Ran, Rui Lin, Jason Chun Lok Li, Jiajun Zhou, Ngai Wong, 13 Aug 2022, PECAN: A Product-Quantized Content Addressable Memory Network, https://arxiv.org/abs/2208.13571
- S. Nag et al., "LogicNets vs. ULEEN: Comparing two novel high throughput edge ML inference techniques on FPGA," 2024 IEEE 67th International Midwest Symposium on Circuits and Systems (MWSCAS), Springfield, MA, USA, 2024, pp. 1206-1211, doi: 10.1109/MWSCAS60917.2024.10658913. https://ieeexplore.ieee.org/abstract/document/10658913/ (Analyzing two lookup-table methods.)
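A recurring pattern in several of the papers above (e.g., LUT-NN's centroid learning, and Blalock & Guttag's "Multiplying Matrices Without Multiplying") is codebook-based lookup: weight subvectors are stored as small codes indexing a learned codebook, the input's dot product with every centroid is precomputed once, and the matrix multiplication then reduces to table lookups and additions. Below is a minimal C++ sketch of that pattern with toy sizes and made-up centroid values; it is an illustration of the general idea, not any specific paper's algorithm.

```cpp
#include <cstdio>

constexpr int kCodes = 4;   // tiny codebook: 2-bit codes, 4 centroids
constexpr int kSubDim = 2;  // each code stands for a 2-float subvector

int main() {
    // A "learned" codebook of centroids (made-up values).
    float centroids[kCodes][kSubDim] = {
        {0.0f, 0.0f}, {1.0f, 0.0f}, {0.0f, 1.0f}, {1.0f, 1.0f}};

    // A 4-float weight vector stored only as two codebook indices:
    // codes {1, 3} decode to w = [1,0, 1,1].
    unsigned char codes[2] = {1, 3};
    float x[4] = {2.0f, 3.0f, 4.0f, 5.0f};  // input vector

    // Step 1: per subspace, precompute x_sub . centroid for every code.
    // These few multiplies are then shared by every weight row.
    float lut[2][kCodes];
    for (int s = 0; s < 2; ++s)
        for (int c = 0; c < kCodes; ++c)
            lut[s][c] = centroids[c][0] * x[s * kSubDim]
                      + centroids[c][1] * x[s * kSubDim + 1];

    // Step 2: each dot product is now just lookups and additions.
    float dot = lut[0][codes[0]] + lut[1][codes[1]];
    printf("dot = %f\n", dot);  // [1,0,1,1].[2,3,4,5] = 2 + 4 + 5 = 11
}
```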
More AI Research
Read more about:
- Hashing
- Zero-Multiplication Models
- Matrix Algebra
- Logarithmic Models
- Approximate Computing
- Inference Optimizations
- Loop Optimizations
- Code Optimizations