Aussie AI

Approximate Computing for Faster AI

  • Last Updated 17 November, 2025
  • by David Spuler, Ph.D.

Approximate computing is a longstanding technique in many areas of Computer Science that trades a small loss of accuracy for improved speed. The idea has recently garnered much interest in the AI research community, with many papers. Approximation research targets both low-level arithmetic (i.e., the multiplication bottleneck) and higher-level whole model components.

Approximate Multiplication. Multiplication can be sped up using approximate algorithms in software and/or hardware. The main areas where approximate arithmetic can improve model inference are covered in the sections below.

Approximate Components. Some higher-level Transformer components are also candidates for acceleration via approximation; see the sections on approximate Transformer components below.

Approximate Multipliers for Faster Model Inference

There has been an explosion of papers on approximate multiplication algorithms and their use in model inference and training. For analysis of low-level approximate multiplication algorithms and their theory, including logarithmic approximate multiplication and non-logarithmic approximate multiplication, see advanced AI mathematics. Also related are the Logarithmic Number System (LNS) and other less common number systems such as dyadic numbers, the Residue Number System (RNS), and the Posit Number System (PNS); see advanced number systems. See also additive neural networks and multiplier-free inference.
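
As a concrete illustration of the core technique, below is a minimal C++ sketch of Mitchell's approximate logarithmic multiplication for unsigned integers: take the approximate base-2 logarithms (leading-bit position plus the remaining bits read as a fraction), add them, and take the approximate antilogarithm. The Q16 fixed-point layout and function names are illustrative choices, not taken from any particular paper.

    #include <cstdint>

    const int FRAC_BITS = 16;  // Q16 fixed point for the log domain

    // Mitchell's log2 approximation: k + m, where k is the index of the
    // leading 1-bit and m is the remaining bits read as a fraction in [0,1).
    uint32_t approx_log2(uint32_t x) {
        int k = 31 - __builtin_clz(x);            // GCC/Clang builtin
        uint32_t mantissa = x - (1u << k);        // bits below the leading 1
        uint32_t frac = (k >= FRAC_BITS) ? (mantissa >> (k - FRAC_BITS))
                                         : (mantissa << (FRAC_BITS - k));
        return ((uint32_t)k << FRAC_BITS) | frac;
    }

    // Approximate antilog: 2^(k + m) is approximated as (1 + m) * 2^k.
    uint64_t approx_pow2(uint32_t log_val) {
        int k = log_val >> FRAC_BITS;
        uint64_t one_plus_m = (1u << FRAC_BITS) | (log_val & ((1u << FRAC_BITS) - 1));
        return (k >= FRAC_BITS) ? (one_plus_m << (k - FRAC_BITS))
                                : (one_plus_m >> (FRAC_BITS - k));
    }

    // Multiply by adding logs: always underestimates, worst case about -11%.
    uint64_t mitchell_mul(uint32_t a, uint32_t b) {
        if (a == 0 || b == 0) return 0;           // log(0) is undefined
        return approx_pow2(approx_log2(a) + approx_log2(b));
    }

No hardware multiplier is needed: the whole computation is shifts and adds, which is why this family of algorithms keeps reappearing in the accelerator papers below.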

AI Approximate Multiplication Research: Papers focused on the specific use of approximate multiplication algorithms for neural networks and Transformers include:

  • S. S. Sarwar, S. Venkataramani et al., “Energy-efficient neural computing with approximate multipliers,” J. Emerg. Technol. Comput. Syst., vol. 14, no. 2, pp. 16:1–16:23, Jul. 2018, https://dl.acm.org/doi/10.1145/3097264
  • Q. Zhang, T. Wang, Y. Tian, F. Yuan, and Q. Xu, “Approxann: An approximate computing framework for artificial neural network,” in DATE’15, March 2015, pp. 701–706, https://ieeexplore.ieee.org/document/7092478
  • M. A. Hanif, R. Hafiz, and M. Shafique, Error resilience analysis for systematically employing approximate computing in convolutional neural networks, Design, Automation and Test in Europe Conference and Exhibition (DATE), 2018, IEEE (2018), pp. 913–916, https://ieeexplore.ieee.org/document/8342139
  • M. A. Hanif, A. Marchisio et al., “X-DNNs: Systematic cross-layer approximations for energy-efficient deep neural networks,” Journal of Low Power Electronics, vol. 14, no. 4, pp. 520–534, Dec. 2018. https://www.semanticscholar.org/paper/X-DNNs:-Systematic-Cross-Layer-Approximations-for-Hanif-Marchisio/5ddaf1aff7d5a4a3484963849828c8d2d1315bc3
  • V. Mrazek, S. S. Sarwar, L. Sekanina, Z. Vasicek, and K. Roy, Design of power-efficient approximate multipliers for approximate artificial neural networks, 2016 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), November (2016), pp. 1–7, https://ieeexplore.ieee.org/document/7827658
  • S. Kim, P. Howe, T. Moreau, A. Alaghi, L. Ceze, and V. Sathe, MATIC: Learning Around Errors for Efficient Low-Voltage Neural Network Accelerators, Design, Automation and Test in Europe Conference and Exhibition (DATE), 2018, IEEE (2018), pp. 1–6, https://arxiv.org/abs/1706.04332
  • S. De, J. Huisken, and H. Corporaal, “Designing energy efficient approximate multipliers for neural acceleration,” in 2018 21st Euromicro Conference on Digital System Design (DSD). IEEE, 2018, pp. 288–295, https://ieeexplore.ieee.org/document/8491830
  • X. He, L. Ke, W. Lu, G. Yan, and X. Zhang, Axtrain: Hardware-oriented neural network training for approximate inference. arXiv preprint arXiv:1805.08309 (2018), https://arxiv.org/abs/1805.08309v1
  • P. Gysel, J. Pimentel et al., “Ristretto: A framework for empirical study of resource-efficient inference in convolutional neural networks,” IEEE Trans. Neural Netw. Learn. Syst., 2018, https://ieeexplore.ieee.org/abstract/document/8318896
  • Min Soo Kim; Alberto A. Del Barrio; Leonardo Tavares Oliveira; Román Hermida; Nader Bagherzadeh, "Efficient Mitchell’s Approximate Log Multipliers for Convolutional Neural Networks", IEEE Transactions on Computers, Volume 68 Issue 5, p.660-675, November 2018, https://ieeexplore.ieee.org/abstract/document/8532287
  • T. Mogami, Deep neural network training without multiplications, In Beyond BackPropagation WS at 34th Conference on Neural Information Processing Systems, 2020, https://arxiv.org/abs/2012.03458 (Multiplies floating-point numbers using integer addition, via Mitchell's approximate multiplication; see the sketch after this list.)
  • Lingyun Yao, Martin Trapp, Karthekeyan Periasamy, Jelin Leslin, Gaurav Singh, Martin Andraud, June 2023, Logarithm-Approximate Floating-Point Multiplier for Hardware-efficient Inference in Probabilistic Circuits, Proceedings of The 6th Workshop on Tractable Probabilistic Modeling, https://openreview.net/forum?id=WL7YDLOLfK, PDF: https://openreview.net/pdf?id=WL7YDLOLfK (Probabilistic speed improvement; uses Mogami's approximate multiplier.)
  • T. Hokchhay, S. Hashemi, R. I. Bahar, and S. Reda, “Hardware-software codesign of accurate, multiplier-free deep neural networks,” in Proc. 54th Annu. Design Autom. Conf. (DAC), 2017, pp. 1–6, https://arxiv.org/abs/1705.04288
  • U. Lotric and P. Bulic, "Applicability of approximate multipliers in hardware neural networks," Neurocomput., vol. 96, pp. 57–65, Nov. 2012, https://dl.acm.org/doi/10.1016/j.neucom.2011.09.039
  • Z. Du, K. Palem, A. Lingamneni, O. Temam, Y. Chen, and C. Wu, "Leveraging the error resilience of machine-learning applications for designing highly energy efficient accelerators," in Proc. 19th Asia South Pacific Des. Autom. Conf., 2014, pp. 201–206, https://pages.saclay.inria.fr/olivier.temam/files/eval/DLCPTW2014.pdf
  • S. S. Sarwar, S. Venkataramani, A. Raghunathan, and K. Roy, "Multiplier-less artificial neurons exploiting error resiliency for energy-efficient neural computing," in Proc. Des. Autom. Test Eur. Conf. Exhib., 2016, pp. 145–150, https://arxiv.org/abs/1602.08557
  • J. Choi and S. Venkataramani, Approximate Computing Techniques for Deep Neural Networks. Cham: Springer, 2019, pp. 307–329, Chapter 15, https://link.springer.com/chapter/10.1007/978-3-319-99322-5_15
  • M. S. Ansari, V. Mrazek, B. F. Cockburn, L. Sekanina, Z. Vasicek, and J. Han, 2019, “Improving the accuracy and hardware efficiency of neural networks using approximate multipliers,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 28, no. 2, pp. 317–328, Oct 2019, https://ieeexplore.ieee.org/document/8863138
  • Biyanu Zerom, Mohammed Tolba, Huruy Tesfai, Hani Saleh, Mahmoud Al-Qutayri, Thanos Stouraitis, Baker Mohammad, Ghada Alsuhli, 2022, Approximate Logarithmic Multiplier For Convolutional Neural Network Inference With Computational Reuse, 2022 29th IEEE International Conference on Electronics, Circuits and Systems (ICECS), 24-26 October 2022, https://doi.org/10.1109/ICECS202256217.2022.9970861, https://ieeexplore.ieee.org/abstract/document/9970861/
  • M. S. Ansari, B. F. Cockburn, and J. Han, 2020, “An improved logarithmic multiplier for energy-efficient neural computing,” IEEE Transactions on Computers, vol. 70, no. 4, pp. 614–625, May 2020. https://ieeexplore.ieee.org/document/9086744
  • Tso-Bing Juang; Cong-Yi Lin; Guan-Zhong Lin, 2018, “Area-delay product efficient design for convolutional neural network circuits using logarithmic number systems,” in International SoC Design Conference (ISOCC). IEEE, 2018, pp. 170–171, https://ieeexplore.ieee.org/abstract/document/8649961
  • Ourania Spantidi, Iraklis Anagnostopoulos, "The Perfect Match: Selecting Approximate Multipliers for Energy-Efficient Neural Network Inference", 2023 IEEE 24th International Conference on High Performance Switching and Routing (HPSR), pp.27-32, 2023. https://ieeexplore.ieee.org/document/10147918
  • O. Spantidi, G. Zervakis, I. Anagnostopoulos, H. Amrouch and J. Henkel, "Positive/Negative Approximate Multipliers for DNN Accelerators", arXiv preprint arXiv:2107.09366, 2021. https://arxiv.org/abs/2107.09366 (Approximate multiplication for DNNs without needing retraining.)
  • Vojtech Mrazek, "Approximation of Hardware Accelerators driven by Machine-Learning Models: (Embedded Tutorial)", 2023 26th International Symposium on Design and Diagnostics of Electronic Circuits and Systems (DDECS), pp.91-92, 2023. https://ieeexplore.ieee.org/document/10139484
  • Michal Pinos, Vojtech Mrazek, Filip Vaverka, Zdenek Vasicek, Lukas Sekanina, "Acceleration Techniques for Automated Design of Approximate Convolutional Neural Networks", IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol.13, no.1, pp.212-224, 2023. https://ieeexplore.ieee.org/document/10011413
  • Mohammad Hasan Ahmadilivani, Mario Barbareschi, Salvatore Barone, Alberto Bosio, Masoud Daneshtalab, Salvatore Della Torca, Gabriele Gavarini, Maksim Jenihhin, Jaan Raik, Annachiara Ruospo, Ernesto Sanchez, Mahdi Taheri, "Special Session: Approximation and Fault Resiliency of DNN Accelerators", 2023 IEEE 41st VLSI Test Symposium (VTS), pp.1-10, 2023. https://ieeexplore.ieee.org/document/10140043
  • Zahra Ebrahimi, Muhammad Zaid, Mark Wijtvliet, Akash Kumar, "RAPID: Approximate Pipelined Soft Multipliers and Dividers for High Throughput and Energy Efficiency", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol.42, no.3, pp.712-725, 2023. https://ieeexplore.ieee.org/document/9802734
  • U. Anil Kumar, Pavankumar Bikki, Sreehari Veeramachaneni, Syed Ershad Ahmed, "Power Efficient Approximate Multiplier Architectures for Error Resilient Applications", 2022 IEEE 19th India Council International Conference (INDICON), pp.1-5, 2022. https://ieeexplore.ieee.org/document/10039748
  • Qiao Shen, Renyuan Zhang, Hao Zhang, Hao Cai, Bo Liu, Jian Xiao, "A CGP-based Efficient Approximate Multiplier with Error Compensation", 2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA), pp.48-49, 2022. https://ieeexplore.ieee.org/document/9963083
  • Siyuan Liang, Ke Chen, Bi Wu, Weiqiang Liu, "A Survey of Approximation based Hardware Acceleration Techniques for Deep Neural Networks (Invited)", 2022 IEEE 16th International Conference on Solid-State & Integrated Circuit Technology (ICSICT), pp.1-4, 2022. https://ieeexplore.ieee.org/document/9963257
  • Zhen Li, Su Zheng, Jide Zhang, Yao Lu, Jingbo Gao, Jun Tao, Lingli Wang, "Adaptable Approximate Multiplier Design Based on Input Distribution and Polarity", IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol.30, no.12, pp.1813-1826, 2022. https://ieeexplore.ieee.org/document/9861394
  • Ourania Spantidi, Georgios Zervakis, Iraklis Anagnostopoulos, Jörg Henkel, "Energy-Efficient DNN Inference on Approximate Accelerators Through Formal Property Exploration", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol.41, no.11, pp.3838-3849, 2022. https://ieeexplore.ieee.org/document/9852790
  • Ourania Spantidi, Iraklis Anagnostopoulos, "How much is too much error? Analyzing the impact of approximate multipliers on DNNs", 2022 23rd International Symposium on Quality Electronic Design (ISQED), pp.1-6, 2022. https://ieeexplore.ieee.org/document/9806282
  • Hao Zhang, Seok-Bum Ko, "Variable-Precision Approximate Floating-Point Multiplier for Efficient Deep Learning Computation", IEEE Transactions on Circuits and Systems II: Express Briefs, vol.69, no.5, pp.2503-2507, 2022. https://ieeexplore.ieee.org/document/9739768
  • S Raghuram, N Shashank, "Approximate Adders for Deep Neural Network Accelerators", 2022 35th International Conference on VLSI Design and 2022 21st International Conference on Embedded Systems (VLSID), pp.210-215, 2022. https://ieeexplore.ieee.org/document/9885998
  • Georgios Zervakis, Iraklis Anagnostopoulos, Sami Salamin, Ourania Spantidi, Isai Roman-Ballesteros, Jörg Henkel, Hussam Amrouch, "Thermal-Aware Design for Approximate DNN Accelerators", IEEE Transactions on Computers, vol.71, no.10, pp.2687-2697, 2022. https://ieeexplore.ieee.org/document/9672753
  • Tao Li, Yitao Ma, Ko Yoshikawa, Osamu Nomura, Tetsuo Endoh, "Energy-Efficient Convolution Module With Flexible Bit-Adjustment Method and ADC Multiplier Architecture for Industrial IoT", IEEE Transactions on Industrial Informatics, vol.18, no.5, pp.3055-3065, 2022. https://ieeexplore.ieee.org/document/9519513
  • Tong Li, Hong-Lan Jiang, Hai Mo, Jie Han, Lei-Bo Liu, Zhi-Gang Mao, "Approximate Processing Element Design and Analysis for the Implementation of CNN Accelerators", Journal of Computer Science and Technology, vol.38, no.2, pp.309, 2023. https://doi.org/10.1007/s11390-023-2548-8
  • M. Esmali Nojehdeh, L. Aksoy, M. Altun, Efficient hardware implementation of artificial neural networks using approximate multiply-accumulate blocks, in 2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) (2020), pp. 96–101, https://ieeexplore.ieee.org/document/9154973
  • Giorgos Armeniakos, Georgios Zervakis, Dimitrios Soudris, Jörg Henkel, "Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey", ACM Computing Surveys, vol.55, no.4, pp.1, 2023. https://doi.org/10.1145/3527156, https://arxiv.org/abs/2203.08737 (Survey of many approximate techniques in AI.)
  • Anjankar, S., Hemant Gillurkar, Joshi, P., & Dwaramwar, P. (2022). Design and Analysis of Multipliers for DNN application using approximate 4:2 Compressors. International Journal of Next-Generation Computing, 13(5). https://doi.org/10.47164/ijngc.v13i5.918, https://ijngc.perpetualinnovation.net/index.php/ijngc/article/view/918
  • Hao Zhang, Mohammadreza Asadikouhanjani, Jie Han, Deivalakshmi Subbian, Seok-Bum Ko, "Approximate Computing for Efficient Neural Network Computation: A Survey", In: Approximate Computing, Editors: Weiqiang Liu, Fabrizio Lombardi, pp.397, 2022. https://doi.org/10.1007/978-3-030-98347-5_16, Amazon: https://www.amazon.com/Approximate-Computing-Weiqiang-Liu-ebook/dp/B0BBKR65SB/
  • Sudeh Shirkavand Saleh Abad, Mohammad Hossein Moaiyeri, "A Hardware- and Accuracy-Efficient Approximate Multiplier with Error Compensation for Neural Network and Image Processing Applications", Circuits, Systems, and Signal Processing, vol.41, no.12, pp.7057, 2022. https://doi.org/10.1007/s00034-022-02110-7
  • Cecilia De la Parra, Andre Guntoro, Akash Kumar, Efficient Accuracy Recovery in Approximate Neural Networks by Systematic Error Modelling, ASPDAC '21: Proceedings of the 26th Asia and South Pacific Design Automation Conference, January 2021, Pages 365–371, https://doi.org/10.1145/3394885.3431533, https://dl.acm.org/doi/10.1145/3394885.3431533
  • Issam Hammad; Kamal El-Sankary; Jason Gu, 2019, Deep Learning Training with Simulated Approximate Multipliers. In 2019 IEEE International Conference on Robotics and Biomimetics (ROBIO), https://ieeexplore.ieee.org/abstract/document/8961780
  • Issam Hammad and Kamal El-Sankary. 2018. Impact of Approximate Multipliers on VGG Deep Learning Network. IEEE Access (2018). https://ieeexplore.ieee.org/document/8488463
  • Vojtech Mrazek, Zdenek Vasícek, Lukás Sekanina, Muhammad Abdullah Hanif, and Muhammad Shafique. 2019. ALWANN: Automatic Layer-Wise Approximation of Deep Neural Network Accelerators without Retraining. ICCAD '19 (2019) https://arxiv.org/abs/1907.07229
  • Michal Pinos, Vojtech Mrazek, and Lukás Sekanina. 2021. Evolutionary Neural Architecture Search Supporting Approximate Multipliers. In Genetic Programming-24th European Conference, EuroGP 2021, Virtual Event, April 7--9, 2021. https://arxiv.org/abs/2101.11883
  • Cecilia De la Parra, Andre Guntoro, and Akash Kumar. 2020. ProxSim: GPU-based Simulation Framework for Cross-Layer Approximate DNN Optimization. In 2020 Design, Automation & Test in Europe Conference & Exhibition, DATE 2020, Grenoble, France, March 9--13, 2020. https://ieeexplore.ieee.org/abstract/document/9116476, PDF: https://cfaed.tu-dresden.de/files/Images/people/chair-pd/Papers/date_framework.pdf
  • Cecilia De la Parra, Andre Guntoro, and Akash Kumar. 2020. Full Approximation of Deep Neural Networks through Efficient Optimization. In IEEE International Symposium on Circuits and Systems, ISCAS 2020, Sevilla, Spain, October 10--21, 2020 https://ieeexplore.ieee.org/document/9181236 (Evaluates over 400 different approximate multipliers.)
  • Min Soo Kim; Alberto A. Del Barrio; Román Hermida; Nader Bagherzadeh, 2018, “Low-power implementation of Mitchell’s approximate logarithmic multiplication for convolutional neural networks,” in Asia and South Pacific Design Automation Conference (ASP-DAC). IEEE, 2018, pp. 617–622. https://ieeexplore.ieee.org/document/8297391
  • U. Lotric and P. Bulic, 2011, “Logarithmic multiplier in hardware implementation of neural networks,” in International Conference on Adaptive and Natural Computing Algorithms. Springer, April 2011, pp. 158–168. https://dl.acm.org/doi/10.5555/1997052.1997071
  • X Li, B Liu, RH Yang, V Courville, C Xing, VP Nia, 2023, DenseShift: Towards Accurate and Efficient Low-Bit Power-of-Two Quantization, Proceedings of the IEEE/CVF, https://openaccess.thecvf.com/content/ICCV2023/papers/Li_DenseShift_Towards_Accurate_and_Efficient_Low-Bit_Power-of-Two_Quantization_ICCV_2023_paper.pdf (Shows how multiplication by a power-of-two, which could be optimized to a bitshift in integers, can also be calculated quickly for floating point operands using integer addition on the sign and exponent bits of a floating point number.)
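
The parenthetical notes above on Mogami's method and on DenseShift rely on the same bit-level observation: the exponent-plus-mantissa field of an IEEE-754 float behaves like a scaled, biased log2 of its magnitude, so adding the raw bit patterns approximates multiplication. Below is a minimal C++ sketch of this style of approximate float multiplication; the adjusted bias constant is illustrative (Mogami's paper derives its own offset), and real code must also guard zeros, denormals, infinities, and exponent under/overflow.

    #include <cstdint>
    #include <cstring>

    // Approximate FP32 multiply by integer addition of the bit patterns,
    // in the spirit of Mogami (2020) and Mitchell's log multiplication.
    // Adding the exponent/mantissa fields adds the approximate logs; one
    // exponent bias must then be removed. Using 0x3F780000 instead of the
    // exact bias 0x3F800000 shifts the result upward to offset Mitchell's
    // systematic underestimate (treat the exact constant as illustrative).
    float approx_fmul(float a, float b) {
        uint32_t ia, ib;
        std::memcpy(&ia, &a, sizeof ia);
        std::memcpy(&ib, &b, sizeof ib);
        uint32_t sign = (ia ^ ib) & 0x80000000u;               // exact sign
        uint32_t mag = (ia & 0x7FFFFFFFu) + (ib & 0x7FFFFFFFu) // add "logs"
                     - 0x3F780000u;                            // remove bias
        uint32_t ir = sign | (mag & 0x7FFFFFFFu);
        float r;
        std::memcpy(&r, &ir, sizeof r);
        return r;  // within a few percent of a*b for normal inputs
    }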

Approximate Caching

Caching or "memoization" is the optimization of storing the result of a computation so it can be re-used later. Typically the later computation must be exactly the same, but some newer techniques cache approximate values, so that a later near-match receives an approximation of the true result. An example is the use of Locality-Sensitive Hashing (LSH) to detect "near-exact" vectors, so that an entire vector dot product calculation can be cached and reused. See more about hashing algorithms and caching optimizations in neural networks.
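
A minimal C++ sketch of this idea appears below, using random-hyperplane hashing (SimHash) as the LSH scheme; the class layout, signature width, and collision behavior are illustrative assumptions rather than any specific published design.

    #include <cstdint>
    #include <random>
    #include <unordered_map>
    #include <vector>

    // Approximate caching of dot products: vectors whose LSH signatures
    // collide are treated as "near enough" to reuse a prior result.
    struct ApproxDotCache {
        std::vector<std::vector<float>> planes;  // random hyperplanes
        std::unordered_map<uint64_t, float> cache;

        ApproxDotCache(size_t dim, int bits = 16) {
            std::mt19937 rng(42);
            std::normal_distribution<float> gauss(0.f, 1.f);
            planes.assign(bits, std::vector<float>(dim));
            for (auto& p : planes)
                for (auto& w : p) w = gauss(rng);
        }

        // One signature bit per hyperplane: which side of it is v on?
        uint64_t signature(const std::vector<float>& v) const {
            uint64_t sig = 0;
            for (size_t i = 0; i < planes.size(); ++i) {
                float s = 0.f;
                for (size_t j = 0; j < v.size(); ++j) s += planes[i][j] * v[j];
                if (s > 0.f) sig |= (uint64_t)1 << i;
            }
            return sig;
        }

        // Reuse a cached dot product for any near-duplicate of v; otherwise
        // compute v . w exactly and cache it under v's signature.
        float dot(const std::vector<float>& v, const std::vector<float>& w) {
            uint64_t sig = signature(v);
            auto it = cache.find(sig);
            if (it != cache.end()) return it->second;  // approximate reuse
            float s = 0.f;
            for (size_t j = 0; j < v.size(); ++j) s += v[j] * w[j];
            cache.emplace(sig, s);
            return s;
        }
    };

A collision between genuinely different vectors returns a wrong (but usually close) result; real systems tune the number of hash bits to trade hit rate against error.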

Papers with approximate caching optimizations:

Advanced Number Systems and Model Inference

There are a variety of alternative mathematical constructs such as the Residue Number System (RNS) and the Posit Number System (PNS); see advanced number systems. For an addition-based method of approximate multiplication, see the Logarithmic Number System (LNS); a sketch of the LNS idea follows the list below. Papers on the use of advanced number systems with neural networks include:

  • G Alsuhli, V Sakellariou, H Saleh, M Al-Qutayri, Number Systems for Deep Neural Network Architectures: A Survey, 2023, https://arxiv.org/abs/2307.05035 (A very comprehensive survey.)
  • Zhewei Yao, Zhen Dong, Zhangcheng Zheng, Amir Gholami, Jiali Yu, Eric Tan, Leyuan Wang, Qijing Huang, Yida Wang, Michael Mahoney, Kurt Keutzer, HAWQ-V3: Dyadic Neural Network Quantization, Proceedings of the 38th International Conference on Machine Learning, PMLR 139:11875-11886, 2021, https://arxiv.org/abs/2011.10680 (Dyadic numbers.)
  • S. Salamat, M. Imani, S. Gupta, and T. Rosing, RNSnet: In-memory neural network acceleration using residue number system, 2018, In Proceedings of the 2018 IEEE International Conference on Rebooting Computing (ICRC’18), 1–12, https://ieeexplore.ieee.org/document/8638592 (Residue Number System)
  • Z. Carmichael, H. F. Langroudi, C. Khazanov, J. Lillie, J. L. Gustafson, and D. Kudithipudi, Deep positron: A deep neural network using the posit number system. 2019, In Proceedings of the 2019 Design, Automation, and Test in Europe Conference and Exhibition (DATE’19). 1421–1426, https://arxiv.org/abs/1812.01762 (Posit Number System)
  • Zachariah Carmichael, Hamed F. Langroudi, Char Khazanov, Jeffrey Lillie, John L. Gustafson, and Dhireesha Kudithipudi, Performance-efficiency trade-off of low-precision numerical formats in deep neural networks, 2019, In Proceedings of the 2019 Conference for Next Generation Arithmetic (CoNGA’19), ACM, New York, NY, Article 3, 9 pages, https://doi.org/10.1145/3316279.3316282
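
To make the LNS idea concrete, here is a minimal C++ sketch in which multiplication becomes pure addition of logarithms. The struct layout is an illustrative assumption; a real LNS implementation stores fixed-point logs and handles addition (the expensive LNS operation) with lookup tables.

    #include <cmath>

    // Logarithmic Number System sketch: a value is stored as its sign plus
    // the log2 of its magnitude, so multiplication needs no multiplier.
    struct LNS {
        bool neg;      // sign of the represented value
        double log2x;  // log2 of its magnitude

        static LNS from(double x) { return { x < 0, std::log2(std::fabs(x)) }; }
        double to_double() const { return (neg ? -1.0 : 1.0) * std::exp2(log2x); }
    };

    // Multiplication in LNS: add the logarithms and XOR the signs.
    LNS mul(LNS a, LNS b) { return { a.neg != b.neg, a.log2x + b.log2x }; }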

Approximate Transformer Components

Research has turned to approximating the larger building-block components inside the Transformer architecture. See also high-level Transformer optimization techniques, such as quantization, attention head pruning, and layer pruning. Papers on high-level approximations of Transformer components are grouped below into areas such as:

Attention Head Approximation

See the research on approximate attention head architectures and attention optimization in general.

Activation Function Approximation

See approximations of activation functions.

Softmax Approximation

See research on softmax optimization and approximation.

Approximating Normalization

The normalization layer can be coded as an approximate normalization layer; alternatively, there is also pruned normalization (removing the normalization entirely).
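
As one illustrative example of the kind of approximation possible, the C++ sketch below replaces RMSNorm's squares and square root with a cheaper mean of absolute values; the two differ only by a distribution-dependent scale, which the usual learned gain can absorb. This is a sketch of the general idea, not a particular published kernel.

    #include <cmath>
    #include <vector>

    // Approximate RMSNorm: normalize by the mean absolute value instead of
    // the root-mean-square, avoiding the squares and the square root.
    void approx_rmsnorm(std::vector<float>& v, float gain = 1.0f) {
        float mav = 0.0f;
        for (float x : v) mav += std::fabs(x);   // no multiplications
        mav /= (float)v.size();
        float scale = gain / (mav + 1e-6f);      // epsilon avoids div-by-zero
        for (float& x : v) x *= scale;
    }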

Approximating Other Transformer Components

Other general papers on approximations for Transformer architectures (and neural networks in general):

  • Joonsang Yu, Junki Park, Seongmin Park, Minsoo Kim, Sihwa Lee, Dong Hyun Lee, Jungwook Choi, NN-LUT: Neural Approximation of Non-Linear Operations for Efficient Transformer Inference, Dec 2021, https://arxiv.org/pdf/2112.02191 (Approximation using look-up tables.)
  • Chen, M. X., Firat, O., Bapna, A., Johnson, M., Macherey, W., Foster, G., Jones, L., Schuster, M., Shazeer, N., Parmar, N., Vaswani, A., Uszkoreit, J., Kaiser, L., Chen, Z., Wu, Y., and Hughes, M. The best of both worlds: Combining recent advances in neural machine translation. In ACL, 2018, https://arxiv.org/abs/1804.09849 (Hybrid Transformer architectures.)
  • Ma, J. and Yarats, D. On the adequacy of untuned warmup for adaptive optimization. arXiv:1910.04209, 2019. https://arxiv.org/abs/1910.04209
  • J Zhong, Z Liu, X Chen, Apr 2023, Transformer-based models and hardware acceleration analysis in autonomous driving: A survey, https://arxiv.org/abs/2304.10891 (Sections on approximating various components of Transformers.)

Approximate Neural Networks

More research papers on approximation used with neural networks in general:

  • Z. Peng et al. 2018. AXNet: ApproXimate computing using an end-to-end trainable neural network. 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) https://ieeexplore.ieee.org/document/8605388 (Ensemble dual-model method where one model is a fast approximation of the other.)
  • Matevž Fabjančič, Octavian Machidon, Hashim Sharif, Yifan Zhao, Saša Misailović, Veljko Pejović, March 2023, Mobiprox: Supporting Dynamic Approximate Computing on Mobiles, https://arxiv.org/abs/2303.11291 (Uses probabilistic approximations, such as loop perforation, for fast neural networks on mobile.)
  • Jorge Castro-Godínez, Deykel Hernández-Araya, Muhammad Shafique, Jörg Henkel, 2020, Approximate acceleration for CNN-based applications on IoT edge devices, 2020 IEEE 11th Latin American Symposium on Circuits & Systems (LASCAS), https://ieeexplore.ieee.org/document/9069040
  • Sparsh Mittal. 2016. A survey of techniques for approximate computing. ACM Computing Surveys (CSUR) 48, 4 (2016), 1–33. https://dl.acm.org/doi/10.1145/2893356 (Examines some early approximate neural networks such as AxNN.)
  • W Dong, G Kestor, D Li, 2023, Auto-HPCnet: An Automatic Framework to Build Neural Network-based Surrogate for High-Performance Computing Applications, HPDC '23: Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, August 2023, Pages 31–44, https://doi.org/10.1145/3588195.3592985, https://dl.acm.org/doi/abs/10.1145/3588195.3592985
  • Seungyeop Han, Haichen Shen, Matthai Philipose, Sharad Agarwal, Alec Wolman, and Arvind Krishnamurthy. MCDNN: An Approximation-Based Execution Framework for Deep Stream Processing Under Resource Constraints, In Proceedings of the 14th Annual International Conference on Mobile Systems, Applications, and Services, ACM MobiSys, Singapore, 26–30 June 2016, pp. 123–136, https://dl.acm.org/doi/10.1145/2906388.2906396
  • F Manca, F Ratto, 2023, ONNX-to-Hardware Design Flow for the Generation of Adaptive Neural-Network Accelerators on FPGAs arXiv preprint arXiv:2309.13321, https://arxiv.org/pdf/2309.13321.pdf (Approximation techniques applied to edge computing.)
  • HJ Damsgaard, A Ometov, J Nurmi, 2023, ACM Computing Surveys, Approximation Opportunities in Edge Computing Hardware: A Systematic Literature Review https://dl.acm.org/doi/abs/10.1145/3572772, PDF: https://dl.acm.org/doi/pdf/10.1145/3572772
  • M. A. Hanif, F. Khalid, and M. Shafique, CANN: Curable approximations for high-performance deep neural network accelerators, in Proc. 56th Annu. Design Automat. Conf. (DAC). New York, NY, USA: Association for Computing Machinery, 2019, pp. 1–6. https://ieeexplore.ieee.org/document/8806937

Approximation to Avoid Redundant Computations

A simple approximate calculation can sometimes be performed as a preliminary step, so as to avoid the expensive exact calculation in many cases (see also conditional computation and caching optimizations). This method is sometimes called "common case first" or "simple case first". General papers about using this at the low level are below, with a sketch of the pattern after the list; the logical extension to the high level is "big-little models" (see ensemble architectures).

  • Duvindu Piyasena, Rukshan Wickramasinghe, Debdeep Paul, Siew Kei Lam, and Meiqing Wu. 2019. Reducing dynamic power in streaming CNN hardware accelerators by exploiting computational redundancies. Proceedings 29th International Conference on Field-Programmable Logic and Applications, FPL 2019 (9 2019), 354–359, https://ieeexplore.ieee.org/document/8891989, PDF: https://siewkeilam.github.io/ei-research-group/Paper/2019H-Duvindu-FPL.pdf (Calculates an approximate result to cheaply detect cases where the exact computation would be negative, and hence zeroed by the RELU activation anyway.)
  • Yuxiang Huan, Yifan Qin, Yantian You, Lirong Zheng, and Zhuo Zou. Sep 2016. A multiplication reduction technique with near-zero approximation for embedded learning in IoT devices. 2016 29th IEEE International System-on-Chip Conference (SOCC), 102–107. https://ieeexplore.ieee.org/abstract/document/7905445 (Skips multiplications on near-zero values whose results would effectively be zero, thereby avoiding wasteful multiplications.)
  • Maedeh Hemmat, Joshua San Miguel, and Azadeh Davoodi. 2020. AirNN: A Featherweight Framework for Dynamic Input-Dependent Approximation of CNNs. Transactions on Computer-Aided Design of Integrated Circuits and Systems. https://ieeexplore.ieee.org/document/9239327 (Approximates weight computations by pre-computing them into groups offline, and only using some of the weights in calculations during inference, effectively dynamically pruning the other weights to zero.)
  • Minkyu Kim and Jae Sun Seo. 2021. An energy-efficient deep convolutional neural network accelerator featuring conditional computing and low external memory access. IEEE Journal of Solid-State Circuits 56, 3 (2021), 803–813, https://ieeexplore.ieee.org/document/9229157 (Approximate convolutions with most-significant bits are done first.)
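
Below is a minimal C++ sketch of the common-case-first pattern, in the near-zero-skipping style of the papers above: a cheap test filters out work whose exact result would be (near) zero anyway. The threshold value and function shape are illustrative assumptions.

    #include <cmath>
    #include <vector>

    // Dot product that skips multiplications by near-zero weights, treating
    // them as exact zeros (the "common case first" approximation).
    float dot_skip_near_zero(const std::vector<float>& w,
                             const std::vector<float>& x,
                             float threshold = 1e-3f) {
        float sum = 0.0f;
        for (size_t i = 0; i < w.size(); ++i) {
            if (std::fabs(w[i]) < threshold) continue;  // common case: skip
            sum += w[i] * x[i];                         // rare case: multiply
        }
        return sum;
    }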

Approximate Computing General Research

The general idea of using approximations in computing, trading accuracy for efficiency, has a considerable body of research. Here are a few of the theoretical papers:

  • D. Palomino, M. Shafique, A. Susin, J. Henkel, “Thermal Optimization using Adaptive Approximate Computing for Video Coding”, IEEE/ACM 19th Design, Automation and Test in Europe Conference (DATE), 2016, https://ieeexplore.ieee.org/document/7459495
  • V. Mrazek, M. A. Hanif et al., “autoAx: An automatic design space exploration and circuit building methodology utilizing libraries of approximate components,” in DAC’19. ACM, 2019, https://arxiv.org/abs/1902.10807
  • R. Nair, “Big data needs approximate computing: technical perspective”, ACM Communications, 58(1): 104, 2015. https://dl.acm.org/doi/10.1145/2688072
  • A. K. Mishra, R. Barik, S. Paul, “iACT: A Software-Hardware Framework for Understanding the Scope of Approximate Computing”, Workshop on Approximate Computing Across the System Stack (WACAS), 2014. PDF: https://sampa.cs.washington.edu/wacas14/papers/mishra.pdf
  • H. Esmaeilzadeh, A. Sampson, L. Ceze, and D. Burger, “Architecture support for disciplined approximate programming”, International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2012. PDF: https://www.cs.cornell.edu/~asampson/media/papers/truffle-asplos2012.pdf
  • V. Chippa, S. Chakradhar, K. Roy, and A. Raghunathan, “Analysis and characterization of inherent application resilience for approximate computing”, Design Automation Conference (DAC), 2013. https://ieeexplore.ieee.org/document/6560706
  • J. Choi and S. Venkataramani, Approximate Computing Techniques for Deep Neural Networks. Cham: Springer, 2019, pp. 307–329, Chapter 15, https://link.springer.com/chapter/10.1007/978-3-319-99322-5_15
  • S. Venkataramani, S. T. Chakradhar, K. Roy, and A. Raghunathan, Approximate computing and the quest for computing efficiency, Proceedings of the 52nd Annual Design Automation Conference, ACM (2015), p. 120, https://ieeexplore.ieee.org/document/7167251
  • G. Pekhimenko, D. Koutra, K. Qian, “Approximate computing: Application analysis and hardware design”, May 2013, PDF: www.cs.cmu.edu/~gpekhime/Projects/15740/paper.pdf
  • Weiqiang Liu, Fabrizio Lombardi (Book Editors), Approximate Computing, 2022, https://link.springer.com/book/10.1007/978-3-030-98347-5, https://www.amazon.com/Approximate-Computing-Weiqiang-Liu-ebook/dp/B0BBKR65SB/
  • Sparsh Mittal. 2016. A survey of techniques for approximate computing. ACM Computing Surveys (CSUR) 48, 4 (2016), 1–33. https://dl.acm.org/doi/10.1145/2893356
  • VV Kulkarni, 2020, Approximate computing techniques for accelerating compute intensive workloads, https://www.ideals.illinois.edu/items/115960, PDF: https://www.ideals.illinois.edu/items/115960/bitstreams/379143/object?dl=1
  • Amir Yazdanbakhsh; Divya Mahajan; Hadi Esmaeilzadeh; Pejman Lotfi-Kamran, 2017, AxBench: A multiplatform benchmark suite for approximate computing, IEEE Design & Test, Volume 34, Issue 2, April 2017, https://ieeexplore.ieee.org/abstract/document/7755728/, PDF: https://ieeexplore.ieee.org/ielaam/6221038/7862860/7755728-aam.pdf, PDF: http://axbench.org/papers/dt.darksilicon16-camera.pdf
  • Michael Ringenburg, Adrian Sampson, Isaac Ackerman, Luis Ceze, and Dan Grossman. 2015. Monitoring and debugging the quality of results in approximate programs. In International Conference on Architectural Support for Programming Languages and Operating Systems. 399–411. https://dl.acm.org/doi/10.1145/2775054.2694365, PDF: https://homes.cs.washington.edu/~luisceze/publications/approxdebug-asplos15.pdf
  • Thomas Y. Yeh, Petros Faloutsos, Milos Ercegovac, Sanjay J. Patel, and Glenn Reinman. 2007. The art of deception: Adaptive precision reduction for area efficient physics acceleration. 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007). pp. 394–406. https://ieeexplore.ieee.org/document/4408271
  • Mehrzad Samadi, Davoud Anoushe Jamshidi, Janghaeng Lee, and Scott Mahlke. 2014. Paraprox: Pattern-based approximation for data parallel applications. In ACM SIGARCH Computer Architecture News, Vol. 42. 35–50, https://dl.acm.org/doi/10.1145/2654822.2541948
  • Giorgos Armeniakos, Georgios Zervakis, Dimitrios Soudris, Jörg Henkel, 2022, Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey, ACM Computing Surveys, Volume 55, Issue 4, No. 83, pp 1–36 https://doi.org/10.1145/3527156, https://dl.acm.org/doi/10.1145/3527156, https://arxiv.org/abs/2203.08737
  • Tae Jun Ham, Sung Jun Jung, Seonghak Kim, Young H Oh, Yeonhong Park, Yoonho Song, Jung-Hun Park, Sanghee Lee, Kyoung Park, Jae W Lee, et al. A^3: Accelerating attention mechanisms in neural networks with approximation. In 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA), pages 328–341. IEEE, 2020. https://arxiv.org/abs/2002.10941
  • Dimitrios Danopoulos, Georgios Zervakis, Dimitrios Soudris, Jörg Henkel, 12 Feb 2024, TransAxx: Efficient Transformers with Approximate Computing, https://arxiv.org/abs/2402.07545 (Using approximations in Vision Transformer architectures.)
  • Salar Shakibhamedan, Amin Aminifar, Nima TaheriNejad, Axel Jantsch, 2024, EASE: Energy Optimization through Adaptation — A Review of Runtime Energy-Aware Approximate Deep Learning Algorithms, https://eclectx.org/Publications/2024_M13.pdf (Survey paper on techniques for adaptive inference with a focus on approximations of inference, including loop perforation, stochastic algorithms, approximate arithmetic, quantization, pruning and low-rank.)
  • John Fraser Hart, Jul 1, 1978, Computer Approximations, https://www.amazon.com/Computer-Approximations-John-Fraser-Hart/dp/0882756427/
  • Teofilo F. Gonzalez, Sep 30, 2020, Handbook of Approximation Algorithms and Metaheuristics, Second Edition: Two-Volume Set (Chapman & Hall/CRC Computer and Information Science Series), https://www.amazon.com/Handbook-Approximation-Algorithms-Metaheuristics-Second/dp/0367570289/
  • Ivan Markovsky, Aug 3, 2018, Low-Rank Approximation: Algorithms, Implementation, Applications (Communications and Control Engineering) Part of: Communications and Control Engineering (62 books), https://www.amazon.com/Low-Rank-Approximation-Implementation-Applications-Communications/dp/3319896199/
  • Vijay V. Vazirani, Jul 2, 2001, Approximation Algorithms, https://www.amazon.com/Approximation-Algorithms-Vijay-V-Vazirani/dp/3540653678/
  • David P. Williamson and David B. Shmoys, Apr 26, 2011, The Design of Approximation Algorithms, https://www.amazon.com/Design-Approximation-Algorithms-David-Williamson-ebook/dp/B009019XCG/
  • A. M. Dalloo, A. J. Humaidi, A. K. A. Mhdawi and H. Al-Raweshidy, "Approximate Computing: Concepts, Architectures, Challenges, Applications, and Future Directions," in IEEE Access, doi: 10.1109/ACCESS.2024.3467375. https://ieeexplore.ieee.org/document/10693435
  • Jinhao Li, Jiaming Xu, Shan Huang, Yonghua Chen, Wen Li, Jun Liu, Yaoxiu Lian, Jiayi Pan, Li Ding, Hao Zhou, Guohao Dai, 6 Oct 2024, Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective, https://arxiv.org/abs/2410.04466
  • Yu-Ching Hu, September 2024, Efficient Accelerator-Rich Computers for Future Applications, Ph.D. Thesis, Computer Science, https://escholarship.org/content/qt68w3z4vq/qt68w3z4vq.pdf
  • Ayad M. Dalloo, Amjad J. Humaidi, Optimizing Machine Learning Models with Data-level Approximate Computing: The Role of Diverse Sampling, Precision Scaling, Quantization and Feature Selection Strategies, Results in Engineering, 2024, 103451, ISSN 2590-1230, https://doi.org/10.1016/j.rineng.2024.103451 https://www.sciencedirect.com/science/article/pii/S2590123024017031 https://github.com/AyadMDalloo/DatalvlAxC
  • Gansen Hu, Zhaoguo Wang, Jinglin Wei, Wei Huang, Haibo Chen, 17 Jan 2025, Accelerating Large Language Models through Partially Linear Feed-Forward Network, https://arxiv.org/abs/2501.10054 (Inspired by constant folding, the optimization merges the two MatMuls in an FFN by approximating the intervening non-linear activation function (e.g., RELU or GELU) with linear functions, and merging the two matrices using matrix-multiplication associativity.)
  • Wonjae Lee, Taeyoung Kim, Hyungbin Park, 23 Jul 2025, Fourier Neural Operators for Non-Markovian Processes: Approximation Theorems and Experiments, https://arxiv.org/abs/2507.17887
  • Anand Ganesh, Babhrubahan Bose, Anand Rajagopalan, 24 Jul 2025, On the Approximation of Stationary Processes using the ARMA Model, https://arxiv.org/abs/2408.10610
  • Yunfei Yang, 18 Jul 2025, On the optimal approximation of Sobolev and Besov functions using deep ReLU neural networks, https://arxiv.org/abs/2409.00901
  • Daniel Greenhut and Dan Feldman, 19 Jul 2025, $k$-PCA for (non-squared) Euclidean Distances: Polynomial Time Approximation, https://arxiv.org/abs/2507.14631
  • Jianghang Gu, Ling Wen, Yuntian Chen, Shiyi Chen, 20 Jul 2025, An explainable operator approximation framework under the guideline of Green's function, https://arxiv.org/abs/2412.16644
  • Sachin Garg, Michał Dereziński, 19 Jul 2025, Faster Low-Rank Approximation and Kernel Ridge Regression via the Block-Nyström Method, https://arxiv.org/abs/2506.17556
  • Guanqun Ma, David Lenz, Hanqi Guo, Tom Peterka, Bei Wang, 11 Aug 2025, Extracting Complex Topology from Multivariate Functional Approximation: Contours, Jacobi Sets, and Ridge-Valley Graphs, https://arxiv.org/abs/2508.07637
  • Matthew Fahrbach, Mehrdad Ghadiri, 8 Aug 2025, A Tight Lower Bound for the Approximation Guarantee of Higher-Order Singular Value Decomposition, https://arxiv.org/abs/2508.06693
  • Bogdan Butyrin, Artemy Rubtsov, Alexey Naumov, Vladimir Ulyanov, Sergey Samsonov, 11 Aug 2025, Gaussian Approximation for Two-Timescale Linear Stochastic Approximation, https://arxiv.org/abs/2508.07928
  • Shim Soon Yong, 11 Aug 2025, ZClassifier: Temperature Tuning and Manifold Approximation via KL Divergence on Logit Space, https://arxiv.org/abs/2507.10638
  • Sheng-Feng Yu, Jia-Jiun Yao, and Wei-Chen Chiu, 29 Jul 2025, Boost Self-Supervised Dataset Distillation via Parameterization, Predefined Augmentation, and Approximation, https://arxiv.org/abs/2507.21455
  • Jiawei Liu, Chenwang Wu, Defu Lian, Enhong Chen, 31 Jul 2025, Efficient Machine Unlearning via Influence Approximation, https://arxiv.org/abs/2507.23257
  • Anthony Nouy and Bertrand Michel, 31 Jul 2025, Weighted least-squares approximation with determinantal point processes and generalized volume sampling, https://arxiv.org/abs/2312.14057
  • Gary Froyland and Kevin Kühl, 30 Jul 2025, Learning dynamically inspired invariant subspaces for Koopman and transfer operator approximation, https://arxiv.org/abs/2505.05085
  • Yongchao Huang, 31 Jul 2025, RL as Regressor: A Reinforcement Learning Approach for Function Approximation, https://arxiv.org/abs/2508.00174
  • Sergei Gleyzer, Hanh Nguyen, Dinesh P. Ramakrishnan, Eric A. F. Reinhardt, 1 Aug 2025, Sinusoidal Approximation Theorem for Kolmogorov-Arnold Networks, https://arxiv.org/abs/2508.00247
  • Soumyajit Guin, Vivek S. Borkar, Shalabh Bhatnagar, 3 Aug 2025, An Actor-Critic Algorithm with Function Approximation for Risk Sensitive Cost Markov Decision Processes, https://arxiv.org/abs/2502.11604
  • Anastasis Kratsios, Bum Jun Kim, Takashi Furuya, 6 Aug 2025, Approximation Rates in Besov Norms and Sample-Complexity of Kolmogorov-Arnold Networks with Residual Connections, https://arxiv.org/abs/2504.15110
  • Prashant Gupta, Aashi Jindal, Jayadeva, and Debarka Sengupta, 7 Aug 2025, Guided Random Forest and its application to data approximation, https://arxiv.org/abs/1909.00659
  • Hannes Waclawek and Stefan Huber, 7 Aug 2025, Energy Optimized Piecewise Polynomial Approximation Utilizing Modern Machine Learning Optimizers, https://arxiv.org/abs/2503.09329
  • Ben Adcock, 8 Aug 2025, Optimal sampling for least-squares approximation, https://arxiv.org/abs/2409.02342
  • Johannes Aspman, Vyacheslav Kungurtsev, Reza Roohi Seraji, 12 Aug 2025, Tame Riemannian Stochastic Approximation, https://arxiv.org/abs/2302.00709
  • Liwei Jiang, Abhishek Roy, Krishna Balasubramanian, Damek Davis, Dmitriy Drusvyatskiy, Sen Na, 12 Aug 2025, Online Covariance Estimation in Nonsmooth Stochastic Approximation, https://arxiv.org/abs/2502.05305
  • Gen Li, Yuchen Zhou, Yuting Wei, Yuxin Chen, 13 Aug 2025, Faster Diffusion Models via Higher-Order Approximation, https://arxiv.org/abs/2506.24042
  • Mohammad Mozaffari, Amir Yazdanbakhsh, Maryam Mehri Dehnavi, 14 Aug 2025, SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression, https://arxiv.org/abs/2410.09615
  • Quentin Ploussard, Xiang Li, Matija Pavičević, 13 Aug 2025, Tightening the mixed integer linear formulation for the piecewise linear approximation in general dimensions, https://arxiv.org/abs/2508.09395
  • Daniel Hsu, 18 Aug 2025, Dimension lower bounds for linear approaches to function approximation, https://arxiv.org/abs/2508.13346
  • Marina Sheshukova, Sergey Samsonov, Denis Belomestny, Eric Moulines, Qi-Man Shao, Zhuo-Song Zhang, Alexey Naumov, 19 Aug 2025, Gaussian Approximation and Multiplier Bootstrap for Stochastic Gradient Descent, https://arxiv.org/abs/2502.06719
  • Haoru Tan, Sitong Wu, Xiuzhe Wu, Wang Wang, Bo Zhao, Zeke Xie, Gui-Song Xia, and Xiaojuan Qi, 20 Aug 2025, Understanding Data Influence with Differential Approximation, https://arxiv.org/abs/2508.14648
  • Bahareh Tasdighi, Nicklas Werge, Yi-Shan Wu, Melih Kandemir, 20 Aug 2025, Improving Actor-Critic Training with Steerable Action-Value Approximation Errors, https://arxiv.org/abs/2406.03890
  • Mohammad Amin Esabat, Saeed Jaamei, Fatemeh Asadi, 24 Aug 2025, DeepCFD: Efficient near-ground airfoil lift coefficient approximation with deep convolutional neural networks, https://arxiv.org/abs/2508.17278
  • Keisuke Kamahori, Jungo Kasai, Noriyuki Kojima, Baris Kasikci, 23 Aug 2025, LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation, https://arxiv.org/abs/2502.20583
  • Tobias Weber, Bálint Mucsányi, Lenard Rommel, Thomas Christie, Lars Kasüschke, Marvin Pförtner, Philipp Hennig, 22 Jul 2025, laplax -- Laplace Approximations with JAX, https://arxiv.org/abs/2507.17013
  • Ariel Neufeld, Tuan Anh Nguyen, 22 Jul 2025, Multilevel Picard approximations and deep neural networks with ReLU, leaky ReLU, and softplus activation overcome the curse of dimensionality when approximating semilinear parabolic partial differential equations in $L^p$-sense, https://arxiv.org/abs/2409.20431
  • Nedeljko Radulovic, Albert Bifet, Fabian Suchanek, 12 Aug 2025, BELLA: Black box model Explanations by Local Linear Approximations, https://arxiv.org/abs/2305.11311
  • Michael Mayer and Mario V. Wüthrich, 18 Aug 2025, Shapley Values: Paired-Sampling Approximations, https://arxiv.org/abs/2508.12947
  • My Le, Luana Ruiz, and Souvik Dhara, 5 Sep 2025, Landmark-Based Node Representations for Shortest Path Distance Approximations in Random Graphs, https://arxiv.org/abs/2504.08216
  • Erion Morina, Martin Holler, 27 Aug 2025, $\mathcal{C}^1$-approximation with rational functions and rational neural networks, https://arxiv.org/abs/2508.19672
  • Vojtech Mrazek, Konstantinos Balaskas, Paula Carolina Lozano Duarte, Zdenek Vasicek, Mehdi B. Tahoori, Georgios Zervakis, 27 Aug 2025, Arbitrary Precision Printed Ternary Neural Networks with Holistic Evolutionary Approximation, https://arxiv.org/abs/2508.19660
  • Pengcheng Xie and Zihao Zhou and Zijian Zhou, 27 Aug 2025, Objective Value Change and Shape-Based Accelerated Optimization for the Neural Network Approximation, https://arxiv.org/abs/2508.20290
  • Gen Li, Yuchen Jiao, Yu Huang, Yuting Wei, Yuxin Chen, 28 Aug 2025, Transformers Meet In-Context Learning: A Universal Approximation Theory, https://arxiv.org/abs/2506.05200
  • Tatyana Matveeva, Aleksandr Katrutsa, Evgeny Frolov, 28 Aug 2025, Dynamic Low-rank Approximation of Full-Matrix Preconditioner for Training Generalized Linear Models, https://arxiv.org/abs/2508.21106
  • Prashansa Panda and Shalabh Bhatnagar, 29 Aug 2025, Two-Timescale Critic-Actor for Average Reward MDPs with Function Approximation, https://arxiv.org/abs/2402.01371
  • Pedro Savarese, 29 Aug 2025, Principled Approximation Methods for Efficient and Scalable Deep Learning, https://arxiv.org/abs/2509.00174
  • Anastasis Kratsios, Tin Sum Cheng, Daniel Roy, 31 Aug 2025, Beyond Universal Approximation Theorems: Algorithmic Uniform Approximation by Neural Networks Trained with Noisy Data, https://arxiv.org/abs/2509.00924
  • Dong Liu, Yanxuan Yu, Jiayi Zhang, Yifan Li, Ben Lengerich, Ying Nian Wu, 3 Sep 2025, FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation, https://arxiv.org/abs/2505.20353
  • Li Lin, Xiaojun Wan, 8 Sep 2025, LoaQ: Layer-wise Output Approximation Quantization, https://arxiv.org/abs/2509.06297
  • Victor Toscano-Duran, Rocio Gonzalez-Diaz and Miguel A. Gutiérrez-Naranjo, 8 Sep 2025, Barycentric Neural Networks and Length-Weighted Persistent Entropy Loss: A Green Geometric and Topological Framework for Function Approximation, https://arxiv.org/abs/2509.06694
  • Yian Huang, Zhen Huang, 7 Sep 2025, Randomized Quasi-Monte Carlo Features for Kernel Approximation, https://arxiv.org/abs/2503.06041
  • Kyriakos Stylianopoulos, George C. Alexandropoulos, 8 Sep 2025, Universal Approximation with XL MIMO Systems: OTA Classification via Trainable Analog Combining, https://arxiv.org/abs/2504.12758
  • Charles Frye, Nathan Wang, Timothy Feng, September 26, 2025, We reverse-engineered Flash Attention 4, https://modal.com/blog/reverse-engineer-flash-attention-4 (Flash Attention 4 has Blackwell CUDA C++ improvements, approximate Softmax via exponential approximation, and faster scaling factor updates.)
  • Nicol N. Schraudolph, 1999, A Fast, Compact Approximation of the Exponential Function, https://nic.schraudolph.org/pubs/Schraudolph99.pdf (The classic bit-level exponential approximation; a sketch appears at the end of this list.)
  • V. Leon, M. A. Hanif, G. Armeniakos, X. Jiao, 2025, Approximate computing survey, Part I: Terminology and software & hardware approximation techniques, ACM Computing Surveys, https://dl.acm.org/doi/pdf/10.1145/3716845
  • V. Leon, M. A. Hanif, G. Armeniakos, X. Jiao, 2025, Approximate computing survey, Part II: Application-specific & architectural approximation techniques and applications, ACM Computing Surveys, https://dl.acm.org/doi/pdf/10.1145/3711683
  • V. Akhlaghi, A. Yazdanbakhsh, K. Samadi, R. K. Gupta and H. Esmaeilzadeh, "SnaPEA: Predictive Early Activation for Reducing Computation in Deep Convolutional Neural Networks," 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA), Los Angeles, CA, USA, 2018, pp. 662-673, doi: 10.1109/ISCA.2018.00061. https://ieeexplore.ieee.org/document/8416863 https://cseweb.ucsd.edu/~vakhlagh/ISCA18-SnaPEA.pdf
  • Rupert Mitchell and Kristian Kersting, 12 Sep 2025, Multipole Semantic Attention: A Fast Approximation of Softmax Attention for Pretraining, https://arxiv.org/abs/2509.10406
  • Chien-Ming Chi, 12 Sep 2025, Constructive Universal Approximation and Sure Convergence for Multi-Layer Neural Networks, https://arxiv.org/abs/2507.04779
  • Hanfei Zhou and Lei Shi, 11 Sep 2025, Expressive Power of Deep Networks on Manifolds: Simultaneous Approximation, https://arxiv.org/abs/2509.09362
  • Tim Gyger, Reinhard Furrer, Fabio Sigrist, 11 Sep 2025, Iterative Methods for Full-Scale Gaussian Process Approximations for Large Spatial Data, https://arxiv.org/abs/2405.14492
  • Rodion Nazarov and Allen Gehret and Robert Shorten and Jakub Marecek, 18 Sep 2025, Stochastic Sample Approximations of (Local) Moduli of Continuity, https://arxiv.org/abs/2509.15368
  • Jie Yin, Ke Sun, Han Wu, 16 Sep 2025, Unbiased Online Curvature Approximation for Regularized Graph Continual Learning, https://arxiv.org/abs/2509.12727
  • Weiming Chen, Zhihan Zhu, Yijia Wang, Zhihai He, 16 Sep 2025, Runge-Kutta Approximation and Decoupled Attention for Rectified Flow Inversion and Semantic Editing, https://arxiv.org/abs/2509.12888
  • Dongseok Kim, Wonjun Jeong, Gisung Oh, 13 Sep 2025, FACTORS: Factorial Approximation for Complementary Two-factor Optimization with Risk-aware Scoring, https://arxiv.org/abs/2509.10825
  • Zixi Chen, Yumin Xu, Ruixun Zhang, 14 Sep 2025, Convergence Rate in Nonlinear Two-Time-Scale Stochastic Approximation with State (Time)-Dependence, https://arxiv.org/abs/2509.11039
  • Jia-Qi Yang, Lei Shi, 14 Sep 2025, Kernel-based Stochastic Approximation Framework for Nonlinear Operator Learning, https://arxiv.org/abs/2509.11070
  • Eric Eaton, Marcel Hussing, Michael Kearns, Aaron Roth, Sikata Bela Sengupta, Jessica Sorrell, 10 Sep 2025, Replicable Reinforcement Learning with Linear Function Approximation, https://arxiv.org/abs/2509.08660
  • Marat Khusainov, Marina Sheshukova, Alain Durmus, Sergey Samsonov, 17 Sep 2025, On the Rate of Gaussian Approximation for Linear Regression Problems, https://arxiv.org/abs/2509.14039
  • Erin Carson, Xinye Chen, Cheng Kang, 17 Sep 2025, LLM-ABBA: Understanding time series via symbolic approximation, https://arxiv.org/abs/2411.18506
  • Saptarshi Mandal, Yashaswini Murthy and R. Srikant, 2 Oct 2025, Finite-Time Bounds for Distributionally Robust TD Learning with Linear Function Approximation, https://arxiv.org/abs/2510.01721
  • Kaustubh Ponkshe, Raghav Singhal, Eduard Gorbunov, Alexey Tumanov, Samuel Horvath, Praneeth Vepakomma, 2 Oct 2025, Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning, https://arxiv.org/abs/2411.19557
  • Johannes Voss, 2 Oct 2025, Machine learning for accuracy in density functional approximations, https://arxiv.org/abs/2311.00196
  • Wangxuan Fan, Siqi Li, Doudou Zhou, Yohei Okada, Chuan Hong, Molei Liu and Nan Liu, 2 Oct 2025, SIM-Shapley: A Stable and Computationally Efficient Approach to Shapley Value Approximation, https://arxiv.org/abs/2505.08198
  • Narine Kokhlikyan, Kamalika Chaudhuri, Saeed Mahloujifar, 13 Oct 2025, Z0-Inf: Zeroth Order Approximation for Data Influence, https://arxiv.org/abs/2510.11832
  • Ziqi Zhao and Vivek Sarin, 14 Oct 2025, nuGPR: GPU-Accelerated Gaussian Process Regression with Iterative Algorithms and Low-Rank Approximations, https://arxiv.org/abs/2510.12128
  • Bogdan Butyrin, Eric Moulines, Alexey Naumov, Sergey Samsonov, Qi-Man Shao, Zhuo-Song Zhang, 14 Oct 2025, Improved Central Limit Theorem and Bootstrap Approximations for Linear Stochastic Approximation, https://arxiv.org/abs/2510.12375
  • Toshinori Kitamura, Arnob Ghosh, Tadashi Kozuno, Wataru Kumagai, Kazumi Kasaura, Kenta Hoshino, Yohei Hosoe, Yutaka Matsuo, 14 Oct 2025, Provably Efficient RL under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation, https://arxiv.org/abs/2502.10138
  • Junghyun Lee, Eunsang Lee, Young-Sik Kim, Yongwoo Lee, Joon-Woo Lee, Yongjune Kim, Jong-Seon No, 14 Oct 2025, Optimized Layerwise Approximation for Efficient Private Inference on Fully Homomorphic Encryption, https://arxiv.org/abs/2310.10349
  • Osman Bicer, Ali D. Kara and Serdar Yuksel, 14 Oct 2025, Quantizer Design for Finite Model Approximations, Model Learning, and Quantized Q-Learning for MDPs with Unbounded Spaces, https://arxiv.org/abs/2510.04355
  • Yang Zhang, Huiwen Yan and Mushuang Liu, 30 Sep 2025, Directed-MAML: Meta Reinforcement Learning Algorithm with Task-directed Approximation, https://arxiv.org/abs/2510.00212
  • Chuntao Chen, Tapio Helin, Nuutti Hyvönen, Yuya Suzuki, 1 Oct 2025, Approximation of differential entropy in Bayesian optimal experimental design, https://arxiv.org/abs/2510.00734
  • Prabhat Karmakar, Sayan Gupta, Ilaksh Adlakha, 24 Sep 2025, Extended Low-Rank Approximation Accelerates Learning of Elastic Response in Heterogeneous Materials, https://arxiv.org/abs/2509.20276
  • Babak Barazandeh, Subhabrata Majumdar, Om Rajyaguru, George Michailidis, 23 Sep 2025, Localized LoRA: A Structured Low-Rank Approximation for Efficient Fine-Tuning, https://arxiv.org/abs/2506.00236
  • Songyuan Li, Teng Wang, Jinrong Tang, Ruiqi Liu, Yuyao Lu, Feng Xu, Bin Gao, Xiangwei Zhu, 24 Oct 2025, Bridging Function Approximation and Device Physics via Negative Differential Resistance Networks, https://arxiv.org/abs/2510.23638
  • Wenyi Wang, Piotr Piękos, Li Nanbo, Firas Laakom, Yimeng Chen, Mateusz Ostaszewski, Mingchen Zhuge, and Jürgen Schmidhuber, 28 Oct 2025, Huxley-Gödel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine, https://arxiv.org/abs/2510.21614
  • Sisipho Hamlomo, Marcellin Atemkeng, 28 Oct 2025, Clustering-Based Low-Rank Matrix Approximation for Medical Image Compression, https://arxiv.org/abs/2505.08256
  • Yilin Xie, Shiqiang Zhang, Joel A. Paulson, Calvin Tsay, 28 Oct 2025, Global Optimization of Gaussian Process Acquisition Functions Using a Piecewise-Linear Kernel Approximation, https://arxiv.org/abs/2410.16893
  • Soham Bonnerjee, Sayar Karmakar, Wei Biao Wu, 22 Oct 2025, Sharp Gaussian approximations for Decentralized Federated Learning, https://arxiv.org/abs/2505.08125
  • Yutong Wang, Haiyu Wang, Sai Qian Zhang, 18 Oct 2025, QSVD: Efficient Low-rank Approximation for Unified Query-Key-Value Weight Compression in Low-Precision Vision-Language Models, https://arxiv.org/abs/2510.16292
  • Cassidy Ashworth, Pietro Liò, Francesco Caso, 18 Oct 2025, Symmetry and Generalisation in Neural Approximations of Renormalisation Transformations, https://arxiv.org/abs/2510.16591
  • Philippe Magalhães (LabHC), Virginie Fresse (LabHC), Benoît Suffran, Olivier Alata (LabHC), 3 Oct 2025, Implémentation Efficiente de Fonctions de Convolution sur FPGA à l'Aide de Blocs Paramétrables et d'Approximations Polynomiales (Efficient implementation of convolution functions on FPGAs using configurable blocks and polynomial approximations), https://arxiv.org/abs/2510.15930
  • Wilson E. Marcílio-Jr and Danilo M. Eler and Fernando V. Paulovich and Rafael M. Martins, 20 Oct 2025, HUMAP: Hierarchical Uniform Manifold Approximation and Projection, https://arxiv.org/abs/2106.07718
  • Fengdi Che, Chenjun Xiao, Jincheng Mei, Bo Dai, Ramki Gummadi, Oscar A Ramirez, Christopher K Harris, A. Rupam Mahmood, Dale Schuurmans, 19 Oct 2025, Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation, https://arxiv.org/abs/2405.21043
  • Youngjae Min, Navid Azizan, 19 Oct 2025, HardNet: Hard-Constrained Neural Networks with Universal Approximation Guarantees, https://arxiv.org/abs/2410.10807
  • Christian Beck, Sebastian Becker, Patrick Cheridito, Arnulf Jentzen, Ariel Neufeld, 20 Oct 2025, Deep learning based numerical approximation algorithms for stochastic partial differential equations, https://arxiv.org/abs/2012.01194
  • Ryan Cory-Wright, Jean Pauphilet, 17 Oct 2025, Improved Approximation Algorithms for Low-Rank Problems Using Semidefinite Optimization, https://arxiv.org/abs/2501.02942
  • Yinsong Chen, Samson S. Yu, Zhong Li, Chee Peng Lim, 22 Sep 2025, Addressing the Inconsistency in Bayesian Deep Learning via Generalized Laplace Approximation, https://arxiv.org/abs/2405.13535
  • Luwei Sun, Dongrui Shen and Han Feng, 21 Sep 2025, Theoretical Insights into CycleGAN: Analyzing Approximation and Estimation Errors in Unpaired Data Generation, https://arxiv.org/abs/2407.11678
  • Mohammad Tariqul Islam, Du Liu, Deblina Sarkar, 27 Oct 2025, Manifold Approximation leads to Robust Kernel Alignment, https://arxiv.org/abs/2510.22953
  • Guoji Fu, Wee Sun Lee, 25 Oct 2025, Approximation and Generalization Abilities of Score-based Neural Network Generative Models for Sub-Gaussian Distributions, https://arxiv.org/abs/2505.10880
  • Siddhartha Ganguly, Shubham Gupta, Debasish Chatterjee, 15 Oct 2025, Data-driven learning of feedback maps for explicit robust predictive control: an approximation theoretic view, https://arxiv.org/abs/2510.13522
  • Penghao Yu, Haotian Jiang, Zeyu Bao, Ruoxi Yu, Qianxiao Li, 8 Oct 2025, The Effect of Attention Head Count on Transformer Approximation, https://arxiv.org/abs/2510.06662
  • Joris Dommel and Sven A. Wegner, 8 Oct 2025, An in-depth look at approximation via deep and narrow neural networks, https://arxiv.org/abs/2510.07202
  • Heyang Zhao and Jiafan He and Quanquan Gu, 3 Oct 2025, A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation, https://arxiv.org/abs/2311.15238
  • Anton Björklund, Mykola Zaitsev, Marta Kwiatkowska, 3 Oct 2025, Efficient Preimage Approximation for Neural Network Certification, https://arxiv.org/abs/2505.22798
  • Weichen Wu, Gen Li, Yuting Wei, Alessandro Rinaldo, 3 Oct 2025, Statistical Inference for Temporal Difference Learning with Linear Function Approximation, https://arxiv.org/abs/2410.16106
  • Jian Lu, Xiaohuang Huang, 21 Oct 2025, Approximation Rates of Shallow Neural Networks: Barron Spaces, Activation Functions and Optimality Analysis, https://arxiv.org/abs/2510.18388
  • Frank Cole, Yuxuan Zhao, Yulong Lu, Tianhao Zhang, 21 Oct 2025, In-Context Learning of Linear Dynamical Systems with Transformers: Approximation Bounds and Depth-Separation, https://arxiv.org/abs/2502.08136
  • Jingpu Cheng, Ting Lin, Zuowei Shen, Qianxiao Li, 21 Oct 2025, A unified framework for establishing the universal approximation of transformer-type architectures, https://arxiv.org/abs/2506.23551
  • Wei-Cheng Lee and Francesco Orabona, 25 Sep 2025, A Finite-Time Analysis of TD Learning with Linear Function Approximation without Projections or Strong Convexity, https://arxiv.org/abs/2506.01052
  • Geonwoo Cho, Jaegyun Im, Jihwan Lee, Hojun Yi, Sejin Kim, Sundong Kim, 25 Sep 2025, TRACED: Transition-aware Regret Approximation with Co-learnability for Environment Design, https://arxiv.org/abs/2506.19997
  • Vanja Stojanović, Bor Pangeršič, 25 Sep 2025, Empirical Analysis of Heuristic and Approximation Algorithms for the Mutual-Visibility Problem, https://arxiv.org/abs/2507.01076
  • Steve Hong, Runa Eschenhagen, Bruno Mlodozeniec, Richard Turner, 27 Sep 2025, Better Hessians Matter: Studying the Impact of Curvature Approximations in Influence Functions, https://arxiv.org/abs/2509.23437
  • Maedeh Zarvandi, Michael Timothy, Theresa Wasserer, Debarghya Ghoshdastidar, 29 Sep 2025, Interpretable Kernel Representation Learning at Scale: A Unified Framework Utilizing Nyström Approximation, https://arxiv.org/abs/2509.24467
  • Siddharth Chandak, Shaan Ul Haque, Nicholas Bambos, 28 Sep 2025, Finite-Time Bounds for Two-Time-Scale Stochastic Approximation with Arbitrary Norm Contractions and Markovian Noise, https://arxiv.org/abs/2503.18391
  • Xin Yu, Yujia Wang, Jinghui Chen, Lingzhou Xue, 27 Sep 2025, AltLoRA: Towards Better Gradient Approximation in Low-Rank Adaptation with Alternating Projections, https://arxiv.org/abs/2505.12455
  • Anthony Zhou and Amir Barati Farimani, 29 Sep 2025, Hamiltonian Neural PDE Solvers through Functional Approximation, https://arxiv.org/abs/2505.13275
  • Frederik Baymler Mathiesen, Nikolaus Vertovec, Francesco Fabiano, Luca Laurenti, Alessandro Abate, 29 Sep 2025, Certified Neural Approximations of Nonlinear Dynamics, https://arxiv.org/abs/2505.15497
  • Siddharth Chandak, 29 Sep 2025, Non-Expansive Mappings in Two-Time-Scale Stochastic Approximation: Finite-Time Analysis, https://arxiv.org/abs/2501.10806
  • Wei Wang, Xiao-Yong Wei, Qing Li, 17 Oct 2025, ParaFormer: Shallow Parallel Transformers with Progressive Approximation, https://arxiv.org/abs/2510.15425
  • Yi-Shan Chu, Yueh-Cheng Kuo, 16 Oct 2025, From Universal Approximation Theorem to Tropical Geometry of Multi-Layer Perceptrons, https://arxiv.org/abs/2510.15012
  • Zixun Wang and Ben Dai, 17 Oct 2025, RankSEG-RMA: An Efficient Segmentation Algorithm via Reciprocal Moment Approximation, https://arxiv.org/abs/2510.15362
  • Zi Liang and Zhiyao Wu and Haoyang Shang and Yulin Jin and Qingqing Ye and Huadi Zheng and Peizhao Hu and Haibo Hu, 27 Sep 2025, Decision Potential Surface: A Theoretical and Practical Approximation of LLM's Decision Boundary, https://arxiv.org/abs/2510.03271
  • Jinyang Jiang, Bernd Heidergott, Jiaqiao Hu, Yijie Peng, 6 Oct 2025, Stochastic Approximation Methods for Distortion Risk Measure Optimization, https://arxiv.org/abs/2510.04563
  • Weixin Wang, Haoyang Zheng, Guang Lin, Wei Deng, Pan Xu, 6 Oct 2025, Rethinking Langevin Thompson Sampling from A Stochastic Approximation Perspective, https://arxiv.org/abs/2510.05023
  • Tong Mao, Jinchao Xu, 5 Oct 2025, Sharp Lower Bounds for Linearized ReLU^k Approximation on the Sphere, https://arxiv.org/abs/2510.04060
  • Lin Fan, Peter W. Glynn, 5 Oct 2025, Diffusion Approximations for Thompson Sampling in the Small Gap Regime, https://arxiv.org/abs/2105.09232
  • Yuling Jiao, Yang Wang and Bokai Yan, 5 Oct 2025, Approximation Bounds for Recurrent Neural Networks with Application to Regression, https://arxiv.org/abs/2409.05577
  • Yu Fu, Michael Stanley Smith and Anastasios Panagiotelis, 6 Oct 2025, Vector Copula Variational Inference and Dependent Block Posterior Approximations, https://arxiv.org/abs/2503.01072
  • Xinwen Hu, Yunqing Huang, Nianyu Yi, and Peimeng Yin, 9 Oct 2025, Weights initialization of neural networks for function approximation, https://arxiv.org/abs/2510.08780
  • Orin Levy and Liad Erez and Alon Cohen and Yishay Mansour, 10 Oct 2025, Regret Bounds for Adversarial Contextual Bandits with General Function Approximation and Delayed Feedback, https://arxiv.org/abs/2510.09127
  • Mihriban Ceylan, David J. Prömel, 10 Oct 2025, Distributionally robust approximation property of neural networks, https://arxiv.org/abs/2510.09177
  • Nathan Corecco, Batuhan Yardim, Vinzenz Thoma, Zebang Shen, Niao He, 24 Oct 2025, Scalable Neural Incentive Design with Parameterized Mean-Field Approximation, https://arxiv.org/abs/2510.21442
  • Davide Murari, Takashi Furuya, Carola-Bibiane Schönlieb, 10 Oct 2025, Approximation theory for 1-Lipschitz ResNets, https://arxiv.org/abs/2505.12003
  • Shiyun Lin, Simon Mauras, Nadav Merlis, Vianney Perchet, 22 Oct 2025, Stable Matching with Ties: Approximation Ratios and Learning, https://arxiv.org/abs/2411.03270
  • Maryam Aliakbarpour, Zhan Shi, Ria Stevens, Vincent X. Wang, 22 Oct 2025, Nearly-Linear Time Private Hypothesis Selection with the Optimal Approximation Factor, https://arxiv.org/abs/2506.01162
  • Yongchao Huang, 25 Sep 2025, Sampling via Gaussian Mixture Approximations, https://arxiv.org/abs/2509.25232
  • Arturo De Marinis, Davide Murari, Elena Celledoni, Nicola Guglielmi, Brynjulf Owren, Francesco Tudisco, 30 Sep 2025, Approximation properties of neural ODEs, https://arxiv.org/abs/2503.15696
  • Guillaume Godin, 7 Oct 2025, Fast Leave-One-Out Approximation from Fragment-Target Prevalence Vectors (molFTP): From Dummy Masking to Key-LOO for Leakage-Free Feature Construction, https://arxiv.org/abs/2510.06029
  • Haotian Feng, 15 Oct 2025, Neural Network approximation power on homogeneous and heterogeneous reaction-diffusion equations, https://arxiv.org/abs/2510.14094
  • Jingwen Gu, Yiting He, Zhishuai Liu, Pan Xu, 16 Oct 2025, Policy Regularized Distributionally Robust Markov Decision Processes with Linear Function Approximation, https://arxiv.org/abs/2510.14246
  • Tong Mao, Jonathan W. Siegel, Jinchao Xu, 16 Oct 2025, Approximation Rates for Shallow ReLU^k Neural Networks on Sobolev Spaces via the Radon Transform, https://arxiv.org/abs/2408.10996
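
To make the central idea concrete, below is a minimal C++ sketch of Mitchell-style logarithmic approximate multiplication, a classic technique in this literature: each operand is split into a power of two and a fractional mantissa, the logarithms are added, and the antilog is taken linearly, so the multiply is replaced by leading-zero counts, shifts, and one addition, at the cost of up to roughly 11% relative error. The function name mitchell_mul and the 16-bit input width are illustrative assumptions, not taken from any paper listed above.

    #include <cstdint>
    #include <cstdio>

    // Mitchell-style logarithmic approximate multiplier (illustrative sketch).
    // Write a = 2^k1 * (1 + f1) and b = 2^k2 * (1 + f2), with 0 <= f < 1. Then:
    //   a * b ~= 2^(k1+k2)   * (1 + f1 + f2)   if f1 + f2 < 1
    //   a * b ~= 2^(k1+k2+1) * (f1 + f2)       if f1 + f2 >= 1
    // The result always under-approximates the true product.
    static uint64_t mitchell_mul(uint16_t a, uint16_t b) {
        if (a == 0 || b == 0) return 0;
        int k1 = 31 - __builtin_clz(a);          // floor(log2(a))
        int k2 = 31 - __builtin_clz(b);          // floor(log2(b))
        // Fractional parts in Q32 fixed point: f = (x - 2^k) / 2^k.
        uint64_t f1 = ((uint64_t)(a - (1u << k1)) << 32) >> k1;
        uint64_t f2 = ((uint64_t)(b - (1u << k2)) << 32) >> k2;
        uint64_t fsum = f1 + f2;                 // may exceed 1.0 (i.e., 2^32)
        int k = k1 + k2;
        if (fsum < (1ull << 32))
            return ((uint64_t)1 << k) + ((fsum << k) >> 32);  // 2^k * (1 + fsum)
        return (fsum << (k + 1)) >> 32;                       // 2^(k+1) * fsum
    }

    int main() {
        const uint16_t tests[][2] = { {3, 3}, {5, 3}, {100, 200}, {1000, 999} };
        for (const auto& t : tests) {
            uint64_t exact  = (uint64_t)t[0] * t[1];
            uint64_t approx = mitchell_mul(t[0], t[1]);
            printf("%u * %u = %llu, approx %llu (%.1f%% low)\n",
                   (unsigned)t[0], (unsigned)t[1],
                   (unsigned long long)exact, (unsigned long long)approx,
                   100.0 * (double)(exact - approx) / (double)exact);
        }
        return 0;
    }

On a CPU the hardware multiply instruction will normally beat this bit manipulation; the payoff comes in FPGA/ASIC multiplier designs and energy-constrained accelerators, where a shift-and-add datapath costs far less area and power than a full multiplier array.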

AI Books from Aussie AI

The Sweetest Lesson: Your Brain Versus AI: new book on AI intelligence theory:
  • Your brain is 50 times bigger than the best AI engines.
  • Truly intelligent AI will require more compute!
  • Another case of the bitter lesson?
  • Maybe it's the opposite of that: the sweetest lesson.

Get your copy from Amazon: The Sweetest Lesson

RAG Optimization: Accurate and Efficient LLM Applications: new book on RAG architectures:
  • Smarter RAG
  • Faster RAG
  • Cheaper RAG
  • Agentic RAG
  • RAG reasoning

Get your copy from Amazon: RAG Optimization

Generative AI Applications book:
  • Deciding on your AI project
  • Planning for success and safety
  • Designs and LLM architectures
  • Expediting development
  • Implementation and deployment

Get your copy from Amazon: Generative AI Applications

Generative AI in C++: Generative AI programming book:
  • Generative AI coding in C++
  • Transformer engine speedups
  • LLM models
  • Phone and desktop AI
  • Code examples
  • Research citations

Get your copy from Amazon: Generative AI in C++

CUDA C++ Optimization book:
  • Faster CUDA C++ kernels
  • Optimization tools & techniques
  • Compute optimization
  • Memory optimization

Get your copy from Amazon: CUDA C++ Optimization

CUDA C++ Debugging book:
  • Debugging CUDA C++ kernels
  • Tools & techniques
  • Self-testing & reliability
  • Common GPU kernel bugs

Get your copy from Amazon: CUDA C++ Debugging

More AI Research

Read more about: