Aussie AI
Heuristics in AI Architectures
-
Last Updated 2 March, 2025
-
by David Spuler, Ph.D.
What are Heuristics?
These days, the term "heuristics" seems to mean "anything except LLMs" in the AI research literature. We used to have software that did things prior to ChatGPT, and amazingly, some of that stuff is still useful. For starters, heuristics are:
- Much faster than LLMs, and
- Sometimes more accurate than LLMs (e.g., no hallucinations)
There are many decades of research on software algorithms that existed long before AI was any good. And now it's come full circle, with some of the research on heuristics starting to be used with LLM architectures.
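One common pattern for combining the two is heuristic-first routing: try a cheap, deterministic rule before paying for an LLM call. The sketch below is a minimal illustration of that idea, assuming a hypothetical `call_llm` client function (shown here only as a placeholder); the specific heuristics are illustrative, not from any particular paper.

```python
import re

def heuristic_answer(query: str):
    """Try to answer with a cheap rule-based heuristic; return None if unsure."""
    # Heuristic 1: simple integer arithmetic, e.g. "what is 12 * 4"
    m = re.search(r"(-?\d+)\s*([+\-*])\s*(-?\d+)", query)
    if m:
        a, op, b = int(m.group(1)), m.group(2), int(m.group(3))
        return str({"+": a + b, "-": a - b, "*": a * b}[op])
    # Heuristic 2: exact-match FAQ lookup (deterministic, so no hallucinations)
    faq = {"what is your name?": "Aussie AI assistant"}
    return faq.get(query.strip().lower())

def call_llm(query: str) -> str:
    # Placeholder for a real LLM API call; substitute your own client here.
    raise NotImplementedError("wire up your LLM client")

def answer(query: str) -> str:
    """Heuristic-first routing: only fall back to the slow LLM when needed."""
    result = heuristic_answer(query)
    if result is not None:
        return result       # fast path: no LLM call at all
    return call_llm(query)  # slow path: defer to the model
```

Queries the heuristics can handle never touch the model, which is where the speed and accuracy advantages above come from; everything else falls through to the LLM unchanged.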
Research on Heuristics
Research papers on the use of "heuristics" in AI include:
- X Zhang, 2024, Disentangling syntactics, semantics, and pragmatics in natural language processing, Doctoral thesis, Nanyang Technological University, Singapore, https://hdl.handle.net/10356/177426 https://dr.ntu.edu.sg/bitstream/10356/177426/2/Final%20Thesis%20for%20DRNTU.pdf
- Jindřich Libovický, Jindřich Helcl, Marek Tlustý, Ondřej Bojar, and Pavel Pecina. 2016. CUNI system for WMT16 automatic post-editing and multimodal translation tasks. In Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers, pages 646–654, Berlin, Germany. https://arxiv.org/abs/1606.07481 (Post-editing of machine translation.)
- M Sponner, B Waschneck, A Kumar, 2024, Adapting Neural Networks at Runtime: Current Trends in At-Runtime Optimizations for Deep Learning, ACM Computing Surveys, PDF: https://dl.acm.org/doi/pdf/10.1145/3657283 (Survey of various adaptive inference optimization techniques with much focus on image and video processing optimization for LLMs.)
- Camilo Chacón Sartori, Christian Blum, Filippo Bistaffa, Guillem Rodríguez Corominas, 28 May 2024, Metaheuristics and Large Language Models Join Forces: Towards an Integrated Optimization Approach, https://arxiv.org/abs/2405.18272
- Mingjie Sun, Xinlei Chen, J. Zico Kolter, Zhuang Liu, 27 Feb 2024, Massive Activations in Large Language Models, https://arxiv.org/abs/2402.17762 (Examines the range of values of activations, focused on very large outlier values, in models such as LLaMA2-7B, LLaMA2-13B, and Mixtral-8x7B.)
- K. Liao, Y. Zhang, X. Ren, Q. Su, X. Sun, and B. He, “A global past-future early exit method for accelerating inference of pre-trained language models,” in Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021, pp. 2013–2023. https://aclanthology.org/2021.naacl-main.162/
- Ke Hong, Guohao Dai, Jiaming Xu, Qiuli Mao, Xiuhong Li, Jun Liu, Kangdi Chen, Yuhan Dong, Yu Wang, 2024, FlashDecoding++: Faster Large Language Model Inference with Asynchronization, Flat GEMM Optimization, and Heuristics, Part of Proceedings of Machine Learning and Systems 6 (MLSys 2024) Conference, PDF: https://proceedings.mlsys.org/paper_files/paper/2024/file/5321b1dabcd2be188d796c21b733e8c7-Paper-Conference.pdf (Next generation of Flash Decoding, with improved asynchronous parallelism of Softmax in both prefill and decoding phases, heuristic dataflow management algorithms, and enhanced GEMM during the decoding phase.)
- Fei Liu, Xialiang Tong, Mingxuan Yuan, Xi Lin, Fu Luo, Zhenkun Wang, Zhichao Lu, Qingfu Zhang, 1 Jun 2024 (v3), Evolution of Heuristics: Towards Efficient Automatic Algorithm Design Using Large Language Model, https://arxiv.org/abs/2401.02051 (Using an LLM to find heuristics, rather than the other way around.)
- David Spuler, June 2024, Aussie AI, Heuristic Optimization of Transformer On-Device Inference: IP Australia, https://ipsearch.ipaustralia.gov.au/patents/2024901670
- Cal Paterson, April 2021, We were promised Strong AI, but instead we got metadata analysis, https://calpaterson.com/metadata.html
- Peng Jiang and Xiaodong Cai, 12 Sep 2024, A Survey of Semantic Parsing Techniques, Symmetry 2024, 16(9), 1201; https://doi.org/10.3390/sym16091201 https://www.mdpi.com/2073-8994/16/9/1201 PDF: https://www.mdpi.com/2073-8994/16/9/1201/pdf?version=1726149198
- Anees Ahmed, Dec 2024, Heuristics in AI: The Secret Ingredient to Solving Complex Problems Quickly, https://aistacked.com/heuristics-in-artificial-intelligence/
- Geeks for Geeks, June 2024, Heuristic Function in AI, https://www.geeksforgeeks.org/heuristic-function-in-ai/
- Jiejun Tan, Zhicheng Dou, Yutao Zhu, Peidong Guo, Kun Fang, Ji-Rong Wen, 30 May 2024 (v3), Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs, https://arxiv.org/abs/2402.12052 https://github.com/plageon/SlimPLM
- Damien de Mijolla, Wen Yang, Philippa Duckett, Christopher Frye, Mark Worrall, 8 Dec 2024, Language hooks: a modular framework for augmenting LLM reasoning that decouples tool usage from the model and its prompt, https://arxiv.org/abs/2412.05967
- Vincent-Pierre Berges, Barlas Oguz, December 12, 2024, Memory Layers at Scale, Meta, https://ai.meta.com/research/publications/memory-layers-at-scale/ https://github.com/facebookresearch/memory (Augmentation of an LLM with an additional key-value associative memory, by replacing some FFNs with a "memory layer".)
- LCM team, Loïc Barrault, Paul-Ambroise Duquenne, Maha Elbayad, Artyom Kozhevnikov, Belen Alastruey, Pierre Andrews, Mariano Coria, Guillaume Couairon, Marta R. Costa-jussà, David Dale, Hady Elsahar, Kevin Heffernan, João Maria Janeiro, Tuan Tran, Christophe Ropers, Eduardo Sánchez, Robin San Roman, Alexandre Mourachko, Safiyyah Saleem, Holger Schwenk, 15 Dec 2024 (v2), Large Concept Models: Language Modeling in a Sentence Representation Space, https://arxiv.org/abs/2412.08821 https://github.com/facebookresearch/large_concept_model (Model operates at the sentence concept level, using SONAR sentence embeddings.)
- Zijie Chen, Zhanchao Zhou, Yu Lu, Renjun Xu, Lili Pan, Zhenzhong Lan, 30 Dec 2024, UBER: Uncertainty-Based Evolution with Large Language Models for Automatic Heuristic Design, https://arxiv.org/abs/2412.20694
- Haoran Wang, Kai Shu, Jan 2025, Make Every Token Count: A Systematic Survey on Decoding Methods for Foundation Models, https://www.researchgate.net/profile/Haoran-Wang-96/publication/387703971_Make_Every_Token_Count_A_Systematic_Survey_on_Decoding_Methods_for_Foundation_Models/links/67784c8ce74ca64e1f49eb15/Make-Every-Token-Count-A-Systematic-Survey-on-Decoding-Methods-for-Foundation-Models.pdf https://github.com/wang2226/Awesome-LLM-Decoding
- A. Mishra, S. Kirmani and K. Madduri, "Fast Sentence Classification using Word Co-occurrence Graphs*," 2024 IEEE International Conference on Big Data (BigData), Washington, DC, USA, 2024, pp. 620-629, doi: 10.1109/BigData62323.2024.10825869. https://ieeexplore.ieee.org/abstract/document/10825869
- Wendi Cui, Jiaxin Zhang, Zhuohang Li, Hao Sun, Damien Lopez, Kamalika Das, Bradley A. Malin, Sricharan Kumar, 26 Feb 2025, Automatic Prompt Optimization via Heuristic Search: A Survey, https://arxiv.org/abs/2502.18746 (Survey of auto prompting, from basic LLM enhancements to some methods quite similar to RALM and TALM.)