Aussie AI
Tool Augmented Language Models (TALM)
-
Last Updated 2 March, 2025
-
by David Spuler, Ph.D.
What are Tool Augmented Language Models (TALM)?
Tool Augmented Language Models (TALM) refers to the use of non-LLM tools to augment the processing of LLMs. Tools can compute additional data based on the prompt, and can be used in conjunction with reasoning techniques such as Chain-of-Thought, RAG retrievals, or agentic architectures.
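As a minimal sketch of the basic idea, the snippet below shows a hypothetical tool dispatcher: the LLM emits a tool-call marker (here an assumed `CALC(...)` syntax) instead of guessing at arithmetic, and a post-processing step detects the marker, runs the external tool, and splices the result back into the output. The marker format, tool registry, and `run_tools` function are all illustrative assumptions, not any specific framework's API.

```python
import re

def calculator(expression: str) -> str:
    # Evaluate simple arithmetic; restrict characters so eval is safe here.
    if not re.fullmatch(r"[0-9+\-*/(). ]+", expression):
        raise ValueError("unsupported expression")
    return str(eval(expression))

# Hypothetical tool registry: marker name -> Python function.
TOOLS = {"CALC": calculator}

TOOL_CALL = re.compile(r"(\w+)\((.*?)\)")

def run_tools(model_output: str) -> str:
    """Replace recognized tool-call markers in LLM output with tool results."""
    def dispatch(match: re.Match) -> str:
        name, arg = match.group(1), match.group(2)
        tool = TOOLS.get(name)
        return tool(arg) if tool else match.group(0)  # leave unknown markers alone
    return TOOL_CALL.sub(dispatch, model_output)

# The (simulated) LLM defers arithmetic to the calculator tool.
print(run_tools("The total cost is CALC(17 * 3) dollars."))
# → The total cost is 51 dollars.
```

Real systems differ mainly in where the tool call is detected (some, like the Conveyor work cited below, intercept it mid-decoding to run tools in parallel with generation) and in how results are fed back into the model's context.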
Research on TALM
- Yechen Xu, Xinhao Kong, Tingjun Chen, Danyang Zhuo, 4 Jun 2024 (v2), Conveyor: Efficient Tool-aware LLM Serving with Tool Partial Execution, https://arxiv.org/abs/2406.00059 Code: https://github.com/conveyor-sys/conveyor (Speeding up inference by partially running tools in parallel with the LLM query processing, rather than sequentially after the LLM request, by detecting tool requests deep inside the decoding algorithm and starting them immediately, before the LLM has finished generating the fully decoded output.)
- Yubo Ma, Zhibin Gou, Junheng Hao, Ruochen Xu, Shuohang Wang, Liangming Pan, Yujiu Yang, Yixin Cao, Aixin Sun, Hany Awadalla, Weizhu Chen, 21 Feb 2024 (v2), SciAgent: Tool-augmented Language Models for Scientific Reasoning, https://arxiv.org/abs/2402.11451
- Aaron Parisi, Yao Zhao, and Noah Fiedel. Talm: Tool augmented language models. arXiv preprint arXiv:2205.12255, 2022. https://arxiv.org/abs/2205.12255
- Simranjit Singh, Andreas Karatzas, Michael Fore, Iraklis Anagnostopoulos, Dimitrios Stamoulis, 7 May 2024, An LLM-Tool Compiler for Fused Parallel Function Calling, https://arxiv.org/abs/2405.17438
- Reyna Abhyankar, Zijian He, Vikranth Srivatsa, Hao Zhang, Yiying Zhang, 2024, INFERCEPT: Efficient Intercept Support for Augmented Large Language Model Inference, https://openreview.net/pdf?id=wDDGQabYPQ
- Yisheng Xiao, Lijun Wu, Junliang Guo, Juntao Li, Min Zhang, Tao Qin, Tie-yan Liu, 6 Jul 2023 (v2), A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond, https://arxiv.org/pdf/2204.09269.pdf
- Reyna Abhyankar, Zijian He, Vikranth Srivatsa, Hao Zhang, Yiying Zhang, July 2024, InferCept: Efficient Intercept Support for Augmented Large Language Model Inference, Proceedings of the 41st International Conference on Machine Learning, PMLR 235:81-95, 2024, https://proceedings.mlr.press/v235/abhyankar24a.html PDF: https://raw.githubusercontent.com/mlresearch/v235/main/assets/abhyankar24a/abhyankar24a.pdf
- Asim Biswal, Liana Patel, Siddarth Jha, Amog Kamsetty, Shu Liu, Joseph E. Gonzalez, Carlos Guestrin, Matei Zaharia, 27 Aug 2024, Text2SQL is Not Enough: Unifying AI and Databases with TAG, https://arxiv.org/abs/2408.14717 https://github.com/TAG-Research/TAG-Bench
- Yaroslav Zharov, Yury Khudyakov, Evgeniia Fedotova, Evgeny Grigorenko, Egor Bogomolov, 18 Feb 2024, Tool-Augmented LLMs as a Universal Interface for IDEs, https://arxiv.org/abs/2402.11635
- Amy Marks, Jun 11, 2024, Clarifying Function Calling / Tool Use in LLMs, https://medium.com/@aevalone/clarifying-function-calling-tool-use-in-llms-6511af510f99
- Bohan Lyu, Yadi Cao, Duncan Watson-Parris, Leon Bergen, Taylor Berg-Kirkpatrick, Rose Yu, 1 Nov 2024, Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation, https://arxiv.org/abs/2411.00412
- Gaya Mehenni, Amal Zouaq, 23 Nov 2024, Ontology-Constrained Generation of Domain-Specific Clinical Summaries, https://arxiv.org/abs/2411.15666
- Damien de Mijolla, Wen Yang, Philippa Duckett, Christopher Frye, Mark Worrall, 8 Dec 2024, Language hooks: a modular framework for augmenting LLM reasoning that decouples tool usage from the model and its prompt, https://arxiv.org/abs/2412.05967
- Vincent-Pierre Berges, Barlas Oguz, December 12, 2024, Memory Layers at Scale, Meta, https://ai.meta.com/research/publications/memory-layers-at-scale/ https://github.com/facebookresearch/memory (Augmentation of an LLM with an additional key-value associative memory, by replacing some FFNs with a "memory layer".)
- Maxwell Zeff, November 20, 2024, Current AI scaling laws are showing diminishing returns, forcing AI labs to change course, https://techcrunch.com/2024/11/20/ai-scaling-laws-are-showing-diminishing-returns-forcing-ai-labs-to-change-course/ ("at least 10 to 20x gains in model performance ...intelligent prompting, UX decisions, and passing context at the right time into the models...")
- Xinyu Pang, Ruixin Hong, Zhanke Zhou, Fangrui Lv, Xinwei Yang, Zhilong Liang, Bo Han, Changshui Zhang, 18 Dec 2024, Physics Reasoner: Knowledge-Augmented Reasoning for Solving Physics Problems with Large Language Models, https://arxiv.org/abs/2412.13791 (Augmented reasoning by retrieving physics formulas, checklists, and other relevant information.)
- Dian Yu, Yuheng Zhang, Jiahao Xu, Tian Liang, Linfeng Song, Zhaopeng Tu, Haitao Mi, Dong Yu, 22 Dec 2024, Teaching LLMs to Refine with Tools, https://arxiv.org/abs/2412.16871
- Muhayy Ud Din, Jan Rosell, Waseem Akram, Isiah Zaplana, Maximo A Roa, Lakmal Seneviratne, Irfan Hussain, 10 Dec 2024, Ontology-driven Prompt Tuning for LLM-based Task and Motion Planning, https://arxiv.org/abs/2412.07493 https://muhayyuddin.github.io/llm-tamp/ (Detecting objects in the prompt text and then using a RALM algorithm to query an ontology database.)
- Oleksandr Palagin, Vladislav Kaverinskiy, Anna Litvin, Kyrylo Malakhov, 11 Jul 2023, OntoChatGPT Information System: Ontology-Driven Structured Prompts for ChatGPT Meta-Learning, International Journal of Computing, 22(2), 170-183, https://arxiv.org/abs/2307.05082 https://doi.org/10.47839/ijc.22.2.3086 https://computingonline.net/computing/article/view/3086
- Xiangjue Dong, Maria Teleki, James Caverlee, 18 Dec 2024, A Survey on LLM Inference-Time Self-Improvement, https://arxiv.org/abs/2412.14352 https://github.com/dongxiangjue/Awesome-LLM-Self-Improvement (Broad survey of reasoning improvement methods from multi-step inference to RALM to decoding algorithms.)
- Florian Dietz, Dietrich Klakow, 1 Jan 2025, IGC: Integrating a Gated Calculator into an LLM to Solve Arithmetic Tasks Reliably and Efficiently, https://arxiv.org/abs/2501.00684
- Alhassan Mumuni, Fuseini Mumuni, 6 Jan 2025, Large language models for artificial general intelligence (AGI): A survey of foundational principles and approaches, https://arxiv.org/abs/2501.03151
- Yixin Ji, Juntao Li, Hai Ye, Kaixin Wu, Jia Xu, Linjian Mo, Min Zhang, 5 Jan 2025, Test-time Computing: from System-1 Thinking to System-2 Thinking, https://arxiv.org/abs/2501.02497
- Julian Perry, Surasakdi Siripong, Thanakorn Phonchai, 15 Jan 2025, Dynamic Knowledge Integration for Enhanced Vision-Language Reasoning, https://arxiv.org/abs/2501.08597 (Augment training data dynamically by retrieving extra information.)
- Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Minlie Huang, Nan Duan, Weizhu Chen, 21 Feb 2024 (v4), ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving, https://arxiv.org/abs/2309.17452
- Xinyuan Lu, Liangming Pan, Yubo Ma, Preslav Nakov, Min-Yen Kan, 18 Sep 2024, TART: An Open-Source Tool-Augmented Framework for Explainable Table-based Reasoning, https://arxiv.org/abs/2409.11724 https://github.com/XinyuanLu00/TART
- Jianfeng Pan, Senyou Deng, Shaomang Huang, 4 Feb 2025, CoAT: Chain-of-Associated-Thoughts Framework for Enhancing Large Language Models Reasoning, https://arxiv.org/abs/2502.02390 (Integrating results from an "associative memory" in CoT reasoning paths at inference time.)
- Ling Yang, Zhaochen Yu, Bin Cui, Mengdi Wang, 10 Feb 2025, ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates, https://arxiv.org/abs/2502.06772 https://github.com/Gen-Verse/ReasonFlux (RALM-like retrieval of reasoning prompt templates at inference time.)
- Sam Lin, Wenyue Hua, Lingyao Li, Zhenting Wang, Yongfeng Zhang, 17 Feb 2025, ADO: Automatic Data Optimization for Inputs in LLM Prompts, https://arxiv.org/pdf/2502.11436 (Reformulating the input context, such as by semantic marking of relevant content or formatting changes.)
- Andrew Neeser, Kaylen Latimer, Aadyant Khatri, Chris Latimer, Naren Ramakrishnan, 16 Feb 2025, QuOTE: Question-Oriented Text Embeddings, https://arxiv.org/abs/2502.10976 (Augmenting RAG chunks with additional information, such as questions the chunk might answer.)
- C Winston, R Just, Feb 2025, A Taxonomy of Failures in Tool-Augmented LLMs, https://homes.cs.washington.edu/~rjust/publ/tallm_testing_ast_2025.pdf
- Wendi Cui, Jiaxin Zhang, Zhuohang Li, Hao Sun, Damien Lopez, Kamalika Das, Bradley A. Malin, Sricharan Kumar, 26 Feb 2025, Automatic Prompt Optimization via Heuristic Search: A Survey, https://arxiv.org/abs/2502.18746 (Survey of auto prompting, from basic LLM enhancements to some methods quite similar to RALM and TALM.)
More AI Research
Read more about: