Aussie AI

LLM Reasoning Research

  • Last Updated 21 March, 2025
  • by David Spuler, Ph.D.

Reasoning is a key part of intelligence, and much work is ongoing to improve higher-level reasoning of AI models. Examples include solving mathematical problems or performing multi-step planning such as booking a holiday.

There are two main categories of methods to improve reasoning ability:

  • Training methods ("white box reasoning")
  • Multi-step inference methods ("black box reasoning")


Training-Based Reasoning

White box reasoning means training the weights inside an LLM so that it performs better on reasoning tasks. Historically, the first idea for creating smarter models was always to train an LLM with better data and better techniques, and this has improved raw results on "reasoning" and "generalization" tasks.

Lately, this has given rise to Large Reasoner Model (LRM) architectures, of two main types: trained reasoning models that still give an answer in one step, and multi-step inference models that use multiple steps and "test time compute" to give better answers to complex questions.

The single-shot inference types of reasoning models rely on prompt engineering to get the LLM to perform its reasoning steps. Many of the basic prompt engineering ideas are applicable here:

  • Basic step prompting ("Let's think step by step")
  • Emotional prompting
  • Roles/personas
  • CoT prompting
  • Zero-shot CoT prompting
  • Echo prompting ("Let's repeat the question")
  • Self-consistency
  • Self-ask (followup questions)
  • Exemplars (In-Context Learning)

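As an illustration, basic step prompting and exemplar-based (few-shot) prompting can be combined in a small helper. This is a minimal sketch: `build_cot_prompt` is a hypothetical name, and a real system would send the resulting string to an LLM API.

```python
def build_cot_prompt(question, exemplars=None):
    """Assemble a Chain-of-Thought prompt string.

    With exemplars, this is few-shot CoT (in-context learning); without
    them, it is zero-shot CoT using the classic trigger phrase.
    """
    parts = []
    for q, worked_answer in (exemplars or []):
        parts.append(f"Q: {q}\nA: {worked_answer}")
    # The "Let's think step by step" suffix is the zero-shot CoT trigger.
    parts.append(f"Q: {question}\nA: Let's think step by step.")
    return "\n\n".join(parts)

# Zero-shot CoT: just the question plus the step-by-step trigger.
print(build_cot_prompt("What is 17 * 24?"))
```

The same helper covers both list items above: passing worked examples turns the zero-shot prompt into a few-shot one.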
The major LRMs use more advanced meta-prompts for reasoning, whether single-step or multi-step, but these prompts are commercially sensitive and not usually available. Interestingly, the meta-prompt for the single-step DeepSeek R1 reasoning model was disclosed in their paper (https://arxiv.org/abs/2501.12948):

    A conversation between User and Assistant. The user asks a question, and the Assistant solves it.
    The assistant first thinks about the reasoning process in the mind and then provides the user
    with the answer. The reasoning process and answer are enclosed within <think> </think> and
    <answer> </answer> tags, respectively, i.e., <think> reasoning process here </think>
    <answer> answer here </answer>. User: PROMPT. Assistant:
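
Given this tag format, the model's raw output can be split into its reasoning trace and its final answer with a few lines of Python. This is a hypothetical sketch assuming the `<think>`/`<answer>` convention from the meta-prompt above:

```python
import re

def split_reasoning(output):
    """Split a reasoning model's raw output into (reasoning, answer),
    assuming the <think>/<answer> tag format of the R1 meta-prompt."""
    think = re.search(r"<think>(.*?)</think>", output, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", output, re.DOTALL)
    return (
        think.group(1).strip() if think else "",
        # Fall back to the whole output if no answer tags were emitted.
        answer.group(1).strip() if answer else output.strip(),
    )

raw = "<think> 2+2 is 4 </think> <answer> 4 </answer>"
print(split_reasoning(raw))  # ('2+2 is 4', '4')
```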

Fine-tuning on a more specialized subset of relevant data is a particular submethod of this area. There has been much improvement here, both in the capabilities of high-end SOTA models and, at the other end of the spectrum, in Small Language Models (SLMs). See more about training methods, but note that there has not yet been much research on fine-tuning for reasoning capabilities.

Inference-Based Reasoning

Black box reasoning wraps multiple steps of inference around an LLM. The idea is to treat the LLM as a "black box" and use additional LLM calls to improve its reasoning abilities. These are called "few-shot," "many-shot," or "multi-step" reasoning methods.

Chain-of-thought is the best known of these methods, having been adopted by OpenAI for the "o1" models released in September 2024. However, multi-step reasoning is a longstanding area of research, with much overlap with prompt engineering techniques. The literature describes numerous methods for making multiple LLM calls of this kind:

  • Chain-of-thought (CoT)
  • Self-reflection
  • Skeleton-of-thought
  • Best-of-N (BoN) method
  • Majority voting
  • Self-consistency decoding
  • Programmatic prompting
  • Tree-of-Thoughts (ToT) prompting
  • Chain-of-Symbols (CoS) prompting
  • Graph-of-Thoughts (GoT)
  • Algorithm-of-Thoughts (AoT)
  • Buffer of Thoughts
  • Least-to-Most prompting
  • Chain-of-Table prompting
  • Thread-of-Thought (ThoT) prompting
  • System 2 Attention (S2A) prompting
  • Chain-of-Verification (CoVe) prompting
  • ReAct prompting (reason-and-act)
  • Rephrase-and-Respond (RaR) prompting
  • Chain-of-Knowledge (CoK) prompting
  • Contrastive Chain-of-Thought (CCoT) prompting
  • Program of Thoughts (PoT) prompting
  • Structured Chain-of-Thought (SCoT) prompting
  • Chain-of-Code (CoC) prompting
  • Take a Step Back prompting

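Several of these methods (Best-of-N, majority voting, self-consistency) share a simple skeleton: sample several reasoning chains and aggregate their final answers. A minimal sketch, assuming the per-chain sampling has already been done elsewhere (e.g. by repeated LLM calls at nonzero temperature):

```python
from collections import Counter

def self_consistency(sample_answers):
    """Pick the most common final answer across N sampled reasoning
    chains (majority voting / self-consistency decoding)."""
    counts = Counter(a.strip() for a in sample_answers)
    answer, _count = counts.most_common(1)[0]
    return answer

# Five sampled chains, three of which converge on "42".
chains = ["42", "41", "42", "43", "42"]
print(self_consistency(chains))  # 42
```

The LLM-specific work is in generating the chains and extracting each final answer; the aggregation step itself is this simple.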
Also related to these areas are various other ways to have the LLM give a "better" answer, even if it is not strictly improved reasoning. The simplest ideas include prompt engineering techniques to give the LLM a better query, RAG architectures and Retrieval-Augmented Language Models (RALM) to supply more relevant source data, and dynamic tool usage integrations that generalize the LLM's capabilities to handle answers requiring computation. Also relevant is research on improving answers by fixing specific LLM limitations such as hallucinations, difficulties with mathematical problem solving, and weaknesses at language wordplay.

Long Answers versus Multiple Inference Steps

One of the nuances in the distinction between zero-shot reasoner models and multiple steps of inference is the simplest of ideas: output longer answers. Large Reasoner Models with a single-step architecture, such as DeepSeek R1, mimic the steps of reasoning by repeatedly extending the answers with re-phrased reasoning steps about the problem. This is analogous to multi-step inference reasoning, but the model is "talking to itself" about how to reason through the problem, all in one step of inference.

In effect, the sequence of multiple outputs in chained multi-step reasoning is merged into a single output stream of text. The model decides whether another step is required as part of the normal decoding phase. The output from these single-step reasoner models is a readable sequence showing how the model thought through a problem. Hence, reaching a final answer can require a very long output token sequence, which can be costly, and it is important not to restrict the "max tokens" setting in these cases.

Inference costs are obviously higher for producing an extended answer with many of the intermediate thoughts written into it. However, the token counts in multi-step inference are also high. Whether a single-inference model's long answer uses more or fewer tokens than a multi-step implementation of Chain-of-Thought is not really clear (need some papers!), but the reasoning ability is high for either approach.
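
The trade-off can be sketched with back-of-the-envelope token accounting. The cost model here is an illustrative assumption (each multi-step call re-sends the growing context, with no prompt caching), not a measurement:

```python
def single_step_cost(reasoning_tokens, answer_tokens):
    """One inference call: the chain-of-thought and the final answer
    share a single output stream."""
    return reasoning_tokens + answer_tokens

def multi_step_cost(steps, tokens_per_step, context_overhead):
    """N separate calls: each step re-sends the growing context, so
    input tokens accumulate on top of the generated tokens."""
    total = 0
    context = context_overhead
    for _ in range(steps):
        total += context + tokens_per_step  # input + output for this call
        context += tokens_per_step          # prior step joins the context
    return total

# Illustrative numbers: ~300 reasoning tokens either way, 50-token answer.
print(single_step_cost(300, 50))    # one long answer: 350 tokens
print(multi_step_cost(3, 100, 50))  # three chained calls: 750 tokens
```

Under these toy assumptions the multi-step version processes more total tokens because each step re-reads the accumulated context; real deployments with KV caching or prompt caching would narrow that gap.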

Survey Papers on LLM Reasoning

Survey and review papers on reasoning:

  • Xiangjue Dong, Maria Teleki, James Caverlee, 18 Dec 2024, A Survey on LLM Inference-Time Self-Improvement, https://arxiv.org/abs/2412.14352 https://github.com/dongxiangjue/Awesome-LLM-Self-Improvement (Broad survey of reasoning improvement methods from multi-step inference to RALM to decoding algorithms.)
  • Aske Plaat, Annie Wong, Suzan Verberne, Joost Broekens, Niki van Stein, Thomas Back, 16 Jul 2024, Reasoning with Large Language Models, a Survey, https://arxiv.org/abs/2407.11511
  • Alhassan Mumuni, Fuseini Mumuni, 6 Jan 2025, Large language models for artificial general intelligence (AGI): A survey of foundational principles and approaches, https://arxiv.org/abs/2501.03151
  • Yixin Ji, Juntao Li, Hai Ye, Kaixin Wu, Jia Xu, Linjian Mo, Min Zhang, 5 Jan 2025, Test-time Computing: from System-1 Thinking to System-2 Thinking, https://arxiv.org/abs/2501.02497
  • Violet Xiang, Charlie Snell, Kanishk Gandhi, Alon Albalak, Anikait Singh, Chase Blagden, Duy Phung, Rafael Rafailov, Nathan Lile, Dakota Mahan, Louis Castricato, Jan-Philipp Franken, Nick Haber, Chelsea Finn, 8 Jan 2025, Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought, https://arxiv.org/abs/2501.04682
  • Fengli Xu, Qianyue Hao, Zefang Zong, Jingwei Wang, Yunke Zhang, Jingyi Wang, Xiaochong Lan, Jiahui Gong, Tianjian Ouyang, Fanjin Meng, Chenyang Shao, Yuwei Yan, Qinglong Yang, Yiwen Song, Sijian Ren, Xinyuan Hu, Yu Li, Jie Feng, Chen Gao, Yong Li, 17 Jan 2025 (v2), Towards Large Reasoning Models: A Survey on Scaling LLM Reasoning Capabilities, https://arxiv.org/abs/2501.09686
  • Jie Huang and Kevin Chen-Chuan Chang. July 2023. Towards Reasoning in Large Language Models: A Survey. In Findings of the Association for Computational Linguistics: ACL 2023, pages 1049–1065, Toronto, Canada. Association for Computational Linguistics. https://aclanthology.org/2023.findings-acl.67/
  • Seungpil Lee, Woochang Sim, Donghyeon Shin, Wongyu Seo, Jiwon Park, Seokki Lee, Sanha Hwang, Sejin Kim, and Sundong Kim. Jan 2025. Reasoning Abilities of Large Language Models: In-Depth Analysis on the Abstraction and Reasoning Corpus. ACM Trans. Intell. Syst. Technol. https://doi.org/10.1145/3712701 https://dl.acm.org/doi/10.1145/3712701 https://dl.acm.org/doi/pdf/10.1145/3712701
  • Maciej Besta, Julia Barth, Eric Schreiber, Ales Kubicek, Afonso Catarino, Robert Gerstenberger, Piotr Nyczyk, Patrick Iff, Yueling Li, Sam Houliston, Tomasz Sternal, Marcin Copik, Grzegorz Kwaśniewski, Jürgen Müller, Łukasz Flis, Hannes Eberhard, Hubert Niewiadomski, Torsten Hoefler, 23 Jan 2025 (v3), Reasoning Language Models: A Blueprint, https://arxiv.org/abs/2501.11223 (Survey and blueprint for how to build a Large Reasoning Model.)
  • Mohit Sewak, Ph.D., January 29, 2025, Achieving General Intelligence (AGI) and Super Intelligence (ASI): Pathways, Uncertainties, and Ethical Concerns, https://towardsai.net/p/l/achieving-general-intelligence-agi-and-super-intelligence-asi-pathways-uncertainties-and-ethical-concerns
  • Avinash Patil, 5 Feb 2025, Advancing Reasoning in Large Language Models: Promising Methods and Approaches, https://arxiv.org/abs/2502.03671
  • Hieu Minh "Jord" Nguyen, 10 Feb 2025, A Survey of Theory of Mind in Large Language Models: Evaluations, Representations, and Safety Risks, https://arxiv.org/abs/2502.06470
  • Hanmeng Liu, Zhizhang Fu, Mengru Ding, Ruoxi Ning, Chaoli Zhang, Xiaozhang Liu, Yue Zhang, 13 Feb 2025, Logical Reasoning in Large Language Models: A Survey, https://arxiv.org/abs/2502.09100
  • Fengxiang Cheng, Haoxuan Li, Fenrong Liu, Robert van Rooij, Kun Zhang, Zhouchen Lin, 24 Feb 2025 (v2), Empowering LLMs with Logical Reasoning: A Comprehensive Survey, https://arxiv.org/abs/2502.15652
  • Cameron R. Wolfe, Feb 18, 2025, Demystifying Reasoning Models: Understanding reasoning models and their relation to standard LLMs... https://cameronrwolfe.substack.com/p/demystifying-reasoning-models
  • Zhong-Zhi Li, Duzhen Zhang, Ming-Liang Zhang, Jiaxin Zhang, Zengyan Liu, Yuxuan Yao, Haotian Xu, Junhao Zheng, Pei-Jie Wang, Xiuyi Chen, Yingying Zhang, Fei Yin, Jiahua Dong, Zhijiang Guo, Le Song, Cheng-Lin Liu, 25 Feb 2025 (v2), From System 1 to System 2: A Survey of Reasoning Large Language Models, https://arxiv.org/abs/2502.17419
  • Komal Kumar, Tajamul Ashraf, Omkar Thawakar, Rao Muhammad Anwer, Hisham Cholakkal, Mubarak Shah, Ming-Hsuan Yang, Phillip H.S. Torr, Salman Khan, Fahad Shahbaz Khan, 28 Feb 2025, LLM Post-Training: A Deep Dive into Reasoning Large Language Models, https://arxiv.org/abs/2502.21321 https://github.com/mbzuai-oryx/Awesome-LLM-Post-training
  • Guiyao Tie, Zeli Zhao, Dingjie Song, Fuyang Wei, Rong Zhou, Yurou Dai, Wen Yin, Zhejian Yang, Jiangyue Yan, Yao Su, Zhenhan Dai, Yifeng Xie, Yihan Cao, Lichao Sun, Pan Zhou, Lifang He, Hechang Chen, Yu Zhang, Qingsong Wen, Tianming Liu, Neil Zhenqiang Gong, Jiliang Tang, Caiming Xiong, Heng Ji, Philip S. Yu, Jianfeng Gao, 8 Mar 2025, A Survey on Post-training of Large Language Models, https://arxiv.org/abs/2503.06072
  • Qiguang Chen, Libo Qin, Jinhao Liu, Dengyun Peng, Jiannan Guan, Peng Wang, Mengkang Hu, Yuhang Zhou, Te Gao, Wanxiang Che, 13 Mar 2025 (v2), Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models, https://arxiv.org/abs/2503.09567 (Massive and broad survey of all types of reasoning.)
  • Yaoting Wang, Shengqiong Wu, Yuecheng Zhang, William Wang, Ziwei Liu, Jiebo Luo, Hao Fei, 16 Mar 2025, Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey, https://arxiv.org/abs/2503.12605
  • Dibyanayan Bandyopadhyay, Soham Bhattacharjee, Asif Ekbal, 13 Mar 2025, Thinking Machines: A Survey of LLM based Reasoning Strategies, https://arxiv.org/abs/2503.10814

Reasoning Theory

Papers about the deeper theory of what "reasoning" means:

  • Eghbal Hosseini, Colton Casto, Noga Zaslavsky, Colin Conwell, Mark Richardson, Evelina Fedorenko, Dec 2024, Universality of representation in biological and artificial neural networks, bioRxiv 2024.12.26.629294; doi: https://doi.org/10.1101/2024.12.26.629294 https://www.biorxiv.org/content/10.1101/2024.12.26.629294
  • Kuang-Huei Lee, Ian Fischer, Yueh-Hua Wu, Dave Marwood, Shumeet Baluja, Dale Schuurmans, Xinyun Chen, 17 Jan 2025, Evolving Deeper LLM Thinking, https://arxiv.org/abs/2501.09891 (An alternative search strategy broad/deep, compared to CoT and reflection.)
  • G Bao, H Zhang, C Wang, L Yang, Y Zhang, Jan 2025, How Likely Do LLMs with CoT Mimic Human Reasoning? Proceedings of the 31st International Conference on Computational Linguistics, pages 7831–7850, January 19–24, 2025, https://aclanthology.org/2025.coling-main.524.pdf
  • Santosh Kumar Radha, Oktay Goktas, 23 Jan 2025, On the Reasoning Capacity of AI Models and How to Quantify It, https://arxiv.org/abs/2501.13833
  • Alireza Amiri, Xinting Huang, Mark Rofin, Michael Hahn, 4 Feb 2025, Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers, https://arxiv.org/abs/2502.02393
  • Ahmed El-Kishky, Alexander Wei, Andre Saraiva, Borys Minaev, Daniel Selsam, David Dohan, Francis Song, Hunter Lightman, Ignasi Clavera, Jakub Pachocki, Jerry Tworek, Lorenz Kuhn, Lukasz Kaiser, Mark Chen, Max Schwarzer, Mostafa Rohaninejad, Nat McAleese, o3 contributors, Oleg Mürk, Rhythm Garg, Rui Shu, Szymon Sidor, Vineet Kosaraju, Wenda Zhou, 3 Feb 2025, Competitive Programming with Large Reasoning Models, https://arxiv.org/abs/2502.06807 (OpenAI's paper on o3 that has similar conclusions to what DeepSeek showed about Reinforcement Learning for reasoning models, namely that "scaling general-purpose reinforcement learning" still works.)
  • Xinhao Yao, Ruifeng Ren, Yun Liao, Yong Liu, 7 Feb 2025, Unveiling the Mechanisms of Explicit CoT Training: How Chain-of-Thought Enhances Reasoning Generalization, https://arxiv.org/abs/2502.04667
  • Hanmeng Liu, Zhizhang Fu, Mengru Ding, Ruoxi Ning, Chaoli Zhang, Xiaozhang Liu, Yue Zhang, 13 Feb 2025, Logical Reasoning in Large Language Models: A Survey, https://arxiv.org/abs/2502.09100
  • Kechen Li, Wenqi Zhu, Coralia Cartis, Tianbo Ji, Shiwei Liu, 27 Feb 2025, SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers, https://arxiv.org/abs/2502.20545
  • Yijiong Yu, 16 Jan 2025 (v4), Do LLMs Really Think Step-by-step In Implicit Reasoning? https://arxiv.org/abs/2411.15862 https://github.com/yuyijiong/if_step_by_step_implicit_CoT
  • Marius Jahrens, Thomas Martinetz, 12 Mar 2025, Why LLMs Cannot Think and How to Fix It, https://arxiv.org/abs/2503.09211
  • Pengcheng Wen, Jiaming Ji, Chi-Min Chan, Juntao Dai, Donghai Hong, Yaodong Yang, Sirui Han, Yike Guo, 17 Mar 2025, ThinkPatterns-21k: A Systematic Study on the Impact of Thinking Patterns in LLMs, https://arxiv.org/abs/2503.12918
  • Dibyanayan Bandyopadhyay, Soham Bhattacharjee, Asif Ekbal, 13 Mar 2025, Thinking Machines: A Survey of LLM based Reasoning Strategies, https://arxiv.org/abs/2503.10814

Reasoning Model Evaluation

Papers about testing LLMs (and overall systems) for their reasoning abilities:

  • Aske Plaat, Annie Wong, Suzan Verberne, Joost Broekens, Niki van Stein, Thomas Back, 16 Jul 2024, Reasoning with Large Language Models, a Survey, https://arxiv.org/abs/2407.11511
  • Fengli Xu, Qianyue Hao, Zefang Zong, Jingwei Wang, Yunke Zhang, Jingyi Wang, Xiaochong Lan, Jiahui Gong, Tianjian Ouyang, Fanjin Meng, Chenyang Shao, Yuwei Yan, Qinglong Yang, Yiwen Song, Sijian Ren, Xinyuan Hu, Yu Li, Jie Feng, Chen Gao, Yong Li, 17 Jan 2025 (v2), Towards Large Reasoning Models: A Survey on Scaling LLM Reasoning Capabilities, https://arxiv.org/abs/2501.09686
  • Maciej Besta, Julia Barth, Eric Schreiber, Ales Kubicek, Afonso Catarino, Robert Gerstenberger, Piotr Nyczyk, Patrick Iff, Yueling Li, Sam Houliston, Tomasz Sternal, Marcin Copik, Grzegorz Kwaśniewski, Jürgen Müller, Łukasz Flis, Hannes Eberhard, Hubert Niewiadomski, Torsten Hoefler, 23 Jan 2025 (v3), Reasoning Language Models: A Blueprint, https://arxiv.org/abs/2501.11223 (Survey and blueprint for how to build a Large Reasoning Model.)
  • Santosh Kumar Radha, Oktay Goktas, 23 Jan 2025, On the Reasoning Capacity of AI Models and How to Quantify It, https://arxiv.org/abs/2501.13833
  • Ben Dickson, January 31, 2025, Beyond benchmarks: How DeepSeek-R1 and o1 perform on real-world tasks, https://venturebeat.com/ai/beyond-benchmarks-how-deepseek-r1-and-o1-perform-on-real-world-tasks/
  • Guizhen Chen, Weiwen Xu, Hao Zhang, Hou Pong Chan, Chaoqun Liu, Lidong Bing, Deli Zhao, Anh Tuan Luu, Yu Rong, 27 Feb 2025, FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving, https://arxiv.org/abs/2502.20238
  • Avinash Patil, 5 Feb 2025, Advancing Reasoning in Large Language Models: Promising Methods and Approaches, https://arxiv.org/abs/2502.03671
  • Komal Kumar, Tajamul Ashraf, Omkar Thawakar, Rao Muhammad Anwer, Hisham Cholakkal, Mubarak Shah, Ming-Hsuan Yang, Phillip H.S. Torr, Salman Khan, Fahad Shahbaz Khan, 28 Feb 2025, LLM Post-Training: A Deep Dive into Reasoning Large Language Models, https://arxiv.org/abs/2502.21321 https://github.com/mbzuai-oryx/Awesome-LLM-Post-training
  • Qiguang Chen, Libo Qin, Jinhao Liu, Dengyun Peng, Jiannan Guan, Peng Wang, Mengkang Hu, Yuhang Zhou, Te Gao, Wanxiang Che, 13 Mar 2025 (v2), Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models, https://arxiv.org/abs/2503.09567 (Massive and broad survey of all types of reasoning.)

Large Reasoning Models (LRMs)

Large Reasoning Models (LRMs) are large-scale LLMs that have been trained for advanced reasoning capabilities. Their architecture may be training-only, but increasingly these architectures include multi-step inference or "test time compute" reasoning capabilities such as Chain-of-Thought.

Papers on large reasoning models:

  • Ignacio de Gregorio, Dec 2024, Uncovering OpenAI’s Frontier AI Strategy, https://medium.com/@ignacio.de.gregorio.noblejas/uncovering-openais-frontier-ai-strategy-a02e0aa5320e
  • Xiaoxi Li, Guanting Dong, Jiajie Jin, Yuyao Zhang, Yujia Zhou, Yutao Zhu, Peitian Zhang, Zhicheng Dou, 9 Jan 2025, Search-o1: Agentic Search-Enhanced Large Reasoning Models, https://arxiv.org/abs/2501.05366 https://github.com/sunnynexus/Search-o1 (RAG retrieval and agentic methods applied to Large Reasoning Models.)
  • Fengli Xu, Qianyue Hao, Zefang Zong, Jingwei Wang, Yunke Zhang, Jingyi Wang, Xiaochong Lan, Jiahui Gong, Tianjian Ouyang, Fanjin Meng, Chenyang Shao, Yuwei Yan, Qinglong Yang, Yiwen Song, Sijian Ren, Xinyuan Hu, Yu Li, Jie Feng, Chen Gao, Yong Li, 17 Jan 2025 (v2), Towards Large Reasoning Models: A Survey on Scaling LLM Reasoning Capabilities, https://arxiv.org/abs/2501.09686
  • OpenAI, September 12, 2024 Learning to reason with LLMs. We are introducing OpenAI o1, a new large language model trained with reinforcement learning to perform complex reasoning. o1 thinks before it answers—it can produce a long internal chain of thought before responding to the user. https://openai.com/index/learning-to-reason-with-llms/
  • Aske Plaat, Annie Wong, Suzan Verberne, Joost Broekens, Niki van Stein, Thomas Back, 16 Jul 2024, Reasoning with Large Language Models, a Survey, https://arxiv.org/abs/2407.11511
  • Jie Huang and Kevin Chen-Chuan Chang. July 2023. Towards Reasoning in Large Language Models: A Survey. In Findings of the Association for Computational Linguistics: ACL 2023, pages 1049–1065, Toronto, Canada. Association for Computational Linguistics. https://aclanthology.org/2023.findings-acl.67/
  • Seungpil Lee, Woochang Sim, Donghyeon Shin, Wongyu Seo, Jiwon Park, Seokki Lee, Sanha Hwang, Sejin Kim, and Sundong Kim. Jan 2025. Reasoning Abilities of Large Language Models: In-Depth Analysis on the Abstraction and Reasoning Corpus. ACM Trans. Intell. Syst. Technol. https://doi.org/10.1145/3712701 https://dl.acm.org/doi/10.1145/3712701 https://dl.acm.org/doi/pdf/10.1145/3712701
  • Demis Hassabis, Jan 2025, X post: Announcing Gemini 2.0 Flash https://x.com/demishassabis/status/1881844417746632910 (Gemini 2.0 Flash from Google is a Large Reasoning Model with a 1M ultra-long context.)
  • Maciej Besta, Julia Barth, Eric Schreiber, Ales Kubicek, Afonso Catarino, Robert Gerstenberger, Piotr Nyczyk, Patrick Iff, Yueling Li, Sam Houliston, Tomasz Sternal, Marcin Copik, Grzegorz Kwaśniewski, Jürgen Müller, Łukasz Flis, Hannes Eberhard, Hubert Niewiadomski, Torsten Hoefler, 23 Jan 2025 (v3), Reasoning Language Models: A Blueprint, https://arxiv.org/abs/2501.11223 (Survey and blueprint for how to build a Large Reasoning Model.)
  • Alberto Romero, Jan 2025, DeepSeek, a little-known Chinese startup, released R1 yesterday, https://substack.com/@thealgorithmicbridge/note/c-87664591-
  • DeepSeek-AI, Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, Xiaokang Zhang, Xingkai Yu, Yu Wu, Z.F. Wu, Zhibin Gou, Zhihong Shao, Zhuoshu Li, Ziyi Gao, Aixin Liu, Bing Xue, Bingxuan Wang, Bochao Wu, Bei Feng, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, Damai Dai, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fucong Dai, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Han Bao, Hanwei Xu, Haocheng Wang, Honghui Ding, Huajian Xin, Huazuo Gao, Hui Qu, Hui Li, Jianzhong Guo, et al. (100+ additional authors not shown), 22 Jan 2025, DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning, https://arxiv.org/abs/2501.12948 (The DeepSeek R1 large reasoning model.)
  • G Wang, S Zhang, T Zhan, Z Shen, J Li, X Hu, X Sun, Jan 2025, Unlocking the Mysteries of OpenAI o1: A Survey of the Reasoning Abilities of Large Language Models, https://openreview.net/pdf?id=J0ADLa2rNp
  • Ben Dickson, January 31, 2025, Beyond benchmarks: How DeepSeek-R1 and o1 perform on real-world tasks, https://venturebeat.com/ai/beyond-benchmarks-how-deepseek-r1-and-o1-perform-on-real-world-tasks/
  • Deqian Kong, Minglu Zhao, Dehong Xu, Bo Pang, Shu Wang, Edouardo Honig, Zhangzhang Si, Chuan Li, Jianwen Xie, Sirui Xie, Ying Nian Wu, 3 Feb 2025, Scalable Language Models with Posterior Inference of Latent Thought Vectors, https://arxiv.org/abs/2502.01567
  • Ahmed El-Kishky, Alexander Wei, Andre Saraiva, Borys Minaev, Daniel Selsam, David Dohan, Francis Song, Hunter Lightman, Ignasi Clavera, Jakub Pachocki, Jerry Tworek, Lorenz Kuhn, Lukasz Kaiser, Mark Chen, Max Schwarzer, Mostafa Rohaninejad, Nat McAleese, o3 contributors, Oleg Mürk, Rhythm Garg, Rui Shu, Szymon Sidor, Vineet Kosaraju, Wenda Zhou, 3 Feb 2025, Competitive Programming with Large Reasoning Models, https://arxiv.org/abs/2502.06807 (OpenAI's paper on o3 that has similar conclusions to what DeepSeek showed about Reinforcement Learning for reasoning models, namely that "scaling general-purpose reinforcement learning" still works.)
  • DiJia Su, Hanlin Zhu, Yingchen Xu, Jiantao Jiao, Yuandong Tian, Qinqing Zheng, 5 Feb 2025. Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning, https://arxiv.org/abs/2502.03275
  • Cameron R. Wolfe, Feb 18, 2025, Demystifying Reasoning Models: Understanding reasoning models and their relation to standard LLMs... https://cameronrwolfe.substack.com/p/demystifying-reasoning-models
  • Jeremy Kahn, February 28, 2025, OpenAI launches long-awaited GPT-4.5 — but ‘Orion’s’ capabilities already lag competitors, https://fortune.com/2025/02/27/openai-gpt-4-5-orion-launch-sam-altman-benchmarks/
  • Komal Kumar, Tajamul Ashraf, Omkar Thawakar, Rao Muhammad Anwer, Hisham Cholakkal, Mubarak Shah, Ming-Hsuan Yang, Phillip H.S. Torr, Salman Khan, Fahad Shahbaz Khan, 28 Feb 2025, LLM Post-Training: A Deep Dive into Reasoning Large Language Models, https://arxiv.org/abs/2502.21321 https://github.com/mbzuai-oryx/Awesome-LLM-Post-training
  • Asif Razzaq, March 5, 2025, Qwen Releases QwQ-32B: A 32B Reasoning Model that Achieves Significantly Enhanced Performance in Downstream Task, https://www.marktechpost.com/2025/03/05/qwen-releases-qwq-32b-a-32b-reasoning-model-that-achieves-significantly-enhanced-performance-in-downstream-task/ (Features 32B parameters, 32K context length, 64 layers, RoPE, SwiGLU, RMSNorm, and attention enhancements.)

Open Source Reasoning

Open source reasoning projects are those that either (a) use open-source code to implement multi-step inference-based reasoning algorithms such as Chain-of-Thought (on top of any underlying model), or (b) are Large Reasoning Models whose model weights and architectural details have been open-sourced, such as DeepSeek R1.

General Research on Intelligence

What does it mean to be smart? There are various answers to this, and it's a very nuanced question.

Research on intelligence or "smartness" of AI systems:

Chain-of-Thought (CoT) Reasoning

Research papers on chain-of-thought (CoT) for reasoning:

Advanced Chain-of-Thought

Some more research on advanced improvements to multi-step Chain-of-Thought are below. See also CoT efficiency optimizations.

  • Jiaan Wang, Fandong Meng, Yunlong Liang, Jie Zhou, 23 Dec 2024, DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought, https://arxiv.org/abs/2412.17498 https://github.com/krystalan/DRT-o1 (Examines similes and metaphors in literature using long CoT.)
  • Violet Xiang, Charlie Snell, Kanishk Gandhi, Alon Albalak, Anikait Singh, Chase Blagden, Duy Phung, Rafael Rafailov, Nathan Lile, Dakota Mahan, Louis Castricato, Jan-Philipp Franken, Nick Haber, Chelsea Finn, 8 Jan 2025, Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought, https://arxiv.org/abs/2501.04682
  • Haotian Xu, Xing Wu, Weinong Wang, Zhongzhi Li, Da Zheng, Boyuan Chen, Yi Hu, Shijia Kang, Jiaming Ji, Yingying Zhang, Zhijiang Guo, Yaodong Yang, Muhan Zhang, Debing Zhang, 20 Jan 2025, RedStar: Does Scaling Long-CoT Data Unlock Better Slow-Reasoning Systems? https://arxiv.org/abs/2501.11284 https://huggingface.co/RedStar-Reasoning
  • Yiyao Yu, Yuxiang Zhang, Dongdong Zhang, Xiao Liang, Hengyuan Zhang, Xingxing Zhang, Ziyi Yang, Mahmoud Khademi, Hany Awadalla, Junjie Wang, Yujiu Yang, Furu Wei, 19 Jan 2025, Chain-of-Reasoning: Towards Unified Mathematical Reasoning in Large Language Models via a Multi-Paradigm Perspective, https://arxiv.org/abs/2501.11110
  • Yuanheng Fang, Guoqing Chao, Wenqiang Lei, Shaobo Li, Dianhui Chu, 21 Jan 2025, CDW-CoT: Clustered Distance-Weighted Chain-of-Thoughts Reasoning, https://arxiv.org/abs/2501.12226 (CoT with integration of clustering and prompt optimization techniques.)
  • Jishnu Ray Chowdhury, Cornelia Caragea, 21 Jan 2025, Zero-Shot Verification-guided Chain of Thoughts, https://arxiv.org/abs/2501.13122
  • Ziyu Guo, Renrui Zhang, Chengzhuo Tong, Zhizheng Zhao, Peng Gao, Hongsheng Li, Pheng-Ann Heng, 23 Jan 2025, Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step, https://arxiv.org/abs/2501.13926 https://github.com/ZiyuGuo99/Image-Generation-CoT
  • Liang Wang, Haonan Chen, Nan Yang, Xiaolong Huang, Zhicheng Dou, Furu Wei, 24 Jan 2025, Chain-of-Retrieval Augmented Generation, https://arxiv.org/abs/2501.14342 (Combines RAG with multi-step reasoning such as Chain-of-Thought, with a method to control token cost.)
  • Zhenrui Yue, Honglei Zhuang, Aijun Bai, Kai Hui, Rolf Jagerman, Hansi Zeng, Zhen Qin, Dong Wang, Xuanhui Wang, Michael Bendersky, 6 Oct 2024, Inference Scaling for Long-Context Retrieval Augmented Generation, https://arxiv.org/abs/2410.04343 (Combine RAG and multi-step inference, controlling token cost via budgeting allocations.)
  • Jianfeng Pan, Senyou Deng, Shaomang Huang, 4 Feb 2025, CoAT: Chain-of-Associated-Thoughts Framework for Enhancing Large Language Models Reasoning, https://arxiv.org/abs/2502.02390 (Integrating results from an "associative memory" in CoT reasoning paths at inference time.)
  • Chen, H., Zhu, J., Wang, W. et al. Triplet-based contrastive method enhances the reasoning ability of large language models. J Supercomput 81, 555 (2025). https://doi.org/10.1007/s11227-025-07056-6 https://link.springer.com/article/10.1007/s11227-025-07056-6 (Providing prompt examples that contrast correct and incorrect results to improve CoT reasoning.)

Tree-of-Thought (ToT)

Tree-of-thought is a tree-structured variant of multi-step Chain-of-Thought. Other tree-based versions of CoT are also examined below. Note that the "tree" structure also arises in "CoT decoding algorithms," which are single-step CoT-like inference optimizations based on the inherent tree hierarchy of beam search decoding.
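
A minimal sketch of the breadth-limited search underlying tree-of-thought methods: in practice `propose` (generate candidate next thoughts) and `score` (evaluate a partial chain) would each wrap an LLM call, but here they are toy stand-ins so the control flow is visible:

```python
def tree_of_thought(question, propose, score, width=2, depth=2):
    """Breadth-limited search over partial reasoning chains: expand each
    frontier chain with `propose`, keep the top `width` by `score`."""
    frontier = [[question]]
    for _ in range(depth):
        candidates = [chain + [t] for chain in frontier for t in propose(chain)]
        frontier = sorted(candidates, key=score, reverse=True)[:width]
    return frontier[0]  # best chain of thoughts found

# Toy usage: "thoughts" are binary strings; the scorer prefers larger
# binary numbers, so the beam search should end at "111".
propose = lambda chain: [chain[-1] + "0", chain[-1] + "1"]
score = lambda chain: int(chain[-1], 2)
print(tree_of_thought("1", propose, score))  # ['1', '11', '111']
```

With `width=1` this degenerates to greedy chain-of-thought; larger widths trade extra "test time compute" for a better chance of escaping a locally poor reasoning step.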

Research papers on Tree-of-thought include:

Other Tree-Structured CoT Variants

Research papers on other tree-based CoT variants include:

  • Changcheng Li, Xiangyu Wang, Qiuju Chen, Xiren Zhou, Huanhuan Chen, 5 Dec 2024, MTMT: Consolidating Multiple Thinking Modes to Form a Thought Tree for Strengthening LLM, https://arxiv.org/abs/2412.03987
  • Yixin Ji, Juntao Li, Hai Ye, Kaixin Wu, Jia Xu, Linjian Mo, Min Zhang, 5 Jan 2025, Test-time Computing: from System-1 Thinking to System-2 Thinking, https://arxiv.org/abs/2501.02497
  • Fengli Xu, Qianyue Hao, Zefang Zong, Jingwei Wang, Yunke Zhang, Jingyi Wang, Xiaochong Lan, Jiahui Gong, Tianjian Ouyang, Fanjin Meng, Chenyang Shao, Yuwei Yan, Qinglong Yang, Yiwen Song, Sijian Ren, Xinyuan Hu, Yu Li, Jie Feng, Chen Gao, Yong Li, 17 Jan 2025 (v2), Towards Large Reasoning Models: A Survey on Scaling LLM Reasoning Capabilities, https://arxiv.org/abs/2501.09686
  • Tiesunlong Shen, Jin Wang, Xuejie Zhang, Erik Cambria, Jan 2025, Reasoning with Trees: Faithful Question Answering over Knowledge Graph, Proceedings of the 31st International Conference on Computational Linguistics, pages 3138–3157 January 19–24, 2025, Association for Computational Linguistics, https://aclanthology.org/2025.coling-main.211.pdf
  • Maciej Besta, Julia Barth, Eric Schreiber, Ales Kubicek, Afonso Catarino, Robert Gerstenberger, Piotr Nyczyk, Patrick Iff, Yueling Li, Sam Houliston, Tomasz Sternal, Marcin Copik, Grzegorz Kwaśniewski, Jürgen Müller, Łukasz Flis, Hannes Eberhard, Hubert Niewiadomski, Torsten Hoefler, 23 Jan 2025 (v3), Reasoning Language Models: A Blueprint, https://arxiv.org/abs/2501.11223 (Survey and blueprint for how to build a Large Reasoning Model.)
  • Xiaoxue Cheng, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen, 2 Jan 2025, Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking, https://arxiv.org/abs/2501.01306
  • Kun-Peng Ning, Jia-Yu Yao, Yu-Yang Liu, Mu-Nan Ning, Li Yuan, 13 Jan 2025, GPT as a Monte Carlo Language Tree: A Probabilistic Perspective, https://arxiv.org/abs/2501.07641
  • G Wang, S Zhang, T Zhan, Z Shen, J Li, X Hu, X Sun, Jan 2025, Unlocking the Mysteries of OpenAI o1: A Survey of the Reasoning Abilities of Large Language Models, https://openreview.net/pdf?id=J0ADLa2rNp
  • Yang Li, 4 Feb 2025, Policy Guided Tree Search for Enhanced LLM Reasoning, https://arxiv.org/abs/2502.06813
  • Yifu Ding, Wentao Jiang, Shunyu Liu, Yongcheng Jing, Jinyang Guo, Yingjie Wang, Jing Zhang, Zengmao Wang, Ziwei Liu, Bo Du, Xianglong Liu, Dacheng Tao, 27 Feb 2025 (v2), Dynamic Parallel Tree Search for Efficient LLM Reasoning, https://arxiv.org/abs/2502.16235

Graph Reasoning

Graph reasoning is the use of a graph structure, such as a Knowledge Graph, as part of the reasoning algorithm. There is also a variant of Chain-of-Thought called "Graph-of-Thought" (GoT; dragons, anyone?), which further generalizes the tree-based reasoning hierarchies.
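
As a structural sketch (not any specific GoT implementation): the key generalization over trees is that a thought may aggregate several parent thoughts, forming a DAG rather than a hierarchy. Thoughts can then be evaluated in topological order; `combine` would wrap an LLM call in practice, but here it is a placeholder:

```python
from graphlib import TopologicalSorter

def graph_of_thought(nodes, parents, combine):
    """Evaluate thoughts over a DAG. `parents` maps each node to the set
    of thoughts it depends on; `combine(node, parent_values)` produces
    the node's value once all its parents are resolved."""
    graph = {n: parents.get(n, set()) for n in nodes}
    value = {}
    for n in TopologicalSorter(graph).static_order():
        parent_values = [value[p] for p in graph[n]]
        value[n] = combine(n, parent_values)
    return value

# Toy usage: thought "c" merges the results of thoughts "a" and "b",
# which a strict tree structure could not express.
vals = graph_of_thought({"a", "b", "c"}, {"c": {"a", "b"}},
                        lambda n, ps: n + "".join(sorted(ps)))
print(vals["c"])  # cab
```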

Research papers on graph-based reasoning:

  • Cameron R. Wolfe, Jan 3, 2024, Graph-Based Prompting and Reasoning with Language Models. Understanding graph of thoughts prompting and several variants… https://towardsdatascience.com/graph-based-prompting-and-reasoning-with-language-models-d6acbcd6b3d8
  • Jiarui Ji, Runlin Lei, Jialing Bi, Zhewei Wei, Yankai Lin, Xuchen Pan, Yaliang Li, Bolin Ding, 13 Oct 2024, Dynamic and Textual Graph Generation Via Large-Scale LLM-based Agent Simulation, https://arxiv.org/abs/2410.09824
  • Yuwei Hu, Runlin Lei, Xinyi Huang, Zhewei Wei, Yongchao Liu, 7 Oct 2024, Scalable and Accurate Graph Reasoning with LLM-based Multi-Agents, https://arxiv.org/abs/2410.05130
  • Sambhav Khurana, Xiner Li, Shurui Gui, Shuiwang Ji, 29 Oct 2024, A Hierarchical Language Model For Interpretable Graph Reasoning, https://arxiv.org/abs/2410.22372
  • Haoyu Han, Yaochen Xie, Hui Liu, Xianfeng Tang, Sreyashi Nag, William Headden, Hui Liu, Yang Li, Chen Luo, Shuiwang Ji, Qi He, Jiliang Tang, 14 Jan 2025, Reasoning with Graphs: Structuring Implicit Knowledge to Enhance LLMs Reasoning, https://arxiv.org/abs/2501.07845
  • F. Alotaibi, A. Kulkarni and D. Zhou, "Graph of Logic: Enhancing LLM Reasoning with Graphs and Symbolic Logic," 2024 IEEE International Conference on Big Data (BigData), Washington, DC, USA, 2024, pp. 5926-5935, doi: 10.1109/BigData62323.2024.10825450. https://ieeexplore.ieee.org/abstract/document/10825450
  • Maciej Besta, Julia Barth, Eric Schreiber, Ales Kubicek, Afonso Catarino, Robert Gerstenberger, Piotr Nyczyk, Patrick Iff, Yueling Li, Sam Houliston, Tomasz Sternal, Marcin Copik, Grzegorz Kwaśniewski, Jürgen Müller, Łukasz Flis, Hannes Eberhard, Hubert Niewiadomski, Torsten Hoefler, 23 Jan 2025 (v3), Reasoning Language Models: A Blueprint, https://arxiv.org/abs/2501.11223 (Survey and blueprint for how to build a Large Reasoning Model.)
  • Xingtong Yu, Chang Zhou, Zhongwei Kuai, Xinming Zhang, Yuan Fang, 12 Feb 2025, GCoT: Chain-of-Thought Prompt Learning for Graphs, https://arxiv.org/abs/2502.08092
  • Han Zhang, Langshi Zhou, Hanfang Yang, 20 Feb 2025, Learning to Retrieve and Reason on Knowledge Graph through Active Self-Reflection, https://arxiv.org/abs/2502.14932
  • Anastasios Nentidis, Charilaos Akasiadis, Angelos Charalambidis, Alexander Artikis, 26 Feb 2025, Dealing with Inconsistency for Reasoning over Knowledge Graphs: A Survey, https://arxiv.org/abs/2502.19023
  • Avinash Patil, 5 Feb 2025, Advancing Reasoning in Large Language Models: Promising Methods and Approaches, https://arxiv.org/abs/2502.03671
  • Komal Kumar, Tajamul Ashraf, Omkar Thawakar, Rao Muhammad Anwer, Hisham Cholakkal, Mubarak Shah, Ming-Hsuan Yang, Phillip H.S. Torr, Salman Khan, Fahad Shahbaz Khan, 28 Feb 2025, LLM Post-Training: A Deep Dive into Reasoning Large Language Models, https://arxiv.org/abs/2502.21321 https://github.com/mbzuai-oryx/Awesome-LLM-Post-training
  • Wenjie Wu, Yongcheng Jing, Yingjie Wang, Wenbin Hu, Dacheng Tao, 3 Mar 2025, Graph-Augmented Reasoning: Evolving Step-by-Step Knowledge Graph Retrieval for LLM Reasoning, https://arxiv.org/abs/2503.01642

Skeleton-of-Thought

Skeleton-of-thought is a technique with dual aims of smarter reasoning and faster inference. The idea is to first generate an outline as a list of points, and then have the LLM expand each point in parallel. This yields both a more focused answer for each sub-point and faster inference overall, since the shorter expansions can be generated in parallel.
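The outline-then-expand control flow can be sketched as follows, assuming two hypothetical LLM calls: `outline_llm` produces the skeleton and `expand_llm` expands a single point. Both are stubs here so the structure is runnable; a real implementation would issue model API calls in their place.

```python
from concurrent.futures import ThreadPoolExecutor

def outline_llm(question):
    # Stub: a real call would ask the LLM for a short bullet outline.
    return ["definition", "example", "limitations"]

def expand_llm(question, point):
    # Stub: a real call would expand one outline point into a paragraph.
    return f"[{point} of {question}]"

def skeleton_of_thought(question):
    points = outline_llm(question)
    # Expand every point concurrently; each expansion is short, so total
    # latency is roughly one short generation rather than N long ones.
    with ThreadPoolExecutor() as pool:
        sections = list(pool.map(lambda p: expand_llm(question, p), points))
    return "\n".join(sections)

print(skeleton_of_thought("chain-of-thought"))
```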

Research on skeleton-of-thought reasoning includes:

Reflection

Reflection, or self-reflection, is a type of reasoning where the LLM takes an extra step to "reflect" on its own answers. This is a multi-step reasoning method in which the LLM is prompted to critique and improve its own output. There are variants of self-reflection aimed at improving training and others aimed at improving inference.
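The draft-critique-revise loop at the heart of inference-time self-reflection can be sketched as below. The `llm` function is a hypothetical stub standing in for a real model call; the prompts are illustrative only.

```python
def llm(prompt):
    # Stub: echoes a canned response; a real call would query a model.
    return f"response({len(prompt)} chars)"

def reflect(question, rounds=2):
    # Draft an initial answer, then alternate critique and revision.
    answer = llm(f"Question: {question}\nAnswer:")
    for _ in range(rounds):
        critique = llm(f"Critique this answer to '{question}': {answer}")
        answer = llm(f"Revise the answer to '{question}' "
                     f"using this critique: {critique}")
    return answer
```

Each round costs two extra inference calls, which is the usual trade-off of reflection: better answers for more "test time compute."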

Research papers on reflection:

LLM as Judge

LLM as Judge is the method of improving outputs by having one LLM "judge" the correctness of another LLM's output, either to evaluate it or to improve it. When the LLM judges its own output, this is known as "self-reflection." When an LLM judges a group of outputs from other LLMs for the same query, and chooses the best, this is called "Best-of-N."
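A judging step can be sketched as scoring each candidate answer and keeping the highest-scoring one. The `judge_llm` stub below is hypothetical: it scores by answer length purely so the code runs, whereas a real judge would be prompted with a rubric (e.g., "Rate this answer from 1-10 for correctness") and its numeric verdict parsed from the response.

```python
def judge_llm(question, answer):
    # Stub scoring: prefers longer answers. A real judge would be an
    # LLM call returning a quality score for (question, answer).
    return len(answer)

def pick_best(question, candidates):
    # Score every candidate and return the argmax.
    scores = [judge_llm(question, c) for c in candidates]
    return candidates[scores.index(max(scores))]

print(pick_best("What is 2+2?", ["4", "4, because 2+2=4", "five"]))
# → 4, because 2+2=4
```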

Research papers on LLM-as-Judge areas:

  • Cameron R. Wolfe, Ph.D., Dec 02, 2024, Finetuning LLM Judges for Evaluation: The Prometheus suite, JudgeLM, PandaLM, AutoJ, and more..., https://cameronrwolfe.substack.com/p/finetuned-judge
  • Tom Schaul, 25 Nov 2024, Boundless Socratic Learning with Language Games, https://arxiv.org/abs/2411.16905
  • Mingchen Zhuge, Changsheng Zhao, Dylan Ashley, Wenyi Wang, Dmitrii Khizbullin, Yunyang Xiong, Zechun Liu, Ernie Chang, Raghuraman Krishnamoorthi, Yuandong Tian, Yangyang Shi, Vikas Chandra, Jürgen Schmidhuber, 16 Oct 2024 (v2), Agent-as-a-Judge: Evaluate Agents with Agents, https://arxiv.org/abs/2410.10934
  • Haitao Li, Qian Dong, Junjie Chen, Huixue Su, Yujia Zhou, Qingyao Ai, Ziyi Ye, Yiqun Liu, 10 Dec 2024 (v2), LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods, https://arxiv.org/abs/2412.05579 https://github.com/CSHaitao/Awesome-LLMs-as-Judges
  • Xiangjue Dong, Maria Teleki, James Caverlee, 18 Dec 2024, A Survey on LLM Inference-Time Self-Improvement, https://arxiv.org/abs/2412.14352 https://github.com/dongxiangjue/Awesome-LLM-Self-Improvement (Broad survey of reasoning improvement methods from multi-step inference to RALM to decoding algorithms.)
  • Aske Plaat, Annie Wong, Suzan Verberne, Joost Broekens, Niki van Stein, Thomas Back, 16 Jul 2024, Reasoning with Large Language Models, a Survey, https://arxiv.org/abs/2407.11511
  • Yixin Ji, Juntao Li, Hai Ye, Kaixin Wu, Jia Xu, Linjian Mo, Min Zhang, 5 Jan 2025, Test-time Computing: from System-1 Thinking to System-2 Thinking, https://arxiv.org/abs/2501.02497
  • Zhenting Wang, Shuming Hu, Shiyu Zhao, Xiaowen Lin, Felix Juefei-Xu, Zhuowei Li, Ligong Han, Harihar Subramanyam, Li Chen, Jianfa Chen, Nan Jiang, Lingjuan Lyu, Shiqing Ma, Dimitris N. Metaxas, Ankit Jain, 31 Dec 2024, MLLM-as-a-Judge for Image Safety without Human Labeling, https://arxiv.org/abs/2501.00192
  • Zheqi Lv, Wenkai Wang, Jiawei Wang, Shengyu Zhang, Fei Wu, 10 Jan 2025, Cascaded Self-Evaluation Augmented Training for Efficient Multimodal Large Language Models, https://arxiv.org/abs/2501.05662 (Optimize multimodal CoT by breaking down prompts into smaller sub-goals.)
  • Maciej Besta, Julia Barth, Eric Schreiber, Ales Kubicek, Afonso Catarino, Robert Gerstenberger, Piotr Nyczyk, Patrick Iff, Yueling Li, Sam Houliston, Tomasz Sternal, Marcin Copik, Grzegorz Kwaśniewski, Jürgen Müller, Łukasz Flis, Hannes Eberhard, Hubert Niewiadomski, Torsten Hoefler, 23 Jan 2025 (v3), Reasoning Language Models: A Blueprint, https://arxiv.org/abs/2501.11223 (Survey and blueprint for how to build a Large Reasoning Model.)
  • Yafu Li, Zhilin Wang, Tingchen Fu, Ganqu Cui, Sen Yang, Yu Cheng, 21 Jan 2025, From Drafts to Answers: Unlocking LLM Potential via Aggregation Fine-Tuning, https://arxiv.org/abs/2501.11877 (Fine-tune an LLM to accept multiple candidate answers and output a final one.)
  • Swarnadeep Saha, Xian Li, Marjan Ghazvininejad, Jason Weston, Tianlu Wang, 30 Jan 2025, Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge, https://arxiv.org/abs/2501.18099
  • Yubo Wang, Xiang Yue, Wenhu Chen, 30 Jan 2025 (v2), Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate, https://arxiv.org/abs/2501.17703
  • Gregor Bachmann, Sotiris Anagnostidis, Albert Pumarola, Markos Georgopoulos, Artsiom Sanakoyeu, Yuming Du, Edgar Schönfeld, Ali Thabet, Jonas Kohler, 31 Jan 2025, Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment, https://arxiv.org/abs/2501.19309 (Using "LLM as Judge" methods to speed up speculative decoding via higher acceptance rates.)
  • Joshua Ong Jun Leang, Giwon Hong, Wenda Li, Shay B. Cohen, 18 Feb 2025, Theorem Prover as a Judge for Synthetic Data Generation, https://arxiv.org/abs/2502.13137
  • Avinash Patil, 5 Feb 2025, Advancing Reasoning in Large Language Models: Promising Methods and Approaches, https://arxiv.org/abs/2502.03671

System 2

System 2 is the slower reasoning mode of the human brain, which multi-step reasoning algorithms try to emulate. This is the conscious brain and its capability for rational reasoning, usually in a slow and step-by-step fashion, which reasoning algorithms such as Chain-of-Thought aim to copy. By comparison, System 1 is the sensory processing and intuitive type of brain functions, including the "subconscious" brain, which is massively parallel and innate, but also lacking in rationality and explainability, much like a raw neural network.

Research papers on LLMs and System 2 thinking modes:

  • Pranab Sahoo, Ayush Kumar Singh, Sriparna Saha, Vinija Jain, Samrat Mondal, Aman Chadha, 5 Feb 2024, A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications, https://arxiv.org/abs/2402.07927
  • Akash Bajwa, Oct 07, 2024, Inference Time Scaling Laws: AI Megacycle Of System 1 And System 2 Applications, https://akashbajwa.substack.com/p/inference-time-scaling-laws
  • Latent Space, Nov 05, 2024, Inference, Fast and Slow. When System 1/System 2 analogies are not enough: The 6 types of LLM inference https://www.latent.space/p/inference-fast-and-slow
  • Ping Yu, Jing Xu, Jason Weston, Ilia Kulikov, 24 Jul 2024 (v3), Distilling System 2 into System 1, https://arxiv.org/abs/2407.06023
  • DiJia Su, Sainbayar Sukhbaatar, Michael Rabbat, Yuandong Tian, Qinqing Zheng, 13 Oct 2024, Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces, https://arxiv.org/abs/2410.09918
  • Cheng Yang, Chufan Shi, Siheng Li, Bo Shui, Yujiu Yang, Wai Lam, 29 Dec 2024, LLM2: Let Large Language Models Harness System 2 Reasoning, https://arxiv.org/abs/2412.20372
  • Xiaoxue Cheng, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen, 2 Jan 2025, Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking, https://arxiv.org/abs/2501.01306
  • Scott C. Lowe, 29 Oct 2024 (v2), System 2 Reasoning Capabilities Are Nigh, https://arxiv.org/abs/2410.03662
  • Yixin Ji, Juntao Li, Hai Ye, Kaixin Wu, Jia Xu, Linjian Mo, Min Zhang, 5 Jan 2025, Test-time Computing: from System-1 Thinking to System-2 Thinking, https://arxiv.org/abs/2501.02497
  • Violet Xiang, Charlie Snell, Kanishk Gandhi, Alon Albalak, Anikait Singh, Chase Blagden, Duy Phung, Rafael Rafailov, Nathan Lile, Dakota Mahan, Louis Castricato, Jan-Philipp Franken, Nick Haber, Chelsea Finn, 8 Jan 2025, Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought, https://arxiv.org/abs/2501.04682
  • Fengli Xu, Qianyue Hao, Zefang Zong, Jingwei Wang, Yunke Zhang, Jingyi Wang, Xiaochong Lan, Jiahui Gong, Tianjian Ouyang, Fanjin Meng, Chenyang Shao, Yuwei Yan, Qinglong Yang, Yiwen Song, Sijian Ren, Xinyuan Hu, Yu Li, Jie Feng, Chen Gao, Yong Li, 17 Jan 2025 (v2), Towards Large Reasoning Models: A Survey on Scaling LLM Reasoning Capabilities, https://arxiv.org/abs/2501.09686
  • Maciej Besta, Julia Barth, Eric Schreiber, Ales Kubicek, Afonso Catarino, Robert Gerstenberger, Piotr Nyczyk, Patrick Iff, Yueling Li, Sam Houliston, Tomasz Sternal, Marcin Copik, Grzegorz Kwaśniewski, Jürgen Müller, Łukasz Flis, Hannes Eberhard, Hubert Niewiadomski, Torsten Hoefler, 23 Jan 2025 (v3), Reasoning Language Models: A Blueprint, https://arxiv.org/abs/2501.11223 (Survey and blueprint for how to build a Large Reasoning Model.)
  • Bilgehan Sel, Ruoxi Jia, Ming Jin, 23 Jan 2025, LLMs Can Plan Only If We Tell Them, https://arxiv.org/abs/2501.13545
  • Kounianhua Du, Hanjing Wang, Jianxing Liu, Jizheng Chen, Xinyi Dai, Yasheng Wang, Ruiming Tang, Yong Yu, Jun Wang, Weinan Zhang, 18 Feb 2025, Boost, Disentangle, and Customize: A Robust System2-to-System1 Pipeline for Code Generation, https://arxiv.org/abs/2502.12492
  • Alireza S. Ziabari, Nona Ghazizadeh, Zhivar Sourati, Farzan Karimi-Malekabadi, Payam Piray, Morteza Dehghani, 18 Feb 2025, Reasoning on a Spectrum: Aligning LLMs to System 1 and System 2 Thinking, https://arxiv.org/abs/2502.12470
  • Zhong-Zhi Li, Duzhen Zhang, Ming-Liang Zhang, Jiaxin Zhang, Zengyan Liu, Yuxuan Yao, Haotian Xu, Junhao Zheng, Pei-Jie Wang, Xiuyi Chen, Yingying Zhang, Fei Yin, Jiahua Dong, Zhijiang Guo, Le Song, Cheng-Lin Liu, 25 Feb 2025 (v2), From System 1 to System 2: A Survey of Reasoning Large Language Models, https://arxiv.org/abs/2502.17419
  • Pengcheng Wen, Jiaming Ji, Chi-Min Chan, Juntao Dai, Donghai Hong, Yaodong Yang, Sirui Han, Yike Guo, 17 Mar 2025, ThinkPatterns-21k: A Systematic Study on the Impact of Thinking Patterns in LLMs, https://arxiv.org/abs/2503.12918

Best of N Reasoning

Best of N (BoN) is an LLM reasoning method where multiple answers are generated and the best one is chosen. The candidate answers can come from multiple samples of a single LLM, or from several different LLMs in an ensemble inference architecture. Usually, the final step is another LLM inference that performs "LLM as Judge" computations to choose the best answer, but non-LLM ranking algorithms can also be used.
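One non-LLM ranking option is a simple majority vote over the final answers (the idea behind self-consistency). The sketch below assumes a hypothetical `sample_llm` stub that stands in for drawing N samples at a nonzero temperature.

```python
from collections import Counter

def sample_llm(question, n):
    # Stub: a real implementation would draw n samples at temperature > 0.
    return ["42", "42", "41", "42", "40"]

def best_of_n(question, n=5):
    answers = sample_llm(question, n)
    # Majority vote: the most frequent final answer wins, with no
    # second LLM call needed for judging.
    return Counter(answers).most_common(1)[0][0]

print(best_of_n("What is 6*7?"))  # → 42
```

Majority voting only works when answers can be compared for equality (e.g., numeric results); free-form text answers generally need an LLM judge instead.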

Research papers on Best-of-N reasoning:

  • Siwei Wu, Zhongyuan Peng, Xinrun Du, Tuney Zheng, Minghao Liu, Jialong Wu, Jiachen Ma, Yizhi Li, Jian Yang, Wangchunshu Zhou, Qunshu Lin, Junbo Zhao, Zhaoxiang Zhang, Wenhao Huang, Ge Zhang, Chenghua Lin, J.H. Liu, 22 Oct 2024 (v2), A Comparative Study on Reasoning Patterns of OpenAI's o1 Model, https://arxiv.org/abs/2410.13639
  • Hanshi Sun, Momin Haider, Ruiqi Zhang, Huitao Yang, Jiahao Qiu, Ming Yin, Mengdi Wang, Peter Bartlett, Andrea Zanette, 26 Oct 2024, Fast Best-of-N Decoding via Speculative Rejection, https://arxiv.org/abs/2410.20290
  • Do Xuan Long, Duong Ngoc Yen, Anh Tuan Luu, Kenji Kawaguchi, Min-Yen Kan, Nancy F. Chen, 1 Nov 2024, Multi-expert Prompting Improves Reliability, Safety, and Usefulness of Large Language Models, https://arxiv.org/abs/2411.00492
  • Yinlam Chow, Guy Tennenholtz, Izzeddin Gur, Vincent Zhuang, Bo Dai, Sridhar Thiagarajan, Craig Boutilier, Rishabh Agarwal, Aviral Kumar, Aleksandra Faust, 18 Dec 2024, Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models, https://arxiv.org/abs/2412.15287
  • Violet Xiang, Charlie Snell, Kanishk Gandhi, Alon Albalak, Anikait Singh, Chase Blagden, Duy Phung, Rafael Rafailov, Nathan Lile, Dakota Mahan, Louis Castricato, Jan-Philipp Franken, Nick Haber, Chelsea Finn, 8 Jan 2025, Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought, https://arxiv.org/abs/2501.04682
  • Tong Xiao, Jingbo Zhu, 16 Jan 2025, Foundations of Large Language Models, https://arxiv.org/abs/2501.09223 (Huge 230 page paper on many topics such as training, prompting, alignment, and long context.)
  • Kuang-Huei Lee, Ian Fischer, Yueh-Hua Wu, Dave Marwood, Shumeet Baluja, Dale Schuurmans, Xinyun Chen, 17 Jan 2025, Evolving Deeper LLM Thinking, https://arxiv.org/abs/2501.09891 (An alternative search strategy broad/deep, compared to CoT and reflection.)
  • Edward Beeching, Lewis Tunstall, Sasha Rush Dec 16, 2024, Scaling Test Time Compute with Open Source Models, https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute
  • Yafu Li, Zhilin Wang, Tingchen Fu, Ganqu Cui, Sen Yang, Yu Cheng, 21 Jan 2025, From Drafts to Answers: Unlocking LLM Potential via Aggregation Fine-Tuning, https://arxiv.org/abs/2501.11877 (Fine-tune an LLM to accept multiple candidate answers and output a final one.)
  • Weihua Du, Yiming Yang, Sean Welleck, 7 Feb 2025, Optimizing Temperature for Language Models with Multi-Sample Inference, https://arxiv.org/abs/2502.05234 https://github.com/StigLidu/TURN
  • Juntai Cao, Xiang Zhang, Raymond Li, Chuyuan Li, Shafiq Joty, Giuseppe Carenini, 27 Feb 2025, Multi2: Multi-Agent Test-Time Scalable Framework for Multi-Document Processing, https://arxiv.org/abs/2502.20592 (Test time computed applied to the multi-document summarization use case.)
  • Komal Kumar, Tajamul Ashraf, Omkar Thawakar, Rao Muhammad Anwer, Hisham Cholakkal, Mubarak Shah, Ming-Hsuan Yang, Phillip H.S. Torr, Salman Khan, Fahad Shahbaz Khan, 28 Feb 2025, LLM Post-Training: A Deep Dive into Reasoning Large Language Models, https://arxiv.org/abs/2502.21321 https://github.com/mbzuai-oryx/Awesome-LLM-Post-training
  • Chengsong Huang, Langlin Huang, Jixuan Leng, Jiacheng Liu, Jiaxin Huang, 25 Feb 2025, Efficient Test-Time Scaling via Self-Calibration, https://arxiv.org/abs/2503.00031
  • Yiming Wang, Pei Zhang, Siyuan Huang, Baosong Yang, Zhuosheng Zhang, Fei Huang, Rui Wang, 3 Mar 2025, Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding, https://arxiv.org/abs/2503.01422
  • Yiwei Li, Jiayi Shi, Shaoxiong Feng, Peiwen Yuan, Xinglin Wang, Yueqi Zhang, Ji Zhang, Chuyi Tan, Boyuan Pan, Yao Hu, Kan Li, 7 Mar 2025, Speculative Decoding for Multi-Sample Inference, https://arxiv.org/abs/2503.05330 (Optimizing speculative decoding when generating multiple answers for a single query, such as for Best-of-N reasoning.)
  • Eric Zhao, Pranjal Awasthi, Sreenivas Gollapudi, 20 Feb 2025 (v2), Sample, Scrutinize and Scale: Effective Inference-Time Search by Scaling Verification https://arxiv.org/abs/2502.01839 (Wrapping a single model with a Best-of-N approach that self-selects the best answer can significantly improve reasoning rates.)

Program Synthesis

Program synthesis is the reasoning method whereby the LLM can synthesize program code that is then executed to solve a problem. Using a Python interpreter with an LLM is common, but any language can potentially be used, including more abstract mathematical symbolic languages. The virtually unlimited flexibility of programming languages, when combined with LLM pattern-matching power to create code, offers a fertile area for reasoning advancement.
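The basic loop is: ask the LLM for code, run the code, return its result. The sketch below uses a hypothetical `codegen_llm` stub that returns fixed code so the flow is runnable; real systems must sandbox the execution step, since running model-generated code with `exec` is unsafe.

```python
def codegen_llm(task):
    # Stub: a real call would ask the model to "write Python code that ..."
    return "def solve():\n    return sum(i * i for i in range(1, 11))"

def solve_with_code(task):
    code = codegen_llm(task)
    namespace = {}
    exec(code, namespace)   # run the generated program (trusted stub here)
    return namespace["solve"]()

print(solve_with_code("sum of squares of 1..10"))  # → 385
```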

Research papers related to program synthesis and similar symbolic reasoning approaches:

Reasoning Decoding Algorithms

Reasoning decoding algorithms, or Chain-of-Thought decoding algorithms, are methods that perform reasoning within the decoding phase of a single LLM inference, rather than across multiple inference steps. The idea is that alternative decoding pathways, ranked by their logits, can resemble Chain-of-Thought reasoning steps, and these pathways can be explored and combined during inference. This yields an algorithm that reasons better than simpler decoding algorithms, but is more efficient than Chain-of-Thought because it examines multiple pathways within a single inference step.
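A toy version of the branching idea (loosely following the CoT-decoding paper by Wang and Zhou, cited below) is to branch on the top-k first tokens, decode each branch greedily, and keep the path whose continuation has the highest average probability margin between its top two tokens, a proxy for answer confidence. The `next_token_probs` function here is a hand-built mock table, not a real model.

```python
def next_token_probs(prefix):
    # Mock model: maps a prefix tuple to a next-token distribution.
    table = {
        (): {"A": 0.5, "B": 0.3, "C": 0.2},
        ("A",): {"x": 0.55, "y": 0.45},
        ("B",): {"x": 0.95, "y": 0.05},
        ("C",): {"x": 0.6, "y": 0.4},
    }
    return table.get(prefix, {"<eos>": 1.0})

def cot_decode(k=3, steps=1):
    # Branch on the top-k first tokens instead of only the greedy one.
    best = (float("-inf"), None)
    first = sorted(next_token_probs(()).items(), key=lambda kv: -kv[1])[:k]
    for tok, _ in first:
        path, margins = (tok,), []
        for _ in range(steps):
            probs = sorted(next_token_probs(path).values(), reverse=True)
            # Confidence proxy: gap between the top two token probabilities.
            margins.append(probs[0] - (probs[1] if len(probs) > 1 else 0.0))
            top = max(next_token_probs(path), key=next_token_probs(path).get)
            path = path + (top,)
        score = sum(margins) / len(margins)
        if score > best[0]:
            best = (score, path)
    return best[1]

print(cot_decode())  # → ('B', 'x')
```

Note that pure greedy decoding would commit to "A" as the first token, whereas the confidence-ranked branch starting with "B" wins here, which is the effect CoT-decoding exploits.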

Research papers on reasoning-decoding or CoT-decoding:

  • Xuezhi Wang, Denny Zhou, 23 May 2024 (v2), Chain-of-Thought Reasoning Without Prompting, https://arxiv.org/abs/2402.10200 ("CoT decoding" is examining the alternative paths in the decoding algorithm, which is somewhat similar to Chain-of-Thought reasoning.)
  • xjdr-alt, Dec 2024, entropix: Entropy Based Sampling and Parallel CoT Decoding, https://github.com/xjdr-alt/entropix (Parallel decoding attempts to get something similar to CoT.)
  • Hongxuan Zhang, Zhining Liu, Yao Zhao, Jiaqi Zheng, Chenyi Zhuang, Jinjie Gu, Guihai Chen, 4 Jun 2024 (v2), Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster, https://arxiv.org/abs/2311.08263 (Use of Jacobi parallel decoding with Chain-of-Thought.)
  • Renato Vukovic, David Arps, Carel van Niekerk, Benjamin Matthias Ruppik, Hsien-Chin Lin, Michael Heck, Milica Gašić, 5 Aug 2024, Dialogue Ontology Relation Extraction via Constrained Chain-of-Thought Decoding, https://arxiv.org/abs/2408.02361
  • Yuntian Deng, Kiran Prasad, Roland Fernandez, Paul Smolensky, Vishrav Chaudhary, Stuart Shieber, 2 Nov 2023, Implicit Chain of Thought Reasoning via Knowledge Distillation, https://arxiv.org/abs/2311.01460 (Knowledge distillation applied to optimizing the interim computations in Chain-of-Thought.)
  • Yuntian Deng, Yejin Choi, Stuart Shieber, 23 May 2024, From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step, https://arxiv.org/abs/2405.14838
  • Ping Yu, Jing Xu, Jason Weston, Ilia Kulikov, 24 Jul 2024 (v3), Distilling System 2 into System 1, https://arxiv.org/abs/2407.06023
  • Mehul Damani, Idan Shenfeld, Andi Peng, Andreea Bobu, Jacob Andreas, 7 Oct 2024, Learning How Hard to Think: Input-Adaptive Allocation of LM Computation, https://arxiv.org/abs/2410.04707
  • Pranjal Aggarwal, Aman Madaan, Yiming Yang, Mausam, 16 Nov 2023 (v2), Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning and Coding with LLMs, EMNLP 2023, https://arxiv.org/abs/2305.11860 https://www.sample-step-by-step.info/
  • Shibo Hao, Sainbayar Sukhbaatar, DiJia Su, Xian Li, Zhiting Hu, Jason Weston, Yuandong Tian, 9 Dec 2024, Training Large Language Models to Reason in a Continuous Latent Space, https://arxiv.org/abs/2412.06769 (Performing reasoning in a model trained to operate in the embedding vector space, rather than more directly in the token space.)
  • Luyang Liu, Jonas Pfeiffer, Jiaxing Wu, Jun Xie, Arthur Szlam, 23 Dec 2024, Deliberation in Latent Space via Differentiable Cache Augmentation, https://arxiv.org/abs/2412.17747 (Augmenting the KV cache with reasoning information so that decoding will mimic multi-step reasoning with fewer tokens required for intermediate steps.)
  • Sachin Goyal, Ziwei Ji, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar, Vaishnavh Nagarajan, 21 Apr 2024 (v3), Think before you speak: Training Language Models With Pause Tokens, https://arxiv.org/abs/2310.02226 (Inserting extra "pause tokens" that trigger the LLM to perform extra reasoning during the decoding phase.)
  • Yuval Shalev, Amir Feder, Ariel Goldstein, 19 Jun 2024, Distributional reasoning in LLMs: Parallel reasoning processes in multi-hop reasoning, https://arxiv.org/abs/2406.13858 (Using embeddings from intermediate model layers in decoding to mimic reasoning pathways.)
  • Eden Biran, Daniela Gottesman, Sohee Yang, Mor Geva, Amir Globerson, 14 Oct 2024 (v2), Hopping Too Late: Exploring the Limitations of Large Language Models on Multi-Hop Queries, https://arxiv.org/abs/2406.12775 (Backpatching prior layers using embeddings from the current activations to mimic multi-step reasoning.)
  • Jacob Pfau, William Merrill, Samuel R. Bowman, 24 Apr 2024, Let's Think Dot by Dot: Hidden Computation in Transformer Language Models, https://arxiv.org/abs/2404.15758 (Use of dummy "filler tokens" similar to "pause tokens" or "reasoning tokens" to aid multi-step reasoning in decoding.)
  • Eric Zelikman, Georges Harik, Yijia Shao, Varuna Jayasiri, Nick Haber, Noah D. Goodman, 18 Mar 2024 (v2), Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking, https://arxiv.org/abs/2403.09629 (Introduces answers between a start-of-thought and end-of-thought meta-token for reasoning.)
  • Haoran Wang, Kai Shu, Jan 2025, Make Every Token Count: A Systematic Survey on Decoding Methods for Foundation Models, https://www.researchgate.net/profile/Haoran-Wang-96/publication/387703971_Make_Every_Token_Count_A_Systematic_Survey_on_Decoding_Methods_for_Foundation_Models/links/67784c8ce74ca64e1f49eb15/Make-Every-Token-Count-A-Systematic-Survey-on-Decoding-Methods-for-Foundation-Models.pdf https://github.com/wang2226/Awesome-LLM-Decoding
  • Phuc Phan, Hieu Tran, Long Phan, 23 Aug 2024 (v2), Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation, https://arxiv.org/abs/2402.14874
  • Maxime Peyrard, Martin Josifoski, Robert West, 21 Mar 2024, The Era of Semantic Decoding, https://arxiv.org/abs/2403.14562
  • Fengli Xu, Qianyue Hao, Zefang Zong, Jingwei Wang, Yunke Zhang, Jingyi Wang, Xiaochong Lan, Jiahui Gong, Tianjian Ouyang, Fanjin Meng, Chenyang Shao, Yuwei Yan, Qinglong Yang, Yiwen Song, Sijian Ren, Xinyuan Hu, Yu Li, Jie Feng, Chen Gao, Yong Li, 17 Jan 2025 (v2), Towards Large Reasoning Models: A Survey on Scaling LLM Reasoning Capabilities, https://arxiv.org/abs/2501.09686
  • Xiangjue Dong, Maria Teleki, James Caverlee, 18 Dec 2024, A Survey on LLM Inference-Time Self-Improvement, https://arxiv.org/abs/2412.14352 https://github.com/dongxiangjue/Awesome-LLM-Self-Improvement
  • Jonas Geiping, Sean McLeish, Neel Jain, John Kirchenbauer, Siddharth Singh, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Tom Goldstein, 7 Feb 2025, Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach, https://arxiv.org/abs/2502.05171

Planning (as part of Reasoning)

Knowing how to make a plan is part of intelligence, and planning is a key component of multi-step LLM reasoning. Here are some papers specifically on the "planning" aspect of reasoning:

LLM Long Term Memory

LLM Long Term Memory refers to having the LLM "remember" things that it has learned during inference. By default, an LLM is "stateless" and does not recall facts between queries. Short-term memory can be provided by tracking conversational history as "context" for a query, but long-term memory aims to have the LLM "learn" or "memorize" new facts persistently. Note that this research area concerns the accuracy of outputs, not the memory-usage optimizations that speed up LLM inference.
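A minimal way to bolt long-term memory onto a stateless LLM is a store of facts that persists between queries, with retrieval injecting relevant facts into the prompt. The sketch below is a hypothetical illustration using simple word-overlap retrieval; real systems typically use vector embeddings, and the final LLM call is omitted (the function returns the augmented prompt instead).

```python
class MemoryStore:
    """Persistent fact store shared across otherwise stateless queries."""

    def __init__(self):
        self.facts = []

    def remember(self, fact):
        self.facts.append(fact)

    def retrieve(self, query, top_k=2):
        # Rank stored facts by word overlap with the query (a crude
        # stand-in for embedding-based similarity search).
        words = set(query.lower().split())
        ranked = sorted(self.facts,
                        key=lambda f: len(words & set(f.lower().split())),
                        reverse=True)
        return ranked[:top_k]

def ask(memory, question):
    context = "\n".join(memory.retrieve(question))
    prompt = f"Context:\n{context}\nQuestion: {question}"
    return prompt  # a real system would pass this prompt to the LLM

memory = MemoryStore()
memory.remember("The user's favorite color is green.")
memory.remember("The project deadline is Friday.")
print(ask(memory, "What is the user's favorite color?"))
```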

Research on LLM long term memory:

Agentic Workflow

Agentic workflow has some aspects of reasoning (e.g., planning, multi-step execution) combined with agent technologies. Papers on agentic workflow include:

Temporal Reasoning (Time-Based Logic)

AI models struggle with the concept of time and with any sort of "temporal reasoning" based on time progression or causation over time.

AGI Research

General research on achieving Artificial General Intelligence (AGI):

General Research on Reasoning Techniques

See the long list of AI reasoning research papers.

Reasoning and CoT Efficiency Topics

Blog articles on reasoning efficiency:

More research information on general efficiency optimization techniques for reasoning models:

Efficiency optimizations to Chain-of-Thought include:

More AI Research

Read more about: