Aussie AI
Hybrid Reasoning Models
-
Last Updated 7 March, 2025
-
by David Spuler, Ph.D.
What are Hybrid Reasoning Models?
Hybrid reasoning models are LLMs that combine fast single-step answering with slower multi-step inference-based reasoning, selecting between the two modes depending on the query. For example, a Large Reasoning Model (LRM) may be trained to answer simple queries in a single step, reserving longer reasoning traces for harder ones. Conversely, smaller and less powerful reasoning models may be improved by applying extra multi-step reasoning at inference time, an approach known as "test time compute." Read more about reasoning model techniques.
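As a rough illustration of the routing idea, here is a minimal Python sketch, assuming a generic llm.generate(prompt, max_tokens=...) text-completion interface; the looks_hard heuristic is a crude illustrative placeholder, not a production router or any vendor's API.

```python
from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    mode: str  # "fast" (single-step) or "slow" (multi-step reasoning)

def looks_hard(query: str) -> bool:
    """Crude complexity heuristic (illustrative only): long or
    math/planning-flavored queries get the multi-step path. Real
    systems might use a trained router or let the model decide."""
    triggers = ("prove", "step by step", "how many", "plan", "derive")
    q = query.lower()
    return len(q.split()) > 30 or any(t in q for t in triggers)

def hybrid_answer(query: str, llm) -> Answer:
    if looks_hard(query):
        # Slow path: spend extra inference-time compute on a
        # chain-of-thought reasoning trace before answering.
        prompt = f"Think step by step, then answer.\n\nQuestion: {query}"
        return Answer(llm.generate(prompt, max_tokens=1024), mode="slow")
    # Fast path: direct single-step answer, no reasoning tokens.
    prompt = f"Answer concisely.\n\nQuestion: {query}"
    return Answer(llm.generate(prompt, max_tokens=128), mode="fast")
```

In commercial hybrid models, such as the Anthropic release covered below, the reasoning budget can instead be set by the user or by the model itself; the heuristic router above merely stands in for that decision.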
Research on Hybrid Reasoning Models
Research papers include:
- Maxwell Zeff, February 24, 2025, Anthropic launches a new AI model that ‘thinks’ as long as you want, https://techcrunch.com/2025/02/24/anthropic-launches-a-new-ai-model-that-thinks-as-long-as-you-want/
- Xiaoyu Tian, Liangyu Chen, Na Liu, Yaxuan Liu, Wei Zou, Kaijiang Chen, Ming Cui, 24 Nov 2023 (v4), DUMA: a Dual-Mind Conversational Agent with Fast and Slow Thinking, https://arxiv.org/abs/2310.18075
- Daniele Paliotta, Junxiong Wang, Matteo Pagliardini, Kevin Y. Li, Aviv Bick, J. Zico Kolter, Albert Gu, François Fleuret, Tri Dao, 27 Feb 2025, Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners, https://arxiv.org/abs/2502.20339
- Jianyuan Zhong, Zeju Li, Zhijian Xu, Xiangyu Wen, Qiang Xu, 16 Feb 2025, Dyve: Thinking Fast and Slow for Dynamic Process Verification, https://arxiv.org/abs/2502.11157
- Kangan Qian, Zhikun Ma, Yangfan He, Ziang Luo, Tianyu Shi, Tianze Zhu, Jiayin Li, Jianhui Wang, Ziyu Chen, Xiao He, Yining Shi, Zheng Fu, Xinyu Jiao, Kun Jiang, Diange Yang, Takafumi Matsumaru, 27 Nov 2024, FASIONAD : FAst and Slow FusION Thinking Systems for Human-Like Autonomous Driving with Adaptive Feedback, https://arxiv.org/abs/2411.18013
- DiJia Su, Sainbayar Sukhbaatar, Michael Rabbat, Yuandong Tian, Qinqing Zheng, 13 Oct 2024, Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces, https://arxiv.org/abs/2410.09918
- Konstantina Christakopoulou, Shibl Mourad, Maja Matarić, 10 Oct 2024, Agents Thinking Fast and Slow: A Talker-Reasoner Architecture, https://arxiv.org/abs/2410.08328
- Pengbo Hu, Ji Qi, Xingyu Li, Hong Li, Xinqi Wang, Bing Quan, Ruiyu Wang, Yi Zhou, 21 Aug 2023 (v2), Tree-of-Mixed-Thought: Combining Fast and Slow Thinking for Multi-hop Visual Reasoning, https://arxiv.org/abs/2308.09658
- Thilo Hagendorff, Sarah Fabi, Michal Kosinski, 2 Aug 2023 (v2), Thinking Fast and Slow in Large Language Models, https://arxiv.org/abs/2212.05206
- Wenlin Yao, Haitao Mi, Dong Yu, 25 Sep 2024, HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows, https://arxiv.org/abs/2409.17433
- Kyle Wiggers, March 4, 2025, Amazon is reportedly developing its own AI ‘reasoning’ model: Amazon reportedly wants to get in on the AI “reasoning” model game, https://techcrunch.com/2025/03/04/amazon-is-reportedly-developing-its-own-ai-reasoning-model/
Reasoning and CoT Efficiency Topics
Blog articles and research overviews on general efficiency optimization techniques for reasoning models:
- Reasoning inference optimization (RIO)
- Chain-of-Thought (CoT) optimization
- Small Reasoning Models (SRMs)
- Adaptive Inference Time Compute
- Hybrid Reasoning Models
- Reasoning Tokens
Efficiency optimizations to Chain-of-Thought include the following (see the prompting sketch after this list):
- Hidden Token Chain-of-Thought (HCoT)
- Continuous Chain-of-Thought (Coconut)
- Chain of Draft (CoD)
- CoT Reasoning Decoding
- Concise Chain-of-Thought
- CoT Token Reduction
- CoT Step Skipping
- CoT Early Stopping
- CoT Path Reduction
- Constrained Chain-of-Thought
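As a rough illustration of the "concise" and "constrained" CoT ideas above, here is a minimal prompting sketch in Python; the prompt wording, the step/word budgets, and the answer-marker convention are assumptions for illustration, not any specific paper's method.

```python
def concise_cot_prompt(question: str, max_steps: int = 3, max_words: int = 60) -> str:
    """Build a prompt that caps the length of the reasoning trace,
    trading a little accuracy for far fewer CoT tokens."""
    return (
        f"Reason in at most {max_steps} short steps and at most "
        f"{max_words} words total, then give the final answer on a "
        f"line starting with 'Answer:'.\n\nQuestion: {question}"
    )

def extract_answer(output: str) -> str:
    """Scan for the answer marker; a streaming caller could stop
    generation here (a simple form of CoT early stopping)."""
    for line in output.splitlines():
        if line.startswith("Answer:"):
            return line[len("Answer:"):].strip()
    return output.strip()

# Example usage (with any text-completion interface):
#   prompt = concise_cot_prompt("What is 17 * 24?")
#   answer = extract_answer(llm.generate(prompt))
```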
More AI Research
Read more about: