Aussie AI
Small Reasoning Models
-
Last Updated 7 March, 2025
-
by David Spuler, Ph.D.
What are Small Reasoning Models?
Small reasoning models combine reasoning techniques with small language models. Large reasoning models are very expensive to run, and the goal is to reduce that cost by using a smaller model, usually with some loss of accuracy. Small models can be used with two types of reasoning methods: single-step reasoning or multi-step inference-based reasoning.
There are two basic approaches to create a Small Reasoning Model (SRM):
- Start with a Large Reasoning Model (LRM) and reduce its size, or
- Start with a small model and increase its reasoning capabilities.
Cutting down a Large Reasoning Model to a smaller one may involve:
- Model compression (e.g. quantization).
- Distillation focused on reasoning knowledge.
In the case of open-source Large Reasoning Models (e.g. DeepSeek R1), smaller versions have already been released, especially quantized ones.
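As an illustration of the shrink-a-large-model approach, here is a minimal sketch that loads a distilled open-source reasoning model with 4-bit quantization via the Hugging Face transformers and bitsandbytes libraries. The model name and quantization settings are illustrative assumptions, not a specific recommendation.

```python
# Minimal sketch: shrinking an open-source reasoning model via 4-bit quantization.
# Assumes the "transformers" and "bitsandbytes" packages are installed; the model
# name below is an illustrative placeholder (a distilled DeepSeek R1 variant).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"  # example distilled reasoning model

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit weight quantization
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for speed/accuracy
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)

prompt = "Solve step by step: what is 17 * 24?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Quantization alone shrinks memory and cost but does not add reasoning ability; distillation focused on reasoning traces is the complementary step.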
Adding reasoning capabilities to a small model is particularly interesting to the open-source model community. There are many very capable small models of different sizes, but not many are specifically focused on reasoning. Some ways to go about it include:
- Multi-step CoT algorithms wrapped around smaller base models (see the sketch after this list).
- Improved training and fine-tuning of single-step reasoning techniques to enhance a small model.
- A combination of both approaches is also possible.
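As a rough sketch of the first item above, the code below wraps a small base model in a simple multi-step CoT procedure: it samples several chain-of-thought completions and takes a majority vote over the final answers (self-consistency). The model name, prompt format, and answer extraction are illustrative assumptions.

```python
# Minimal sketch of a multi-step CoT wrapper around a small model:
# sample several reasoning paths and vote on the final answer (self-consistency).
from collections import Counter
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-1.5B-Instruct",  # example small base model (illustrative)
)

def cot_self_consistency(question: str, num_paths: int = 5) -> str:
    """Sample multiple chain-of-thought completions and vote on the final answer."""
    prompt = f"Question: {question}\nLet's think step by step.\n"
    answers = []
    for _ in range(num_paths):
        out = generator(
            prompt,
            max_new_tokens=256,
            do_sample=True,        # sampling gives diverse reasoning paths
            temperature=0.8,
            return_full_text=False,
        )[0]["generated_text"]
        # Naive answer extraction: take the last non-empty line as the candidate answer.
        lines = out.strip().splitlines()
        answers.append(lines[-1] if lines else "")
    # Majority vote across the sampled reasoning paths.
    return Counter(answers).most_common(1)[0][0]

print(cot_self_consistency("If a train travels 60 km in 45 minutes, what is its speed in km/h?"))
```

The trade-off is extra inference cost per query (multiple sampled paths), which is why the efficiency topics listed below matter for small reasoning models too.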
Research on Small Reasoning Models
Research papers include:
- Matthias Bastian, October 6, 2024, Study reveals major reasoning flaws in smaller AI language models, https://the-decoder.com/study-reveals-major-reasoning-flaws-in-smaller-ai-language-models/
- Shuyang Jiang, Yusheng Liao, Zhe Chen, Ya Zhang, Yanfeng Wang, Yu Wang, 21 Jan 2025, MedS3: Towards Medical Small Language Models with Self-Evolved Slow Thinking, https://arxiv.org/abs/2501.12051 https://github.com/pixas/medsss
- Maxwell Zeff, February 5, 2025, Researchers created an open rival to OpenAI’s o1 ‘reasoning’ model for under $50, https://techcrunch.com/2025/02/05/researchers-created-an-open-rival-to-openais-o1-reasoning-model-for-under-50/
- Kyle Wiggers, January 11, 2025, Researchers open source Sky-T1, a ‘reasoning’ AI model that can be trained for less than $450, https://techcrunch.com/2025/01/11/researchers-open-source-sky-t1-a-reasoning-ai-model-that-can-be-trained-for-less-than-450/
- Ben Dickson, February 20, 2025, How test-time scaling unlocks hidden reasoning abilities in small language models (and allows them to outperform LLMs), https://venturebeat.com/ai/how-test-time-scaling-unlocks-hidden-reasoning-abilities-in-small-language-models-and-allows-them-to-outperform-llms/
- Asif Razzaq, March 5, 2025, Qwen Releases QwQ-32B: A 32B Reasoning Model that Achieves Significantly Enhanced Performance in Downstream Task, https://www.marktechpost.com/2025/03/05/qwen-releases-qwq-32b-a-32b-reasoning-model-that-achieves-significantly-enhanced-performance-in-downstream-task/ (Features 32B parameters, 32K context length, 64 layers, RoPE, SwiGLU, RMSNorm, and attention enhancements.)
- Carl Franzen, March 5, 2025, New open-source math model Light-R1-32B surpasses equivalent DeepSeek performance with only $1000 in training costs, https://venturebeat.com/ai/new-open-source-math-model-light-r1-32b-surpasses-equivalent-deepseek-performance-with-only-1000-in-training-costs/
Reasoning and CoT Efficiency Topics
Blog articles on reasoning efficiency:
More research information on general efficiency optimization techniques for reasoning models:
- Reasoning inference optimization (RIO)
- Chain-of-Thought (CoT) optimization
- Small Reasoning Models (SRMs)
- Adaptive Inference Time Compute
- Hybrid Reasoning Models
- Reasoning Tokens
Efficiency optimizations to Chain-of-Thought include:
- Hidden Token Chain-of-Thought (HCoT)
- Continuous Chain-of-Thought (Coconut)
- Chain of Draft (CoD)
- CoT Reasoning Decoding
- Concise Chain-of-Thought
- CoT Token Reduction
- CoT Step Skipping
- CoT Early Stopping
- CoT Path Reduction
- Constrained Chain-of-Thought
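To give a concrete flavor of the concise and draft-style techniques in this list, the sketch below contrasts a standard chain-of-thought prompt with a draft-style prompt that limits the length of each reasoning step; the exact instruction wording is an illustrative assumption, not the prompt from any specific paper.

```python
# Minimal sketch of concise / draft-style CoT prompting: the only change from
# standard CoT is an instruction to keep each reasoning step very short, which
# reduces the number of reasoning tokens generated. Prompt wording is illustrative.

STANDARD_COT = (
    "Question: {question}\n"
    "Think step by step, showing all of your reasoning, then give the final answer.\n"
)

CONCISE_COT = (
    "Question: {question}\n"
    "Think step by step, but keep each step to a short draft of at most five words, "
    "then give only the final answer on the last line.\n"
)

def build_prompt(question: str, concise: bool = True) -> str:
    """Return a CoT prompt; the concise variant targets fewer output tokens."""
    template = CONCISE_COT if concise else STANDARD_COT
    return template.format(question=question)

if __name__ == "__main__":
    q = "A shirt costs $25 after a 20% discount. What was the original price?"
    print(build_prompt(q, concise=True))
```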
More AI Research
Read more about: