Aussie AI

Reasoning

  • Last Updated 12 December, 2024
  • by David Spuler, Ph.D.

Reasoning is a key part of intelligence, and much work is ongoing to improve higher-level reasoning of AI models. Examples include solving mathematical problems or performing multi-step planning such as booking a holiday.

There are two main categories of methods to improve reasoning ability:

  • Training methods ("white box reasoning")
  • Multi-step inference methods ("black box reasoning")

White Box Reasoning. The first approach is to train the LLM better, so that it achieves improved results on "reasoning" and "generalization" tasks. Fine-tuning on a more specialized subset of relevant data is one submethod in this area. There has been much improvement here, both in the capabilities of high-end SOTA models and at the other end of the spectrum with Small Language Models (SLMs). See more about training methods.

Black Box Reasoning. The second approach is to treat the LLM as a "black box" and improve results by making more LLM calls. These are called "few-shot," "many-shot," or "multi-step" reasoning methods. Chain-of-thought is the best known of these, having been adopted by OpenAI for the "o1" models released in September 2024. However, multi-step reasoning is a longstanding area of research with much overlap with prompt engineering techniques, and the literature contains numerous methods based on multiple LLM calls:

  • Chain-of-thought (CoT)
  • Self-reflection
  • Skeleton-of-thought
  • Best-of-N (BoN) method
  • Self-consistency decoding
  • Programmatic prompting
  • Tree-of-Thoughts (ToT) prompting
  • Chain-of-Symbols (CoS) prompting
  • Graph-of-Thoughts (GoT)
  • Chain-of-Table prompting
  • Thread-of-Thought (ThoT) prompting
  • System 2 Attention (S2A) prompting
  • Chain-of-Verification (CoVe) prompting
  • ReAct prompting (reason-and-act)
  • Rephrase-and-Respond (RaR) prompting
  • Chain-of-Knowledge (CoK) prompting
  • Contrastive Chain-of-Thought (CCoT) prompting
  • Program of Thoughts (PoT) prompting
  • Structured Chain-of-Thought (SCoT) prompting
  • Chain-of-Code (CoC) prompting
  • Take a Step Back prompting
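To illustrate the flavor of these methods, here is a minimal sketch of self-consistency decoding combined with a chain-of-thought prompt. The `call_llm` function is a hypothetical stand-in for a real model API; it returns canned answers so the sketch runs without a model.

```python
from collections import Counter

def call_llm(prompt: str, seed: int = 0) -> str:
    # Hypothetical stand-in for a real LLM API call; returns canned
    # answers so the sketch is runnable without a model.
    canned = ["42", "42", "41"]
    return canned[seed % len(canned)]

def self_consistency(question: str, n: int = 3) -> str:
    # Append a chain-of-thought trigger phrase, sample N answers,
    # and return the majority vote (self-consistency decoding).
    prompt = f"{question}\nLet's think step by step."
    answers = [call_llm(prompt, seed=i) for i in range(n)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("What is 6 * 7?"))  # prints "42"
```

The vote discards the one outlier sample, which is the point of the technique: sampling several reasoning chains and keeping the most common final answer.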

Also related are various other ways to have the LLM give a "better" answer, even if it is not really using improved reasoning. The simplest ideas include prompt engineering techniques to give the LLM a better query, RAG architectures and Retrieval-Augmented Language Models (RALM) to give the LLM more relevant source data, and dynamic tool usage integrations that extend the LLM's capabilities to answers requiring computation. Also relevant is the research on improving answers by fixing specific LLM limitations such as hallucinations, difficulties with mathematical problem solving, and weakness at language wordplay.
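For example, a basic RAG pipeline simply prepends retrieved text to the prompt. Below is a minimal sketch using a toy keyword-overlap retriever and a hypothetical `call_llm` stub; a real system would use an embedding model with a vector index and a real model API.

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call; reports how many
    # retrieved passages it received so the sketch is testable.
    return f"Answer grounded in {prompt.count('[doc]')} passages."

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Toy keyword-overlap retriever; real RAG uses embeddings + a vector index.
    words = query.lower().split()
    return sorted(corpus, key=lambda d: -sum(w in d.lower() for w in words))[:k]

def rag_answer(query: str, corpus: list[str]) -> str:
    # Prepend the top-k retrieved passages as context for the model.
    context = "\n".join(f"[doc] {p}" for p in retrieve(query, corpus))
    return call_llm(f"Use only this context:\n{context}\nQuestion: {query}")
```

The design point is that the LLM itself is unchanged; only its input is augmented with relevant source data.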

Multi-Step Inference for Reasoning

General Research on Intelligence

What does it mean to be smart? There are various answers to this, and it's a very nuanced question.

Research on intelligence or "smartness" of AI systems:

Chain-of-Thought (CoT) Reasoning

Research papers on chain-of-thought (CoT) for reasoning:

Skeleton-of-Thought

Skeleton-of-thought is a technique with the dual aims of smarter reasoning and faster inference. The idea is to first generate an outline as a list of points, and then have the LLM expand each sub-point in parallel. This yields both a more focused answer to each sub-point and faster inference, because the shorter per-point answers can be generated in parallel.
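A minimal sketch of the two phases, with a hypothetical `call_llm` stub in place of a real model API (the canned replies keep the sketch runnable without a model):

```python
from concurrent.futures import ThreadPoolExecutor

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call.
    if prompt.startswith("Outline"):
        return "1. Flights\n2. Hotels\n3. Activities"
    return f"Details about {prompt}"

def skeleton_of_thought(question: str) -> str:
    # Phase 1: generate the skeleton (a short numbered list of points).
    outline = call_llm(f"Outline the answer to: {question}")
    points = [line.split(". ", 1)[1] for line in outline.splitlines()]
    # Phase 2: expand every point in parallel; each expansion is a
    # short sequence, so the calls finish faster than one long answer.
    prompts = [f"Expand the point '{p}' for: {question}" for p in points]
    with ThreadPoolExecutor() as pool:
        expansions = list(pool.map(call_llm, prompts))
    return "\n".join(expansions)
```

With a real model behind `call_llm`, the latency win comes from decoding several short sequences concurrently instead of one long sequence serially.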

Research on skeleton-of-thought reasoning includes:

Reflection

Reflection, or self-reflection, is a type of reasoning where the LLM takes an extra step to "reflect" on its own answers. This is a multi-step reasoning method in which the LLM is prompted to critique and then improve its own output. There are different variants of self-reflection, used either for training improvement or for inference improvement.
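A minimal sketch of one draft-critique-revise loop at inference time, again with a hypothetical `call_llm` stub standing in for a real model API:

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call; canned replies keep
    # the sketch runnable without a model.
    if prompt.startswith("Revise"):
        return "Revised answer incorporating the critique."
    if prompt.startswith("Critique"):
        return "The draft is too vague; add specifics."
    return "Draft answer."

def self_reflect(question: str) -> str:
    # Step 1: produce a first draft.
    draft = call_llm(question)
    # Step 2: ask the model to critique its own draft.
    critique = call_llm(f"Critique this answer to '{question}': {draft}")
    # Step 3: revise the draft using the critique.
    return call_llm(f"Revise the answer below using the critique.\n"
                    f"Answer: {draft}\nCritique: {critique}")
```

Each step is an ordinary LLM call; the "reflection" is purely a prompting pattern, which is why this counts as a black-box method.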

Research papers on reflection:

Multi-Step Methods

Research on "multi-step" methods for reasoning in general includes:

Planning (as part of Reasoning)

Having an LLM know how to make a plan is part of intelligence. Here are some papers specifically on the aspect of "planning" as part of reasoning:

Agentic Workflow

Temporal Reasoning (Time-Based Logic)

AI models struggle with the concept of time and with any sort of "temporal reasoning" based on time progression or causation over time.

General Research on Reasoning Techniques

AGI Research

General research on achieving Artificial General Intelligence (AGI):

More AI Research

Read more about: