Aussie AI

Incremental AI Algorithms

  • Last Updated 17 September, 2024
  • by David Spuler, Ph.D.

Incremental algorithms are a general class of code optimizations in which one large computation is replaced by a sequence of smaller, repeated computations. This is the opposite of "batch processing," where a big chunk of processing is done all at once.

A simple example is summing a list of numbers input by a user. You can either collect all the numbers first and then scan through them, adding them up in one pass (non-incrementally), or you can use the incremental approach, where you keep a running sum and add each new number to that sum as it is received.
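The two approaches to summation can be sketched as follows (the function and class names here are illustrative, not from any particular library):

```python
# Batch vs. incremental summation.

def batch_sum(numbers):
    # Non-incremental: collect everything first, then sum in one pass.
    total = 0
    for n in numbers:
        total += n
    return total

class RunningSum:
    # Incremental: maintain a running total, updated as each number arrives.
    def __init__(self):
        self.total = 0

    def add(self, n):
        self.total += n   # small update per new input
        return self.total

rs = RunningSum()
for n in [3, 1, 4, 1, 5]:
    rs.add(n)

assert rs.total == batch_sum([3, 1, 4, 1, 5])  # both equal 14
```

Both versions compute the same result; the difference is that the incremental version has an up-to-date answer available after every input, without re-scanning the whole list.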

AI models are, in a sense, one big incremental algorithm. During training, the weights are incrementally updated, one input item at a time. During inference, the tokens are scanned one at a time, the layers incrementally modify the logits (one layer at a time), and the autoregressive decoding phase emits one new output token at a time.
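The autoregressive decoding phase is perhaps the clearest example of this incremental structure. A minimal sketch, where `next_token` is a hypothetical stand-in for a full LLM forward pass:

```python
# Sketch of autoregressive decoding as an incremental loop.
# `next_token` is a toy placeholder for a real model's forward pass:
# here it just "predicts" the length of the sequence so far.

def next_token(tokens):
    return len(tokens)

def decode(prompt_tokens, max_new=5, eos=-1):
    tokens = list(prompt_tokens)
    for _ in range(max_new):
        tok = next_token(tokens)   # one incremental step per output token
        if tok == eos:
            break
        tokens.append(tok)         # the new token feeds the next step
    return tokens

result = decode([101, 102])  # [101, 102, 2, 3, 4, 5, 6]
```

Each iteration depends on the token produced by the previous one, which is exactly why this loop is hard to parallelize across output tokens.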

Incremental learning is a method of training or fine-tuning whereby the model learns incrementally, one piece of data at a time. It is an established machine learning technique with a substantial body of research. However, inference optimization is not a goal of incremental learning.
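A minimal sketch of the incremental-update idea, using plain per-example gradient descent on a one-parameter linear model (the data and learning rate are illustrative assumptions, not from any specific incremental learning paper):

```python
# Incremental (online) learning sketch: update the model weight one
# example at a time, rather than fitting the whole dataset at once.

def incremental_fit(data, lr=0.1, epochs=50):
    w = 0.0
    for _ in range(epochs):
        for x, y in data:               # one training example per update
            pred = w * x
            grad = 2.0 * (pred - y) * x  # gradient of squared error
            w -= lr * grad               # small incremental weight update
    return w

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples from y = 2x
w = incremental_fit(data)                     # converges close to 2.0
```

Because each update uses only one example, new data can be incorporated as it arrives, without re-training from scratch.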

Incremental algorithms are not a mainstay of inference optimization; most AI optimization techniques are batch rather than incremental. One major reason is that batch algorithms are often easier to parallelize, whereas each step of an incremental computation must await the result of the prior step. However, some AI inference optimizations do make use of incremental algorithms:

Incremental Inference

Research on using incremental algorithms for LLM inference:

Incremental Algorithm Research

Research on incremental algorithms:

More AI Research

Read more about: