Aussie AI

Greedy Decoding

  • Last Updated 3 September, 2024
  • by David Spuler, Ph.D.

What is Greedy Decoding?

Greedy decoding is the simplest decoding method: at each step, the decoder picks the output token with the highest predicted probability. This algorithm is very efficient, but it is regarded as unreliable in terms of accuracy and the quality of the output text. More reliable alternatives include top-k decoding, top-p decoding, and beam search decoding.
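
For example, here is a minimal Python sketch of the greedy selection step (the function name and the toy probability vector are illustrative, not from any particular library):

    import numpy as np

    def greedy_select(probs: np.ndarray) -> int:
        """Greedy decoding step: pick the token id with the highest probability."""
        return int(np.argmax(probs))

    # Toy example: a vocabulary of only 4 tokens.
    probs = np.array([0.1, 0.6, 0.2, 0.1])
    print(greedy_select(probs))   # prints 1, the index of the largest probability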

Problems with Greedy Decoding

One problem is that greedy decoding is deterministic: the AI engine will always output the same token given the same inputs. This causes an unimpressive lack of creativity and variety, and in the worst cases it produces loops of repetitive phrasing and poorly flowing natural language text. This repetition is called "neural text degeneration."

Another problem is that greedy decoding doesn't look ahead by even one word, so it can miss a two-word phrase or longer sequence that would have been more accurate overall. Greedy decoding is also an autoregressive decoding method, because each single output token is appended to the inputs for the next step of decoding.
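
A rough sketch of this autoregressive greedy loop in Python is shown below, assuming a hypothetical "model" callable that maps a sequence of token ids to a next-token logits vector (the names and the toy model are illustrative only, not from any particular library):

    import numpy as np

    def greedy_decode(model, prompt_ids, max_new_tokens, eos_id):
        """Autoregressive greedy decoding: append the argmax token at each step."""
        ids = list(prompt_ids)
        for _ in range(max_new_tokens):
            logits = model(ids)            # next-token logits for the current sequence
            next_id = int(np.argmax(logits))
            ids.append(next_id)            # feed the chosen token back in (autoregressive)
            if next_id == eos_id:
                break
        return ids

    # Toy usage: a fake "model" that always prefers token 2 (token 3 is end-of-sequence).
    fake_model = lambda ids: np.array([0.1, 0.2, 0.9, 0.05])
    print(greedy_decode(fake_model, [0], max_new_tokens=5, eos_id=3))
    # -> [0, 2, 2, 2, 2, 2], which also illustrates the repetition problem above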

Optimizing Greedy Decoding

Speed is a major advantage of greedy decoding. The algorithm to choose a token is simply a "max" function over the vector of logits. It can be further optimized by noticing that there is no longer any need to use Softmax to convert the logits from log-scale to full probabilities. Because the exponential function is monotonically increasing, the token with the highest logit value also has the highest probability. Hence, the incoming logit vector can simply be scanned for its maximum, skipping the entire Softmax calculation.
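
An illustrative Python sketch of this shortcut, using made-up numbers:

    import numpy as np

    def softmax(x):
        e = np.exp(x - np.max(x))    # standard numerically stable Softmax
        return e / e.sum()

    logits = np.array([2.0, 5.0, 1.0, 3.5])

    # Full path: convert logits to probabilities, then take the maximum.
    slow = int(np.argmax(softmax(logits)))

    # Optimized path: argmax directly on the logits, skipping Softmax entirely.
    fast = int(np.argmax(logits))

    assert slow == fast    # exp() is monotonic, so the argmax is unchanged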

Research on Greedy Decoding

Articles and papers on greedy decoding:

Neural Text Degeneration

The problem of repetitious and looping output text is called "neural text degeneration" in research papers. It occurs primarily with deterministic decoding algorithms such as greedy decoding, and is largely resolved by stochastic methods such as top-k decoding.
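
For contrast, here is a minimal illustrative sketch of top-k sampling in Python (the names and toy values are assumptions for illustration, not any particular library's API): keep only the k highest-scoring tokens, renormalize, and sample randomly among them.

    import numpy as np

    def top_k_sample(logits: np.ndarray, k: int, rng: np.random.Generator) -> int:
        """Top-k sampling: keep the k highest-scoring tokens, renormalize, sample."""
        top_ids = np.argsort(logits)[-k:]             # indices of the k largest logits
        top_logits = logits[top_ids]
        probs = np.exp(top_logits - top_logits.max())
        probs /= probs.sum()                          # Softmax over the top-k candidates only
        return int(rng.choice(top_ids, p=probs))

    rng = np.random.default_rng(0)
    logits = np.array([2.0, 5.0, 1.0, 3.5])
    print(top_k_sample(logits, k=2, rng=rng))   # randomly returns 1 or 3, never 0 or 2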

Research papers on neural text degeneration include:

  • Zihao Fu, Wai Lam, Anthony Man-Cho So, and Bei Shi. 2021. A theoretical analysis of the repetition problem in text generation. In Thirty-Fifth AAAI Conference on Artificial Intelligence. https://arxiv.org/abs/2012.14660
  • Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, and Yejin Choi. 2019. The curious case of neural text degeneration. International Conference on Learning Representations. https://arxiv.org/abs/1904.09751
  • Krishna Pillutla, Swabha Swayamdipta, Rowan Zellers, John Thickstun, Sean Welleck, Yejin Choi, and Zaid Harchaoui. 2021. MAUVE: Measuring the gap between neural text and human text using divergence frontiers. Advances in Neural Information Processing Systems. https://proceedings.neurips.cc/paper/2021/file/260c2432a0eecc28ce03c10dadc078a4-Paper.pdf
