Aussie AI

Salient Tokens

  • Last Updated 24 February, 2025
  • by David Spuler, Ph.D.

Salient token optimization is an LLM inference strategy that concentrates computation on the important, or "salient", tokens. Unimportant (non-salient) tokens are pruned, which reduces the overall computation. This pruning can be applied to the tokens processed during regular inference, to the entries in the KV cache, or to both.
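As a rough illustration of the idea, the sketch below keeps only the highest-scoring tokens and applies the same selection to a stand-in KV cache. The function names (`select_salient`, `prune`), the salience scores, and the cache layout are all assumptions made for this example. In practice the scores might be derived from attention weights or another importance measure, and the cache would hold tensors rather than strings.

```python
from typing import List, Sequence


def select_salient(scores: Sequence[float], keep: int) -> List[int]:
    """Return the indices of the `keep` highest-scoring tokens, in original order."""
    if keep <= 0:
        return []
    # Rank positions by score, highest first (ties keep their original order).
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Restore sequence order so the surviving tokens stay in position order.
    return sorted(ranked[:keep])


def prune(items: Sequence, kept: Sequence[int]) -> list:
    """Keep only the entries at the selected positions."""
    return [items[i] for i in kept]


tokens = ["The", "cat", "sat", "on", "the", "mat"]
# Hypothetical per-token salience scores (illustrative values only).
scores = [0.05, 0.40, 0.25, 0.03, 0.02, 0.25]
# Stand-in for per-token (key, value) cache entries.
kv_cache = [(f"k{i}", f"v{i}") for i in range(len(tokens))]

kept = select_salient(scores, keep=3)
pruned_tokens = prune(tokens, kept)  # pruning the regular inference tokens
pruned_kv = prune(kv_cache, kept)    # the same selection applied to the KV cache

print(pruned_tokens)
print(pruned_kv)
```

Using one shared index list for both the tokens and the cache keeps the two views consistent. A real system could instead score and prune them independently, since the text above treats the two as separate options.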

Research on Salient Tokens
