Star Attention

  • Last Updated 7 December, 2024
  • by David Spuler, Ph.D.

Star Attention is an LLM attention optimization that reduces the cost of attention matrix computations on long token sequences. It is a type of "linear attention" that uses a "block sparsity" approximation to avoid the quadratic complexity of full LLM attention algorithms. This lets the LLM know which blocks of tokens to pay attention to without computing attention scores between every pair of tokens.

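The idea can be illustrated with a small sketch of block-local attention. The code below is not the Star Attention implementation; the block size, the use of the first block as a shared "anchor" block, and the single-head NumPy layout are simplifying assumptions chosen to show how block sparsity keeps each score matrix small, while full attention builds an n-by-n score matrix over the whole sequence.

    # Illustrative sketch of block-sparse (block-local) attention versus
    # full attention. NOT the official Star Attention implementation; the
    # block size, anchor-block scheme, and single-head layout are assumptions.
    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def full_attention(Q, K, V):
        """Standard full attention: an (n, n) score matrix, quadratic in length."""
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)
        return softmax(scores, axis=-1) @ V

    def block_local_attention(Q, K, V, block_size=4):
        """Block-sparse attention: each block attends only to itself plus the
        first ("anchor") block, so cost grows linearly with the number of blocks."""
        n, d = Q.shape
        out = np.zeros_like(V)
        anchor_K, anchor_V = K[:block_size], V[:block_size]
        for start in range(0, n, block_size):
            end = min(start + block_size, n)
            q = Q[start:end]
            if start == 0:
                k, v = anchor_K, anchor_V       # the first block attends only to itself
            else:
                k = np.concatenate([anchor_K, K[start:end]])
                v = np.concatenate([anchor_V, V[start:end]])
            scores = q @ k.T / np.sqrt(d)       # small, fixed-size score matrix per block
            out[start:end] = softmax(scores, axis=-1) @ v
        return out

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        n, d = 16, 8
        Q, K, V = rng.normal(size=(3, n, d))
        approx = block_local_attention(Q, K, V, block_size=4)
        exact = full_attention(Q, K, V)
        # The block-sparse result approximates full attention at a fraction of the cost.
        print("mean abs difference vs full attention:", np.abs(approx - exact).mean())

This sketch covers only blockwise context encoding; the published method also has a later phase in which query tokens attend across all blocks, which is omitted here for brevity.
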
Related LLM research areas for long-context optimization of attention methods include:

Research on Star Attention
