Aussie AI
Gist Tokens
-
Last Updated 27 February, 2025
-
by David Spuler, Ph.D.
Gist tokens are an LLM optimization method that uses special extra tokens to represent the meaning, or "gist," of the input. Depending on how they are used, gist tokens can improve the accuracy of results via semantic focusing. Efficiency may be reduced slightly when the gist tokens are simply added to the LLM's input sequence, but it can also improve when gist tokens replace multiple other input tokens and thereby reduce the total token count. The use of extra tokens is similar to token merging, and there is also much overlap with token pruning methods based on "salient tokens" (i.e., important tokens).
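The core trick in most gist-token schemes (e.g., Mu et al., 2023, listed below) is an attention mask: the gist tokens may attend to the full prompt, but all later tokens may attend only to the gist tokens, which forces the prompt's meaning to be compressed into them. Below is a minimal PyTorch sketch of that masking idea; the function name, sequence layout, and shapes are illustrative assumptions, not code from any of the papers listed on this page.

# Minimal sketch of gist-token attention masking, loosely following the idea in
# Mu et al. (2023), "Learning to Compress Prompts with Gist Tokens".
# All names and shapes here are illustrative assumptions, not the paper's actual code.

import torch

def gist_attention_mask(n_prompt: int, n_gist: int, n_input: int) -> torch.Tensor:
    """Build a causal attention mask for a sequence laid out as
    [prompt tokens][gist tokens][input/output tokens].

    Tokens after the gist block may NOT attend back to the raw prompt tokens;
    they can only attend to the gist tokens (and to earlier post-gist tokens,
    causally). This forces the gist tokens to absorb ("compress") the prompt.
    Returns a boolean mask of shape (seq_len, seq_len); True = attention allowed.
    """
    seq_len = n_prompt + n_gist + n_input
    # Start from an ordinary causal (lower-triangular) mask.
    mask = torch.tril(torch.ones(seq_len, seq_len)).bool()
    # Block attention from post-gist positions back to the raw prompt tokens.
    gist_end = n_prompt + n_gist
    mask[gist_end:, :n_prompt] = False
    return mask

if __name__ == "__main__":
    # Example: 5 prompt tokens compressed into 2 gist tokens, then 3 input tokens.
    m = gist_attention_mask(n_prompt=5, n_gist=2, n_input=3)
    print(m.int())
    # After training with this mask, the prompt tokens (and their KV cache entries)
    # can be discarded at inference time, keeping only the gist tokens.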
Related research topics include:
- Token merging
- Token pruning (input token pruning)
- Dynamic token pruning
- KV cache token pruning
- Prompt compression
- Context compression
- Token skipping
- Token dropping
- Length pruning
Research on Gist Tokens
- Jesse Mu, Xiang Lisa Li, Noah Goodman, 2023, Learning to Compress Prompts with Gist Tokens, https://arxiv.org/abs/2304.08467
- Chenlong Deng, Zhisong Zhang, Kelong Mao, Shuaiyi Li, Xinting Huang, Dong Yu, Zhicheng Dou, 23 Dec 2024, A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression, https://arxiv.org/abs/2412.17483
- Xinze Li, Zhenghao Liu, Chenyan Xiong, Shi Yu, Yukun Yan, Shuo Wang, Ge Yu, 25 Feb 2024, Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression, https://arxiv.org/abs/2402.16058 https://github.com/OpenMatch/Gist-COCO
- Yichen Jiang, Marco Vecchio, Mohit Bansal, Anders Johannsen, 2024, Hierarchical and Dynamic Prompt Compression for Efficient Zero-shot API Usage, Findings of the Association for Computational Linguistics: EACL 2024, pages 2162–2174, https://aclanthology.org/2024.findings-eacl.143/ https://aclanthology.org/2024.findings-eacl.143.pdf
- Peitian Zhang, Zheng Liu, Shitao Xiao, Ninglu Shao, Qiwei Ye, Zhicheng Dou, 11 Oct 2024 (v2), Compressing Lengthy Context With UltraGist, https://arxiv.org/abs/2405.16635 https://github.com/namespace-Pt/UltraGist
- Kuang-Huei Lee, Xinyun Chen, Hiroki Furuta, John Canny, Ian Fischer, 22 Jul 2024 (v3), A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts, https://arxiv.org/abs/2402.09727
- Jintian Zhang, Yuqi Zhu, Mengshu Sun, Yujie Luo, Shuofei Qiao, Lun Du, Da Zheng, Huajun Chen, Ningyu Zhang, 21 Feb 2025, LightThinker: Thinking Step-by-Step Compression, https://arxiv.org/abs/2502.15589 https://github.com/zjunlp/LightThinker (Faster CoT by compressing the text of intermediate reasoning steps with gist tokens.)