Aussie AI
Attention Steering
Last Updated 21 March, 2025
by David Spuler, Ph.D.
What is Attention Steering?
Attention steering is a technique that "steers," or focuses, the LLM's attention mechanism onto a particular subset of the input tokens. The aim is both more accurate and faster attention computation.
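As an illustration, the sketch below shows one post-hoc approach in the spirit of PASTA (Zhang et al., listed below): the attention weights on tokens outside a user-chosen span are scaled down and each row is renormalized, so the model attends more strongly to the emphasized tokens. This is a minimal sketch under assumed tensor shapes; the function name steered_attention and the alpha coefficient are illustrative, not any paper's exact implementation.

import torch
import torch.nn.functional as F

def steered_attention(q, k, v, emphasis_mask, alpha=0.01):
    # q, k, v: [batch, heads, seq_len, head_dim] tensors.
    # emphasis_mask: [batch, seq_len] boolean marking the tokens to emphasize.
    # alpha: assumed downweighting factor for all non-emphasized tokens.
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d**0.5    # [batch, heads, q_len, k_len]
    probs = F.softmax(scores, dim=-1)

    # Scale down attention mass on non-emphasized key positions,
    # then renormalize so each row of weights still sums to 1.
    keep = emphasis_mask[:, None, None, :].to(probs.dtype)
    probs = probs * (keep + alpha * (1.0 - keep))
    probs = probs / probs.sum(dim=-1, keepdim=True)

    return probs @ v                             # [batch, heads, q_len, head_dim]

# Example: emphasize the first 4 tokens of an 8-token sequence.
q = torch.randn(1, 4, 8, 16)
k = torch.randn(1, 4, 8, 16)
v = torch.randn(1, 4, 8, 16)
mask = torch.zeros(1, 8, dtype=torch.bool)
mask[:, :4] = True
out = steered_attention(q, k, v, mask)           # shape [1, 4, 8, 16]

In practice, methods such as PASTA apply this reweighting only to a selected subset of attention heads found by profiling, rather than to every head as above.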
Research on Attention Steering
Research papers on attention steering:
- Zhuohan Gu, Jiayi Yao, Kuntai Du, Junchen Jiang, 21 Nov 2024 (v2), LLMSteer: Improving Long-Context LLM Inference by Steering Attention on Reused Contexts, https://arxiv.org/abs/2411.13009
- Qingru Zhang, Chandan Singh, Liyuan Liu, Xiaodong Liu, Bin Yu, Jianfeng Gao, Tuo Zhao, 1 Oct 2024 (v2), Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs, https://arxiv.org/abs/2311.02262 https://github.com/QingruZhang/PASTA
- Baifeng Shi, Siyu Gai, Trevor Darrell, Xin Wang, 11 Jul 2023 (v2), TOAST: Transfer Learning via Attention Steering, https://arxiv.org/abs/2305.15542 https://github.com/bfshi/TOAST
- Yibin Wang, Weizhong Zhang, Jianwei Zheng, Cheng Jin, 20 Aug 2024 (v3), PrimeComposer: Faster Progressively Combined Diffusion for Image Composition with Attention Steering, https://arxiv.org/abs/2403.05053 https://github.com/CodeGoat24/PrimeComposer
- Haoran Wang, Kai Shu, Jan 2025, Make Every Token Count: A Systematic Survey on Decoding Methods for Foundation Models, https://www.researchgate.net/profile/Haoran-Wang-96/publication/387703971_Make_Every_Token_Count_A_Systematic_Survey_on_Decoding_Methods_for_Foundation_Models/links/67784c8ce74ca64e1f49eb15/Make-Every-Token-Count-A-Systematic-Survey-on-Decoding-Methods-for-Foundation-Models.pdf https://github.com/wang2226/Awesome-LLM-Decoding
- Kyle O'Brien, David Majercak, Xavier Fernandes, Richard Edgar, Jingya Chen, Harsha Nori, Dean Carignan, Eric Horvitz, Forough Poursabzi-Sangdeh, 18 Nov 2024, Steering Language Model Refusal with Sparse Autoencoders, https://arxiv.org/abs/2411.11296
- Xintong Wang, Jingheng Pan, Longqin Jiang, Liang Ding, Xingshan Li, Chris Biemann, 23 Oct 2024, CogSteer: Cognition-Inspired Selective Layer Intervention for Efficient Semantic Steering in Large Language Models, https://arxiv.org/abs/2410.17714
- Neel Nanda, 8th Jul 2024, An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2, https://www.alignmentforum.org/posts/NfFST5Mio7BCAQHPA/an-extremely-opinionated-annotated-list-of-my-favourite
- Yixin Ji, Juntao Li, Hai Ye, Kaixin Wu, Jia Xu, Linjian Mo, Min Zhang, 5 Jan 2025, Test-time Computing: from System-1 Thinking to System-2 Thinking, https://arxiv.org/abs/2501.02497
- Hanyu Zhang, Xiting Wang, Chengao Li, Xiang Ao, Qing He, 10 Jan 2025, Controlling Large Language Models Through Concept Activation Vectors, https://arxiv.org/abs/2501.05764 (Trains a vector used to steer the model on certain attributes.)
- Qi Sun, Edoardo Cetin, Yujin Tang, 14 Jan 2025 (v2), Transformer2: Self-adaptive LLMs, https://arxiv.org/abs/2501.06252 (Uses a vector to fine-tune the model dynamically.)
- Liu Yang, Ziqian Lin, Kangwook Lee, Dimitris Papailiopoulos, Robert Nowak, 16 Jan 2025, Task Vectors in In-Context Learning: Emergence, Formation, and Benefit, https://arxiv.org/abs/2501.09240
- Dan Zhang, Tao Feng, Lilong Xue, Yuandong Wang, Yuxiao Dong, Jie Tang, 23 Jan 2025, Parameter-Efficient Fine-Tuning for Foundation Models, https://arxiv.org/abs/2501.13787
- Xinyu Ma, Yifeng Xu, Yang Lin, Tianlong Wang, Xu Chu, Xin Gao, Junfeng Zhao, Yasha Wang, 24 Jan 2025, DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing, https://arxiv.org/abs/2501.14371 https://github.com/ArthurLeoM/DRESS-LLM
- Peixuan Han, Cheng Qian, Xiusi Chen, Yuji Zhang, Denghui Zhang, Heng Ji, 4 Feb 2025 (v2), Internal Activation as the Polar Star for Steering Unsafe LLM Behavior, https://arxiv.org/abs/2502.01042
- Daniel Beaglehole, Adityanarayanan Radhakrishnan, Enric Boix-Adserà, Mikhail Belkin, 6 Feb 2025, Aggregate and conquer: detecting and steering LLM concepts by combining nonlinear predictors over multiple layers, https://arxiv.org/abs/2502.03708 https://github.com/dmbeaglehole/neural_controllers
- Nikhil Anand, Dec 20, 2024, Understanding "steering" in LLMs: And how simple math can solve global problems, https://ai.gopubby.com/understanding-steering-in-llms-96faf6e0bee7
- Somnath Banerjee, Sayan Layek, Pratyush Chatterjee, Animesh Mukherjee, Rima Hazra, 16 Feb 2025, Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety Alignment, https://arxiv.org/abs/2502.11244
- Lukasz Bartoszcze, Sarthak Munshi, Bryan Sukidi, Jennifer Yen, Zejia Yang, David Williams-King, Linh Le, Kosi Asuzu, Carsten Maple, 24 Feb 2025, Representation Engineering for Large-Language Models: Survey and Research Challenges, https://arxiv.org/abs/2502.17601
- Yingbing Huang, Deming Chen, Abhishek K. Umrawal, 28 Feb 2025, JAM: Controllable and Responsible Text Generation via Causal Reasoning and Latent Vector Manipulation, https://arxiv.org/abs/2502.20684
- Seongheon Park, Xuefeng Du, Min-Hsuan Yeh, Haobo Wang, Yixuan Li, 1 Mar 2025, How to Steer LLM Latents for Hallucination Detection? https://arxiv.org/abs/2503.01917
- Marco Scialanga, Thibault Laugel, Vincent Grari, Marcin Detyniecki, 3 Mar 2025, SAKE: Steering Activations for Knowledge Editing, https://arxiv.org/abs/2503.01751
- Kenneth J. K. Ong, Lye Jia Jun, Hieu Minh "Jord" Nguyen, Seong Hah Cho, Natalia Pérez-Campanero Antolín, 17 Mar 2025, Identifying Cooperative Personalities in Multi-agent Contexts through Personality Steering with Representation Engineering, https://arxiv.org/abs/2503.12722
- Moreno D'Incà, Elia Peruzzo, Xingqian Xu, Humphrey Shi, Nicu Sebe, Massimiliano Mancini, 14 Mar 2025, Safe Vision-Language Models via Unsafe Weights Manipulation, https://arxiv.org/abs/2503.11742
More Attention Research Topics
Related LLM research areas for long-context optimization of attention methods include:
- Attention optimization (main page)
- Local attention
- Linear attention
- Sparse attention
- Multi-Head Attention (MHA)
- Multi-Query Attention (MQA)
- Grouped-Query Attention (GQA)
- Flash attention
- Paged attention
Other topics in attention research:
- Low-rank matrix attention
- Medusa attention
- Block attention
- Cross attention
- Fused head attention
- Hybrid local-global attention
- FFT attention
- QKV computation optimizations
- Additive attention
- Multiplicative attention
- Graph attention
- Chunked attention
- Attention sink
- Attention steering
- Bilinear attention
- Attention-free methods
- Mixture-of-Heads (MoH) Attention (MoE+MHA)
- Star attention
- Ring attention
More AI Research
Read more about: