
Prompt Engineering: Types and Optimizations

  • Last Updated 12 December, 2024
  • by David Spuler, Ph.D.

Optimizing Prompt Engineering

There are various simple ways to get better results from LLMs using a single prompt (see the sketch after this list):

  • Be specific
  • Give examples
  • Write longer prompts
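
For example, compare a vague prompt with a specific one. A minimal sketch in Python; the wording of both prompts is purely illustrative:

# Vague prompt: the model must guess the audience, length, and focus.
vague_prompt = "Tell me about attention."

# Specific prompt: states the topic, audience, length, and a concrete example.
specific_prompt = (
    "Explain self-attention in transformer models to a junior software "
    "engineer in about 150 words. Use one worked example showing how "
    "attention links a pronoun to its noun in: 'The cat licked its paw.'"
)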

Some more advanced approaches include (several of these are combined in the sketch after this list):

  • Give multiple examples (few-shot prompting)
  • Negative prompting (tell the AI what not to do)
  • Personas
  • Chain-of-thought ("step-by-step" requests)
  • Specify an output format
  • Specify a tone, reading level, or other text meta-attribute
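
Several of these techniques can be stacked in a single prompt. A minimal sketch, with illustrative wording:

# One line per technique from the list above; wording is illustrative.
persona = "You are a senior Python code reviewer."
task = "Review the function below for bugs."
cot = "Think through the function step by step before answering."
negative = "Do not comment on code style or formatting."
out_format = 'Respond in JSON with keys "bugs" (a list) and "summary".'
tone = "Use a neutral, professional tone."
example_code = "def mean(xs):\n    return sum(xs) / len(xs)"

prompt = "\n".join([persona, task, cot, negative, out_format, tone, "", example_code])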

There are various ways to follow up with additional prompts (a reflection sketch follows the list):

  • Iterative prompting (improve the next prompt based on the previous answer)
  • Ask the LLM to explain its reasoning
  • Ask the LLM to evaluate its own answer ("reflection")
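
For example, a simple reflection loop can be implemented with two follow-up prompts. A minimal sketch, where complete() is a hypothetical stand-in for a real LLM API call:

def complete(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call."""
    return "[LLM response]"

question = "How many prime numbers are there below 30?"
draft = complete(question)

# Follow-up 1: ask the LLM to explain and check its own reasoning.
critique = complete(
    f"Question: {question}\nYour answer: {draft}\n"
    "Explain your reasoning and point out any mistakes."
)

# Follow-up 2 (reflection): ask for an improved answer using the critique.
final = complete(
    f"Question: {question}\nDraft answer: {draft}\nCritique: {critique}\n"
    "Write an improved final answer."
)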

Types of Prompt Engineering

The general categories of prompt engineering techniques are (illustrated in the sketch after this list):

  • Zero-shot prompting — no examples.
  • One-shot prompting — one example.
  • Few-shot prompting — multiple examples in the prompt.
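
The three categories differ only in how many worked examples are packed into the prompt, as in this illustrative sketch:

task = "Classify the sentiment of this review: 'The battery died after an hour.'"

# Zero-shot: no examples, just the task.
zero_shot = task

# One-shot: a single worked example before the task.
one_shot = (
    "Review: 'Great screen, fast delivery.' Sentiment: positive\n" + task
)

# Few-shot: multiple worked examples before the task.
few_shot = (
    "Review: 'Great screen, fast delivery.' Sentiment: positive\n"
    "Review: 'It arrived broken.' Sentiment: negative\n"
    "Review: 'It works.' Sentiment: neutral\n" + task
)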

Several prompt engineering techniques are known to improve results in terms of answer accuracy (i.e., lower perplexity):

  • Emotional prompting
  • "Step-by-step" prompting (zero-shot CoT)
  • Skeleton-of-thought
  • Chain-of-Thought (CoT) (few-shot)
  • Tree-of-Thought (ToT)

Surveys on Prompting Techniques

Survey papers on prompt engineering:

  • Sander Schulhoff, Michael Ilie, Nishant Balepur, Konstantine Kahadze, Amanda Liu, Chenglei Si, Yinheng Li, Aayush Gupta, HyoJung Han, Sevien Schulhoff, Pranav Sandeep Dulepet, Saurav Vidyadhara, Dayeon Ki, Sweta Agrawal, Chau Pham, Gerson Kroiz, Feileen Li, Hudson Tao, Ashay Srivastava, Hevander Da Costa, Saloni Gupta, Megan L. Rogers, Inna Goncearenco, Giuseppe Sarli, Igor Galynker, Denis Peskoff, Marine Carpuat, Jules White, Shyamal Anadkat, Alexander Hoyle, Philip Resnik, 6 Jun 2024, The Prompt Report: A Systematic Survey of Prompting Techniques, https://arxiv.org/abs/2406.06608
  • Xiaoxia Liu, Jingyi Wang, Jun Sun, Xiaohan Yuan, Guoliang Dong, Peng Di, Wenhai Wang, Dongxia Wang, 21 Nov 2023, Prompting Frameworks for Large Language Models: A Survey, https://arxiv.org/abs/2311.12785
  • Pranab Sahoo, Ayush Kumar Singh, Sriparna Saha, Vinija Jain, Samrat Mondal, Aman Chadha, 5 Feb 2024, A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications, https://arxiv.org/abs/2402.07927
  • Yuan-Feng Song, Yuan-Qin He, Xue-Fang Zhao, Han-Lin Gu, Di Jiang, Hai-Jun Yang, Li-Xin Fan, July 2024, A Communication Theory Perspective on Prompting Engineering Methods for Large Language Models, Journal of Computer Science and Technology 39(4): 984−1004, DOI: 10.1007/s11390-024-4058-8, https://doi.org/10.1007/s11390-024-4058-8 https://jcst.ict.ac.cn/en/article/pdf/preview/10.1007/s11390-024-4058-8.pdf
  • Vishal Rajput, Oct 2024, The Prompt Report: Prompt Engineering Techniques, https://medium.com/aiguys/the-prompt-report-prompt-engineering-techniques-254464b0b32b

Emotional Prompting

Researchers discovered a weird technique: adding emotional language to prompts makes LLMs perform better. It's unclear how or why this works, but perhaps the emotional wording triggers more attention to the more important sources (i.e., tokens and weights), or perhaps it reduces attention to more casual types of documents.
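
In practice, emotional prompting can be as simple as appending an emotionally weighted sentence to an otherwise ordinary prompt. A minimal sketch, with wording that follows the style of stimuli reported in the research:

base_prompt = "Summarize the key risks in this contract."

# Append an emotional stimulus; the exact phrasing is illustrative.
emotional_prompt = base_prompt + " This is very important to my career."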

Research papers on emotional prompting:

Chain-of-Thought (CoT)

Chain-of-thought prompting is a "step-by-step" prompting method. As a zero-shot technique, it involves just adding an encouragement such as "Let's think step by step" to the prompt given to the LLM. As a few-shot technique, it involves including worked examples whose answers spell out their intermediate reasoning, so that the LLM mimics this reasoning before finalizing its own answer.
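
A sketch of both variants, where complete() is a hypothetical stand-in for a real LLM API call:

def complete(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call."""
    return "[LLM response]"

question = "If a train travels 60 km in 45 minutes, what is its speed in km/h?"

# Zero-shot CoT: append the step-by-step trigger phrase.
zero_shot_cot = complete(question + "\nLet's think step by step.")

# Few-shot CoT: include a worked example whose answer shows its reasoning.
few_shot_cot = complete(
    "Q: A car travels 30 km in 20 minutes. What is its speed in km/h?\n"
    "A: 20 minutes is 1/3 of an hour, so the speed is 30 / (1/3) = 90 km/h.\n"
    "Q: " + question + "\nA:"
)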

Research papers on chain-of-thought:

Tree-of-Thought (ToT)

Research papers on Tree-of-Thought (ToT) prompting:

Skeleton-of-Thought

Skeleton-of-thought is a technique that aims not only to improve accuracy, but also to improve the speed and cost efficiency of inference, by splitting a single prompt into multiple smaller sub-prompts. These can be executed in parallel to reduce overall latency.

The basic speedup works like this (a code sketch follows the list):

  • Generate an outline quickly (short LLM answer)
  • For each outline point, generate a brief answer (multiple focused LLM queries to compute short answers in parallel)
  • Combine them into a final, longer answer (possibly with an LLM, but this will be a long text, so heuristic packing/merging of sub-answers is faster)
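
A sketch of this pipeline using Python threads; the complete() helper is a hypothetical stand-in for a real LLM API call, and the outline parsing is deliberately simplified:

from concurrent.futures import ThreadPoolExecutor

def complete(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call."""
    return "[LLM response]"

question = "What are the pros and cons of remote work?"

# Step 1: generate a short outline quickly.
outline = complete(question + "\nAnswer with a short outline of 3 to 5 bullet points.")
points = [line.strip("- ") for line in outline.splitlines() if line.strip()]

# Step 2: expand each outline point with a short, focused query, in parallel.
def expand(point: str) -> str:
    return complete(question + "\nWrite 2 or 3 sentences on this point only: " + point)

with ThreadPoolExecutor() as pool:
    sub_answers = list(pool.map(expand, points))

# Step 3: merge heuristically (plain concatenation here), avoiding a slow
# final LLM pass over the full-length text.
final_answer = "\n\n".join(sub_answers)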

Research papers on skeleton-of-thought:

Programmatic Prompt Engineering

Programmatic prompting or "auto prompting" is the use of software automation, such as an extra LLM step, to automatically create better prompts for users based on their original query text. The result should be better-structured prompts and better answers.
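
One common pattern is a meta-prompt: an extra LLM call that rewrites the user's raw query into a better-structured prompt before the real query is run. A minimal sketch, where complete() is a hypothetical stand-in for a real LLM API call:

def complete(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call."""
    return "[LLM response]"

user_query = "fix my resume"

# Extra LLM step: rewrite the raw user query into a well-structured prompt.
improved_prompt = complete(
    "Rewrite the following user request as a clear, specific LLM prompt. "
    "Add a suitable persona, the desired output format, and reasonable "
    "defaults for any missing details.\n"
    "User request: " + user_query
)

# Run the improved prompt as the real query.
answer = complete(improved_prompt)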

Research on programmatic prompt engineering:

Advanced Prompt Engineering Techniques

Research papers on advanced prompting methods:

Prompt Efficiency Optimizations

There are several types of LLM inference speed optimizations that involve prompt tokens. The main ideas are (a caching sketch follows the list):

  • Prompt compression — fewer tokens to process.
  • Prompt caching — storing and reusing the outputs or KV cache data.
  • Parallel processing — e.g., skeleton-of-thought prompting.
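
As an illustration of the caching idea, here is a minimal sketch of exact-match prompt caching, keyed on a hash of the full prompt text; production systems more often cache KV data for shared prompt prefixes, but the lookup structure is similar. The complete() helper is again a hypothetical stand-in:

import hashlib

def complete(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call."""
    return "[LLM response]"

_cache: dict[str, str] = {}

def cached_complete(prompt: str) -> str:
    # Exact-match prompt caching: identical prompt text reuses the stored answer.
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = complete(prompt)  # cache miss: pay for inference once
    return _cache[key]

cached_complete("What is the capital of France?")  # miss: calls the LLM
cached_complete("What is the capital of France?")  # hit: served from the cache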

Prompt compression research. Various prompt compression techniques include:

Prompt caching research. The various types of caching may include:

General Research on Prompt Engineering
