Aussie AI

Parameter-Efficient Fine-Tuning (PEFT)

  • Last Updated 17 March, 2025
  • by David Spuler, Ph.D.

Parameter-Efficient Fine-Tuning (PEFT) is fine-tuning that updates only a small fraction of a model's parameters. Instead of updating all of the model's parameters, which is slow and costly, only a subset of parameters is updated. The rest of the model's parameters are "frozen" during the fine-tuning procedure.
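The core freezing idea can be sketched in a few lines. This is a minimal illustrative example (the parameter names and the tiny "model" are hypothetical): a gradient step is applied only to the non-frozen parameters.

```python
import numpy as np

# Hypothetical sketch: a "model" as named weight matrices, most frozen.
rng = np.random.default_rng(0)
params = {
    "backbone.w": rng.standard_normal((4, 4)),  # pretrained, frozen
    "adapter.w": np.zeros((4, 4)),              # small trainable subset
}
frozen = {"backbone.w"}

def sgd_step(params, grads, lr=0.1):
    """Apply a gradient step, but only to non-frozen parameters."""
    for name, g in grads.items():
        if name not in frozen:
            params[name] -= lr * g

grads = {name: np.ones_like(w) for name, w in params.items()}
before = params["backbone.w"].copy()
sgd_step(params, grads)
assert np.allclose(params["backbone.w"], before)  # frozen weight unchanged
```

In a real framework the same effect is achieved by disabling gradient tracking on the frozen tensors, so their gradients are never even computed.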

Various types of PEFT have been examined, such as:

The alternatives to using PEFT to train additional intelligence into a model include:

LoRA

The idea behind LoRA is to use "low-rank" matrices, which are much smaller, and thus much less costly to fine-tune. These matrices can be multiplied together to create a weight update that can be combined with the original model's weights.
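A minimal numerical sketch of the low-rank idea, with illustrative sizes and the common zero-initialization convention (the scaling factor `alpha` is an assumption, matching typical LoRA formulations):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2  # hidden size d and LoRA rank r, with r << d

W = rng.standard_normal((d, d))         # frozen pretrained weight: d*d params
A = rng.standard_normal((r, d)) * 0.01  # trainable low-rank factor
B = np.zeros((d, r))                    # trainable, zero init: no change at start
alpha = 16.0                            # common scaling hyperparameter

def lora_forward(x):
    # Base path plus scaled low-rank update: W x + (alpha/r) * B (A x)
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d)
# At initialization, the adapter contributes nothing:
assert np.allclose(lora_forward(x), W @ x)
```

Only `A` and `B` are trained, which is 2*d*r parameters instead of d*d for the full matrix; for large d and small r, that is a big saving.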

Multi-LoRA

The use of multiple LoRA adapters got a boost when Apple chose this method for its Apple Intelligence platform. Several other platforms use multi-LoRA as an efficiency gain for both training and inference.
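The serving-side structure of multi-LoRA can be sketched as one frozen base weight shared across several small task adapters, selected per request. The adapter names and sizes here are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2
W = rng.standard_normal((d, d))  # shared frozen base weight

def make_adapter():
    # Each adapter is just a pair of small low-rank matrices.
    return rng.standard_normal((r, d)) * 0.01, rng.standard_normal((d, r)) * 0.01

adapters = {  # hypothetical task names
    "summarize": make_adapter(),
    "translate": make_adapter(),
}

def forward(x, task):
    # Same expensive base computation for every task;
    # only the cheap low-rank path differs per request.
    A, B = adapters[task]
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d)
y1 = forward(x, "summarize")
y2 = forward(x, "translate")
```

Because the base model is shared, adding a new task costs only the small adapter, which is why this design appeals for both training and inference.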

Research papers on multi-LoRA include:

LoRA Inference Optimizations

The popularity of LoRA as an efficient training method has also spawned research on maximizing its inference efficiency. The loading and unloading of LoRA adapters can be quite expensive, and methods of optimizing multi-LoRA platforms have been the subject of various research papers.

Research papers on LoRA and multi-LoRA inference optimization include:

QLoRA

QLoRA is quantized LoRA. Quantization is now so standard in LoRA training that many research papers no longer use the term "QLoRA." For example, Apple Intelligence uses QLoRA in its multi-LoRA architecture, with 4-bit quantization.
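A rough sketch of the QLoRA idea: the frozen base weight is stored in 4-bit integers (here, a simple symmetric absmax scheme, which is an illustrative stand-in for the more sophisticated quantization formats used in practice), while the small LoRA matrices stay in full precision.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2
W = rng.standard_normal((d, d))

def quantize_4bit(w):
    # Symmetric absmax quantization into the signed 4-bit range [-7, 7].
    scale = np.abs(w).max() / 7.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float64) * scale

Wq, scale = quantize_4bit(W)       # frozen base weight, stored in 4 bits
A = rng.standard_normal((r, d)) * 0.01  # full-precision LoRA factors
B = np.zeros((d, r))

def qlora_forward(x):
    # Dequantize the base weight on the fly, then add the LoRA path.
    return dequantize(Wq, scale) @ x + B @ (A @ x)

x = rng.standard_normal(d)
err = np.abs(qlora_forward(x) - W @ x).max()  # small quantization error
```

The memory win is that the big matrix `W` is stored at 4 bits per weight, while the tiny trainable matrices `A` and `B` keep full precision, so training quality suffers little.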

Research papers on QLoRA include:

Prompt Tuning (Extended Vocabulary PEFT)

Prompt tuning is a form of PEFT that creates new tokens to extend the vocabulary, rather than updating the parameters for existing tokens. Since the new tokens start without trained values, they obviously aren't frozen. The weights for the original tokens are frozen, however, which accounts for most of the model. As an example, this type of PEFT can be useful when extending the LLM via fine-tuning on a specially curated data set, so as to create particular "trigger tokens" that launch integrated tools or perform other advanced capabilities. For example, new tokens indicating a "tool launch" can be added to the vocabulary, with fine-tuning applied only to those tokens.
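The extended-vocabulary idea can be sketched as appending new rows to the embedding matrix and training only those rows. The sizes and the two "trigger token" rows here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d = 100, 8
E_frozen = rng.standard_normal((vocab, d))  # original embeddings, frozen
E_new = rng.standard_normal((2, d)) * 0.01  # two new trainable token rows

def embed(token_id):
    # New token ids are numbered after the original vocabulary.
    if token_id < vocab:
        return E_frozen[token_id]
    return E_new[token_id - vocab]

def train_step(grads_new, lr=0.1):
    """A gradient step touches only the new rows; E_frozen never changes."""
    global E_new
    E_new = E_new - lr * grads_new

before = E_frozen.copy()
train_step(np.ones_like(E_new))
assert np.allclose(E_frozen, before)  # original embeddings untouched
```

Only 2*d values are trained here, versus vocab*d for the full embedding table, which is the same freezing principle as other PEFT methods applied at the vocabulary level.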

Research Papers on PEFT

PEFT is a popular technique that receives a lot of research attention:

More AI Research

Read more about: