Hybrid RAG + Fine-tuning Methods

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.

Fine-tuning and RAG are more like frenemies than real enemies: they're usually pitted against each other, but they can also work together. If you have the bucks for both, the combination is often the best option. After all, the RAG architecture still uses a model, and there's no reason you can't re-train that model every now and then.

In a hybrid architecture, the most up-to-date information lives in the RAG datastore, and the retriever component accesses it in the normal manner. But we can also occasionally re-train the underlying model, on whatever schedule our budget allows, which bakes innate knowledge of the proprietary data into the model itself. Occasional re-training keeps the model current on industry jargon and terminology, and also reduces the risk of the model filling gaps in its knowledge with “hallucinations.”
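
The query path in this hybrid looks exactly like plain RAG; the only thing that changes between re-training cycles is the model weights behind it. Below is a minimal C++ sketch of that flow, where the Retriever and Model interfaces are hypothetical stubs for illustration, not any particular library:

    #include <iostream>
    #include <sstream>
    #include <string>
    #include <vector>

    struct Retriever {
        // Return the top-k most relevant excerpts from the RAG datastore.
        // Stubbed with canned text for illustration.
        std::vector<std::string> retrieve(const std::string& /*query*/, int k) const {
            return std::vector<std::string>(k, "example document excerpt");
        }
    };

    struct Model {
        // Run inference; the weights may have been updated by the most
        // recent re-training cycle, but the call site doesn't care.
        std::string generate(const std::string& prompt) const {
            return "model output for: " + prompt;
        }
    };

    std::string hybrid_rag_answer(const Retriever& retriever, const Model& model,
                                  const std::string& user_query) {
        // 1. Fresh facts come from the RAG datastore via the retriever.
        std::vector<std::string> excerpts = retriever.retrieve(user_query, 3);

        // 2. Pack the excerpts and the question into a single prompt.
        std::ostringstream prompt;
        prompt << "Answer using the context below.\n\nContext:\n";
        for (const std::string& e : excerpts)
            prompt << "- " << e << "\n";
        prompt << "\nQuestion: " << user_query << "\nAnswer:";

        // 3. Domain tone and jargon come from the fine-tuned weights.
        return model.generate(prompt.str());
    }

    int main() {
        Retriever retriever;
        Model model;
        std::cout << hybrid_rag_answer(retriever, model, "What changed in v2?") << "\n";
    }

Because the model sits behind a single generate() call, a freshly re-trained checkpoint can be swapped in without touching any of the retrieval code.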

Once-only fine-tuning. One hybrid approach is to use a single up-front fine-tuning cycle to focus the model on the domain area, and then rely on RAG to handle all new documents from that point on. The model is never fine-tuned again.

The goal of this once-only fine-tuning is to address the static aspects of the model:

  • Style and tone of expression
  • Brand voice
  • Industry jargon and terminology

Note that I didn't write “up-to-date product documents” on that list. Instead, we're putting all of those documents in a datastore for the RAG retriever component. The model doesn't need to be re-trained on the technical materials; it gets fresh responses from the RAG document excerpts instead. The initial fine-tuning is focused on stylistic matters affecting the way the model answers questions, rather than on learning new facts.
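
In code terms, “adding a new document” means embedding and indexing it into the datastore, not launching a training run. Here is a sketch of that ingestion path, where Embedder and VectorStore are hypothetical stand-ins for an embedding model and a vector database:

    #include <string>
    #include <utility>
    #include <vector>

    struct Embedder {
        // Map text to an embedding vector (stubbed with a dummy vector here).
        std::vector<float> embed(const std::string& text) const {
            return std::vector<float>(768, static_cast<float>(text.size()));
        }
    };

    struct VectorStore {
        // Store (vector, text) pairs for similarity search at query time.
        std::vector<std::pair<std::vector<float>, std::string>> entries;
        void add(std::vector<float> vec, std::string text) {
            entries.emplace_back(std::move(vec), std::move(text));
        }
    };

    // Index new or updated product documents for retrieval.
    // Note that the model weights are never touched here.
    void ingest_document(const Embedder& embedder, VectorStore& store,
                         const std::vector<std::string>& chunks) {
        for (const std::string& chunk : chunks)
            store.add(embedder.embed(chunk), chunk);
    }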

Occasional re-training might still be required to maintain familiarity with jargon or tone, or if the model starts hallucinating in areas where it hasn't been trained. However, such re-training should be infrequent, and possibly never needed again.
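
One way to keep re-training rare is to trigger it from measurements rather than a calendar. The sketch below shows such a policy check; the metrics and threshold values are illustrative assumptions, not recommendations from this text:

    // A sketch of a metric-driven re-training trigger. The tracked metrics
    // and the threshold values are made up for illustration.
    struct EvalStats {
        double hallucination_rate;  // fraction of answers unsupported by retrieved docs
        double jargon_error_rate;   // fraction of domain terms used incorrectly
    };

    // Re-train only when the static qualities that fine-tuning fixed
    // (tone, jargon) have drifted, or hallucinations are creeping back in.
    bool needs_retraining(const EvalStats& stats) {
        return stats.hallucination_rate > 0.05 || stats.jargon_error_rate > 0.10;
    }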

 
