Aussie AI

Training Options

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.

It's easy to make a small fortune in LLM model training these days. You start with a big fortune, and then do training.

If you want a new model, and none of the off-the-shelf commercial or open source models are good enough, here are your basic options for training a smarter model:

  • Train a new model from scratch.
  • Fine-tuning (FT) of an existing model.
  • Retrieval-Augmented Generation (RAG) using a document database.

Training your own model is kind of expensive, but many of the C++ optimizations in this book might help. Yeah, right, I don't really recommend you try to train your own foundation model, no matter how good you are at C++. Also, the top LLMs are so good these days that training a new model from scratch is probably relegated to non-language ML projects, using your own proprietary non-text data.

But don't listen to me. If you really have a nine-figure funding round, then go ahead and train your own foundation LLM. On the other hand, fine-tuning an existing model (e.g. GPT) is cheaper. RAG is cheaper still (probably), but it's not even a type of training, so it should really be banned by the European Union for false advertising.

Still reading? That means you still want to do training, which is fine, I guess, provided the GPU hosting cost isn't coming out of your pay packet. In terms of optimizing a training project, here are some methods that might be worth considering:

  • Choose smaller model dimensions (smaller is cheaper, but bigger is smarter).
  • Evaluate open-source vs commercial models.
  • Evaluate fine-tuning (FT) vs Retrieval-Augmented Generation (RAG).
  • Quantized models (“model compression” methods).
  • Knowledge distillation (train a small model using a large “teacher” model).
  • Dataset distillation (train a small model using auto-generated outputs from a large model).

 
