Aussie AI
Training Options
Book Excerpt from "Generative AI in C++"
by David Spuler, Ph.D.
It's easy to make a small fortune in LLM model training these days. You start with a big fortune, and then do training.
If you want a new model, and none of the off-the-shelf commercial or open source models are good enough, here are your basic options for training a smarter model:
- Train a new model from scratch.
- Fine-tune (FT) an existing model.
- Use Retrieval-Augmented Generation (RAG) with a document database.
Training your own model is kind of expensive, but many of the C++ optimizations in this book might help. Yeah, right. I don't really recommend that you try to train your own foundation model, no matter how good you are at C++. Also, the top LLMs are so good these days that training a new model from scratch is probably relegated to non-language ML projects that use your own proprietary non-text data.
But don't listen to me. If you really have a nine-figure funding round, then go ahead and train your own foundation LLM. On the other hand, fine-tuning an existing model (e.g., GPT) is cheaper than training from scratch. RAG is cheaper still (probably), but it's not even a type of training, so it should really be banned by the European Union for false advertising.
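To make the "RAG is not training" point concrete, here is a minimal sketch of the retrieval step, assuming each document chunk already has an embedding vector from some embedding model. The names `Chunk`, `retrieve_best`, and `build_rag_prompt` are hypothetical, invented for this illustration, and the prompt layout is only one possible choice. Note that no model weights are touched anywhere: the only thing that changes is the prompt text.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical document chunk: its text plus a precomputed embedding vector.
struct Chunk {
    std::string text;
    std::vector<float> embedding;
};

// Cosine similarity over the shorter of the two lengths.
// Returns 0 if either vector has zero magnitude.
float cosine_similarity(const std::vector<float>& a, const std::vector<float>& b) {
    float dot = 0.0f, norm_a = 0.0f, norm_b = 0.0f;
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        dot += a[i] * b[i];
        norm_a += a[i] * a[i];
        norm_b += b[i] * b[i];
    }
    if (norm_a == 0.0f || norm_b == 0.0f) return 0.0f;
    return dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}

// Linear scan for the chunk most similar to the query embedding.
// Returns the chunk index, or -1 if the database is empty.
int retrieve_best(const std::vector<Chunk>& db, const std::vector<float>& query) {
    int best = -1;
    float best_score = -2.0f;  // below the minimum possible cosine of -1
    for (std::size_t i = 0; i < db.size(); ++i) {
        const float score = cosine_similarity(db[i].embedding, query);
        if (score > best_score) {
            best_score = score;
            best = static_cast<int>(i);
        }
    }
    return best;
}

// Prepend the retrieved text to the user's question (assumed prompt layout).
std::string build_rag_prompt(const std::vector<Chunk>& db,
                             const std::vector<float>& query_embedding,
                             const std::string& question) {
    std::string prompt;
    const int idx = retrieve_best(db, query_embedding);
    if (idx >= 0) {
        prompt += "Context: " + db[idx].text + "\n";
    }
    prompt += "Question: " + question;
    return prompt;
}
```

The string returned by `build_rag_prompt` would then be sent to an unmodified model. A real system would likely retrieve several chunks and use a vector index rather than a linear scan, but the principle is the same.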
Still reading? That means you still want to do training, which is fine, I guess, provided the GPU hosting cost isn't coming out of your pay packet. In terms of optimizing a training project, here are some methods that might be worth considering:
- Choose smaller model dimensions (smaller is cheaper, but bigger is smarter).
- Evaluate open-source vs commercial models.
- Evaluate fine-tuning (FT) vs Retrieval-Augmented Generation (RAG).
- Use quantized models (a “model compression” method).
- Use knowledge distillation (train a small model using a large “teacher” model).
- Use dataset distillation (train a small model using auto-generated outputs from a large model).
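As a rough illustration of why smaller dimensions and quantization are cheaper, the following back-of-the-envelope sketch estimates the parameter count and weight storage of a decoder-only Transformer. It assumes the common approximation of about 12 × layers × d_model² weights in the attention and feed-forward blocks (with a feed-forward hidden size of 4 × d_model), plus a vocabulary embedding table. The struct and function names are hypothetical, and the result is an estimate only, not a figure for any particular model.

```cpp
#include <cstdint>

// Hypothetical description of a decoder-only Transformer's main dimensions.
struct ModelDims {
    std::uint64_t layers;      // number of Transformer layers
    std::uint64_t d_model;     // embedding (hidden) dimension
    std::uint64_t vocab_size;  // number of tokens in the vocabulary
};

// Rough parameter count under the stated assumptions:
//   per layer: 4*d^2 (Q, K, V, output projections) + 8*d^2 (FFN with 4*d hidden)
//   plus vocab_size * d_model for the token embedding table.
// Biases, layer norms, and positional embeddings are ignored.
std::uint64_t approx_param_count(const ModelDims& m) {
    return 12 * m.layers * m.d_model * m.d_model + m.vocab_size * m.d_model;
}

// Bytes needed to store the weights at a given precision
// (e.g. 32 for FP32, 16 for FP16, 8 for INT8, 4 for 4-bit quantization).
std::uint64_t weight_bytes(std::uint64_t params, unsigned bits_per_weight) {
    return params * bits_per_weight / 8;
}
```

For example, with 12 layers, a `d_model` of 768, and a 50,000-token vocabulary, `approx_param_count` gives roughly 123 million parameters, which is about 493 MB of weights at 32 bits and about 123 MB at 8 bits. Halving `d_model` cuts the per-layer term by a factor of four, which is the "smaller is cheaper" side of the trade-off.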