Aussie AI

Top 10 Really Big Optimizations

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.

Most of this book is about optimizing your AI engine, including its C++ code and model structure. But first, let's take a step back and consider the massive optimizations for your entire project. Here are some ways to save megabucks:

    1. Buy an off-the-shelf commercial AI-based solution instead.

    2. Wrap a commercial model (e.g. via the OpenAI API) rather than training your own foundation model.

    3. Test multiple commercial foundation model API providers and compare pricing.

    4. Use an open source pre-trained model and engine (e.g. Meta's Llama models).

    5. Avoid fine-tuning completely by using Retrieval-Augmented Generation (RAG) instead.

    6. Choose smaller model dimensions when designing your model.

    7. Choose a compressed, pre-quantized open source pre-trained model (e.g. quantized Llama).

    8. Cost-compare GPU hosting options for running your model.

    9. Use cheaper commercial API providers for early development and testing.

    10. Use smaller open-source models for early development and testing.
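To see why items 6, 7 and 10 save so much, it helps to do the arithmetic on weight storage. The following is a minimal sketch, not code from the book: the function names `weight_bytes`, `to_gb` and `print_footprints` are hypothetical, and the estimate covers weights only, ignoring the KV cache, activations, and any per-block metadata that a real quantization format adds.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Approximate bytes needed to store the model weights alone.
// Assumption: every weight uses the same bit width, with no extra metadata.
inline double weight_bytes(std::uint64_t num_params, int bits_per_weight) {
    assert(bits_per_weight > 0);
    return static_cast<double>(num_params) * bits_per_weight / 8.0;
}

// Convert bytes to decimal gigabytes (1 GB = 1e9 bytes).
inline double to_gb(double bytes) {
    return bytes / 1e9;
}

// Print the weight footprint at several common precisions.
inline void print_footprints(std::uint64_t num_params) {
    const int bit_widths[] = {32, 16, 8, 4};
    for (int bits : bit_widths) {
        std::printf("%2d-bit weights: %.1f GB\n",
                    bits, to_gb(weight_bytes(num_params, bits)));
    }
}
```

For example, `print_footprints(7000000000ULL)` for a 7-billion-parameter model works out to roughly 14 GB of weights at 16 bits and 3.5 GB at 4 bits. That gap can decide which class of GPU, or whether a GPU at all, is needed to host the model.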

If ten isn't enough for you, don't worry, I've got more! Roll up your sleeves and look at all the research on optimizations in Part VII.
