Aussie AI

What is NAS?

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.

What is NAS?

Neural Architecture Search (NAS) is the very fancy way that AI researchers ask questions like: how big should I make the model? How many weights? How many layers? What vocabulary size?

The biggest number is how many billions of weights the model should use, but this actually depends on a number of other numeric sizes. These weights are called “parameters” and the various other sizes are called “hyper-parameters” of the model, so NAS is also sometimes called “Hyper-Parameter Optimization” (HPO). The sizes and dimensions of models that NAS aims to determine include (see the configuration sketch after this list):

  • Number of layers
  • Embedding size
  • Vocabulary size
  • Number of attention heads
  • Context size
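
To make the list concrete, here is a minimal sketch of what these hyper-parameters might look like as a C++ configuration structure. The struct name, field names, default values, and the rough parameter-count formula are all illustrative assumptions, not taken from any particular engine or model.

#include <cstdint>

struct ModelConfig {
    int num_layers     = 32;      // number of Transformer layers
    int embedding_size = 4096;    // width of each token embedding vector
    int vocab_size     = 32000;   // number of tokens in the vocabulary
    int num_heads      = 32;      // attention heads per layer
    int context_size   = 2048;    // maximum sequence length (context window)

    // Back-of-the-envelope parameter count implied by these choices:
    // attention projections plus a 4x-wide FFN per layer, plus the
    // embedding matrix. An estimate only, not exact for any real model.
    std::int64_t approx_params() const {
        std::int64_t d = embedding_size;
        std::int64_t per_layer = 4 * d * d      // Q, K, V and output projections
                               + 8 * d * d;     // FFN up and down projections (4*d hidden)
        return per_layer * num_layers
             + static_cast<std::int64_t>(vocab_size) * d;
    }
};

The point of the sketch is that the headline "number of parameters" is not chosen directly: it falls out of the other hyper-parameters, which is exactly why NAS searches over those dimensions rather than over a single size number.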

Choosing these numbers is actually a very hard problem. In the early days, these choices were made either randomly or by trial and error, which is expensive when you're talking about GPUs. If you go too large, then the model is over-parameterized and unnecessarily expensive. Go too small, and the model won't be very accurate, or might not even work at all. Hence, a large body of research on “NAS” has developed about systematic ways to find optimal sizes of the models along the various dimensions.
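
As a flavor of what the simplest "trial-and-error" style of search looks like in code, here is a random-search sketch that reuses the hypothetical ModelConfig from above. The train_and_evaluate callback, the candidate value lists, and the scoring idea (accuracy minus a cost penalty) are all assumptions for illustration; real NAS methods are far more sophisticated than a plain random search.

#include <cstddef>
#include <limits>
#include <random>
#include <vector>

struct Candidate {
    ModelConfig config;   // reuses the ModelConfig sketch above
    double score = -std::numeric_limits<double>::infinity();
};

template <typename EvalFn>
Candidate random_search(int trials, EvalFn train_and_evaluate) {
    std::mt19937 rng(42);  // fixed seed so runs are repeatable
    std::vector<int> layer_choices = {12, 24, 32, 48};
    std::vector<int> embed_choices = {1024, 2048, 4096};
    std::uniform_int_distribution<std::size_t> pick_layer(0, layer_choices.size() - 1);
    std::uniform_int_distribution<std::size_t> pick_embed(0, embed_choices.size() - 1);

    Candidate best;
    for (int i = 0; i < trials; ++i) {
        ModelConfig c;
        c.num_layers = layer_choices[pick_layer(rng)];
        c.embedding_size = embed_choices[pick_embed(rng)];
        c.num_heads = c.embedding_size / 128;   // keep per-head size fixed at 128
        double score = train_and_evaluate(c);   // e.g. accuracy minus a size penalty
        if (score > best.score) {
            best.config = c;
            best.score = score;
        }
    }
    return best;
}

Each trial in a loop like this means training and evaluating a whole model, which is why brute-force search gets expensive fast and why smarter NAS strategies matter.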

NAS is not a type of model compression and isn't just something you do “offline” before inference. Rather, it's the first thing you do in an AI project. It's before training, and before you even start designing the C++ code for your engine that's tuned to the model.

 

