Transformer Layers and Components
Book Excerpt from "Generative AI in C++"
by David Spuler, Ph.D.
Inside the encoder and decoder blocks there are lots of sub-structures, many of which are organized into “layers.” For example, GPT-2 is decoder-only, with 12 decoder layers in its smallest version. Collectively, these layers are called the “encoder stack” and “decoder stack.”
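To make the idea of a layer stack concrete, here is a minimal sketch in C++. The DecoderLayer and run_decoder_stack names are hypothetical, not taken from any real engine, and the layer body is left as a placeholder; the point is only that each layer's output becomes the next layer's input.

    // Hypothetical sketch of a decoder stack; all names are illustrative, not from any real engine.
    #include <vector>

    struct Activations { std::vector<float> values; };   // activation vectors for the current tokens

    struct DecoderLayer {
        // A real layer runs attention heads, the FFN, normalization, and more.
        Activations forward(const Activations& input) const { return input; }  // placeholder body
    };

    // Each layer's output feeds the next layer; GPT-2 (small) repeats this 12 times.
    Activations run_decoder_stack(const std::vector<DecoderLayer>& stack, Activations x) {
        for (const DecoderLayer& layer : stack) {
            x = layer.forward(x);
        }
        return x;
    }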
However, layers are not the whole story. Each layer has many sub-components, and there are also other parts of the engine that sit outside the layers. In fact, there are numerous low-level components in an AI engine, such as:
- Model Loader
- Tokenizer (input module)
- Embeddings
- Positional Encoding
- Vector Arithmetic (e.g. addition)
- Matrix Multiplier (MatMul/GEMM)
- Attention Heads (i.e. Q, K, and V)
- Feed-Forward Network (FFN)
- Activation Functions
- Normalization
- Softmax
- Linearization/De-embedding
- Decoding Algorithm (choosing words)
- Output module (formatting)
So, that's 14 distinct C++ modules you need to write. If we estimate two weeks for each, your engine will be done in about six months. (I wonder, dear reader, did you check my count in the above list?)
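For a concrete taste of two of those modules, here is a minimal sketch of a Softmax function and a greedy decoding algorithm in C++. The function names are just placeholders, and a production engine would use more optimized versions (and often top-k or top-p sampling rather than pure greedy decoding).

    // Minimal sketches of the Softmax and Decoding Algorithm modules (illustrative only).
    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Softmax: convert the raw output logits into probabilities that sum to 1.
    std::vector<float> softmax(const std::vector<float>& logits) {
        float maxval = *std::max_element(logits.begin(), logits.end());  // subtract max for numerical stability
        std::vector<float> probs(logits.size());
        float sum = 0.0f;
        for (size_t i = 0; i < logits.size(); ++i) {
            probs[i] = std::exp(logits[i] - maxval);
            sum += probs[i];
        }
        for (float& p : probs) p /= sum;
        return probs;
    }

    // Decoding algorithm: greedy decoding simply picks the highest-probability token.
    int greedy_decode(const std::vector<float>& probs) {
        return static_cast<int>(std::max_element(probs.begin(), probs.end()) - probs.begin());
    }

Calling greedy_decode(softmax(logits)) on the final logits vector gives the id of the next output token.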
But that's not all: we've forgotten training, so the engine above would be inference-only. All of the above components are used in both inference and training, but training also needs its own extras. The training-specific algorithms and modules include the following (a small sketch of two of them appears after the list):
- Learning algorithms (e.g. supervised vs unsupervised)
- Training Optimizer (i.e. “gradient descent” method)
- Loss function
- Dropout
- Evaluation metrics
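As a rough illustration of two of these items, here is a tiny sketch of a cross-entropy loss function and a plain stochastic gradient descent update. The names are placeholders; real trainers compute the gradients via backpropagation and typically use fancier optimizers such as Adam.

    // Illustrative sketches of a loss function and a basic gradient-descent update (names are placeholders).
    #include <cmath>
    #include <vector>

    // Loss function: cross-entropy for a single predicted token, given the probability
    // distribution from softmax and the index of the correct ("target") token.
    float cross_entropy_loss(const std::vector<float>& probs, int target) {
        const float eps = 1e-9f;                      // avoid log(0)
        return -std::log(probs[target] + eps);
    }

    // Training optimizer: plain stochastic gradient descent on one weight vector.
    // Real engines often use Adam or another adaptive optimizer instead.
    void sgd_update(std::vector<float>& weights, const std::vector<float>& gradients, float learning_rate) {
        for (size_t i = 0; i < weights.size(); ++i) {
            weights[i] -= learning_rate * gradients[i];   // step against the gradient
        }
    }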