Aussie AI
Transformer Architecture Choices
Book Excerpt from "Generative AI in C++"
by David Spuler, Ph.D.
Transformer Architecture Choices
Various architectural decisions are made in the model design phase that aren't really optimizations of a model, but that can significantly impact its efficiency. Using a more advanced engine architecture is also effectively an optimization that “retains” accuracy, because these changes allow the model to be fully trained in a better engine. Some important decisions include:
- Decoder-only versus encoder-decoder architectures
- Alternative floating-point representations (e.g., brain float; see the conversion sketch after this list)
- Pre-norm versus post-norm
- Positional encoding algorithms (embeddings)
- Context length optimizations
- Neural Architecture Search (NAS)
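As an illustration of the floating-point representation choice, here is a minimal sketch of converting between 32-bit float and brain float (bfloat16) by keeping only the upper 16 bits, which preserves float32's 8-bit exponent range at reduced mantissa precision. This assumes a plain uint16_t holds the bfloat16 bit pattern, and the function names are illustrative, not a library API; simple truncation is shown, whereas production code usually rounds.

    #include <cstdint>
    #include <cstring>
    #include <cstdio>

    // Illustrative bfloat16 storage type (not a standard library type).
    typedef uint16_t bf16_t;

    bf16_t float_to_bf16(float f) {
        uint32_t bits;
        std::memcpy(&bits, &f, sizeof(bits));  // type-pun safely via memcpy
        return (bf16_t)(bits >> 16);           // keep sign, exponent, top 7 mantissa bits
    }

    float bf16_to_float(bf16_t h) {
        uint32_t bits = ((uint32_t)h) << 16;   // re-expand with zeroed low mantissa bits
        float f;
        std::memcpy(&f, &bits, sizeof(f));
        return f;
    }

    int main() {
        float x = 3.14159f;
        bf16_t b = float_to_bf16(x);
        printf("original=%f  bf16 round-trip=%f\n", x, bf16_to_float(b));
        return 0;
    }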
Data doesn't just magically end up in the GPU. Software has to be written to send the data there, and there are many possible optimizations used in writing such software. This software is often called the “kernel” of the AI engine, and the sub-components of the engine are often named accordingly: the MatMul kernel, the Softmax kernel, the normalization kernel, and so on. Software techniques that optimize parallelization, primarily by increasing throughput and reducing latency, include:
- Vectorization
- Multi-threading
- Kernel fusion (see the fused-loop sketch after this list)
- Kernel fission
- Pipelining
- Scheduling algorithms
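A minimal sketch of kernel fusion: a bias-add pass and a ReLU pass are merged into one loop, so the vector is read and written once instead of twice. The function names are illustrative, not from any particular engine.

    #include <vector>

    // Unfused version: two separate passes (two "kernels") over the data.
    void add_bias(std::vector<float>& v, const std::vector<float>& bias) {
        for (size_t i = 0; i < v.size(); ++i) v[i] += bias[i];
    }
    void relu(std::vector<float>& v) {
        for (size_t i = 0; i < v.size(); ++i) if (v[i] < 0.0f) v[i] = 0.0f;
    }

    // Fused version: one pass does both operations, halving the memory
    // traffic and loop overhead of the two-pass version.
    void add_bias_relu_fused(std::vector<float>& v, const std::vector<float>& bias) {
        for (size_t i = 0; i < v.size(); ++i) {
            float x = v[i] + bias[i];
            v[i] = (x < 0.0f) ? 0.0f : x;
        }
    }

The same idea extends to fusing larger kernels, such as a MatMul with the activation or normalization kernel that follows it.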
Memory usage optimizations: Software optimizations that aim to improve memory usage, and thereby gain further parallelism by lowering memory access overhead, include:
- Tiling (see the tiled matrix multiply sketch after this list)
- Data locality optimizations
- Dataflow optimizations
- Memory management optimizations
- Cache management
- Prefetching
- Offloading
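A minimal sketch of loop tiling for a square matrix multiply in row-major order, which also illustrates data locality: each TILE-by-TILE block of the operands stays hot in cache while it is reused. The tile size of 32 is an illustrative assumption to be tuned per platform, and a real kernel would also vectorize the inner loop.

    #include <algorithm>
    #include <cstddef>

    // Tile size is an illustrative choice; tune per cache size.
    const size_t TILE = 32;

    // Computes C = A * B for n x n row-major matrices, processing
    // TILE x TILE blocks to keep the working set in cache.
    void matmul_tiled(const float* A, const float* B, float* C, size_t n) {
        for (size_t i = 0; i < n * n; ++i) C[i] = 0.0f;
        for (size_t ii = 0; ii < n; ii += TILE) {
            for (size_t kk = 0; kk < n; kk += TILE) {
                for (size_t jj = 0; jj < n; jj += TILE) {
                    size_t imax = std::min(ii + TILE, n);
                    size_t kmax = std::min(kk + TILE, n);
                    size_t jmax = std::min(jj + TILE, n);
                    for (size_t i = ii; i < imax; ++i) {
                        for (size_t k = kk; k < kmax; ++k) {
                            float a = A[i * n + k];
                            for (size_t j = jj; j < jmax; ++j) {
                                C[i * n + j] += a * B[k * n + j];
                            }
                        }
                    }
                }
            }
        }
    }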