Aussie AI
Book Excerpt from "Generative AI in C++"
by David Spuler, Ph.D.
What is MatMul?
This is where the rubber meets the road. The core of the method used by AI Engines to trick us into thinking they're smart is in the matrix multiplications. This is where billions of weights get applied to whatever words you prompted into the AI. The words get changed to numbers (i.e. tokens), and they get crunched up through many layers of tensor kernels, hung out to dry in a normalization blender, buried in peat moss, and then munged up into tiny fractional probabilities, and out pops the answer. Simples.
Tensors are just matrices on steroids, and matrix multiplications are the basis of neural network processing. Typically, AI matrix multiplication is shortened to “MatMul” or referred to as “GEMM” (General Matrix Multiplication). If you're writing your own C++ matrix multiplication code, you're writing a “MatMul algorithm” or a “GEMM kernel.”
Note that MatMul and GEMM both refer to matrix-matrix multiplication. The simpler case of matrix-vector multiplication is called Vector Matrix Multiplication (VMM). Also, the “general” in GEMM means that any type of matrix may be multiplied, as distinct from special cases such as sparse or triangular matrices.
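To make the matrix-vector case concrete, here is a minimal sketch in C++. The function name `matvec`, the flat row-major storage in a `std::vector<float>`, and the explicit `rows` and `cols` parameters are all assumptions made for illustration, not code from the book.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (illustration only): computes y = M * v.
// M is a rows x cols matrix stored row-major in a flat vector,
// and v is a vector of length cols.
std::vector<float> matvec(const std::vector<float>& m,
                          const std::vector<float>& v,
                          std::size_t rows, std::size_t cols)
{
    assert(m.size() == rows * cols);  // matrix storage must match its shape
    assert(v.size() == cols);         // vector length must match the column count
    std::vector<float> y(rows, 0.0f);
    for (std::size_t i = 0; i < rows; ++i) {
        float sum = 0.0f;
        for (std::size_t j = 0; j < cols; ++j) {
            sum += m[i * cols + j] * v[j];  // one multiply per matrix element
        }
        y[i] = sum;
    }
    return y;
}
```

Only two nested loops are needed here, so the multiplication runs once per matrix element. Matrix-matrix multiplication adds a third loop on top of this.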
The main bottleneck in Transformers is the multiplication operation deep inside the nested loops of the matrix multiplier. If you write a Transformer in C++ that runs without a GPU, the main time cost will be the “*” operator on floating-point numbers, deep inside a matrix multiplication function.
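The nested loops in question look roughly like the following naive kernel. This is a sketch under assumptions rather than an optimized implementation: the name `matmul_naive`, the row-major flat storage, and the dimension parameters `n`, `k`, and `m` are hypothetical choices made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical naive kernel (illustration only): computes C = A * B.
// A is n x k, B is k x m, and C is n x m, all stored row-major.
std::vector<float> matmul_naive(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t n, std::size_t k, std::size_t m)
{
    assert(a.size() == n * k);  // A's storage must match its shape
    assert(b.size() == k * m);  // B's storage must match its shape
    std::vector<float> c(n * m, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {          // each row of A
        for (std::size_t j = 0; j < m; ++j) {      // each column of B
            float sum = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {  // innermost loop
                sum += a[i * k + p] * b[p * m + j];  // the costly "*" operator
            }
            c[i * m + j] = sum;
        }
    }
    return c;
}
```

The innermost statement executes n × k × m times, which is why that single floating-point multiplication dominates the running time on a CPU.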
The main optimization of MatMul in modern times is hardware acceleration, which runs lots of those multiplication computations in parallel. Using tensors scales matrix multiplication up into a third dimension for increased parallelization, but it's all matrix algebra underneath. Well, actually, it's vector dot product under that, too.
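The point about vector dot products can be sketched directly: each element of the output matrix is the dot product of one row of the first matrix with one column of the second. The names `vecdot` and `matmul_vecdot`, and the `ystride` parameter used to step down a column of a row-major matrix, are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (illustration only): dot product of two length-n
// vectors. Elements of x are contiguous; consecutive elements of y are
// `ystride` floats apart, which lets y walk down a matrix column.
float vecdot(const float* x, const float* y, std::size_t n, std::size_t ystride)
{
    float sum = 0.0f;
    for (std::size_t p = 0; p < n; ++p) {
        sum += x[p] * y[p * ystride];
    }
    return sum;
}

// Computes C = A * B as one dot product per output element.
// A is n x k, B is k x m, and C is n x m, all stored row-major.
std::vector<float> matmul_vecdot(const std::vector<float>& a,
                                 const std::vector<float>& b,
                                 std::size_t n, std::size_t k, std::size_t m)
{
    assert(a.size() == n * k);
    assert(b.size() == k * m);
    std::vector<float> c(n * m, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < m; ++j) {
            // Row i of A starts at a[i * k]; column j of B starts at b[j]
            // and advances m floats per element.
            c[i * m + j] = vecdot(a.data() + i * k, b.data() + j, k, m);
        }
    }
    return c;
}
```

Every one of the n × m dot products is independent of the others, which is what makes the computation a natural fit for parallel hardware.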