Aussie AI
Positional Encoding
Book Excerpt from "Generative AI in C++"
by David Spuler, Ph.D.
Positional Encoding (PE) is the algorithm whereby information about the relative positions of words is encoded into "embeddings" that are input into the AI model. The term is often used synonymously with "positional embeddings," but technically, positional encoding is the algorithm (i.e. code) used to create a vector of positional embeddings (i.e. data).
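To make the code/data distinction concrete, here is a minimal C++ sketch in which the positional embeddings are simply a table of vectors, one per position, added elementwise to the token embeddings before they enter the model. The function and type names are illustrative only, not taken from any particular codebase.

    #include <vector>

    using Vector = std::vector<float>;

    // Combine token embeddings with positional embeddings (the PE "data"),
    // injecting position information by elementwise addition.
    void add_positional_embeddings(
        std::vector<Vector>& token_embeddings,             // one vector per input token
        const std::vector<Vector>& positional_embeddings)  // one vector per position
    {
        for (size_t pos = 0; pos < token_embeddings.size(); ++pos) {
            for (size_t i = 0; i < token_embeddings[pos].size(); ++i) {
                token_embeddings[pos][i] += positional_embeddings[pos][i];
            }
        }
    }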
The positional encoding algorithm was an important part of the vanilla 2017 Transformer architecture, which used a sinusoidal positional encoding. Various other positional encoding methods have since been tried to improve model accuracy. In particular, long context lengths have been found to be handled better by other positional encoding algorithms, notably Rotary Positional Encoding (RoPE).
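As a rough illustration, the sinusoidal encoding from the 2017 Transformer assigns each position a vector of sines and cosines at geometrically spaced frequencies, using the standard form PE(pos, 2i) = sin(pos / 10000^(2i/d)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d)). A minimal C++ sketch (with illustrative names) might look like this:

    #include <cmath>
    #include <vector>

    // Sketch of the vanilla sinusoidal positional encoding for one position.
    // Even dimensions get a sine, odd dimensions get a cosine, with the
    // wavelength growing geometrically across the embedding dimensions.
    std::vector<float> sinusoidal_positional_encoding(int position, int model_dim)
    {
        std::vector<float> pe(model_dim);
        for (int i = 0; i < model_dim; i += 2) {
            double freq = std::pow(10000.0, (double)i / (double)model_dim);
            pe[i] = (float)std::sin(position / freq);
            if (i + 1 < model_dim) {
                pe[i + 1] = (float)std::cos(position / freq);
            }
        }
        return pe;
    }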
Some research has attempted to improve the raw speed of positional encoding algorithms. Although positional encoding is not usually a major CPU bottleneck, it can nevertheless be optimized via improved algorithms, approximations (including integer-only versions), and, surprisingly, by removing the PE component entirely with a "NoPE" algorithm.
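As a simple illustration of one such speed optimization (basic precomputation, not the research approaches listed above): sinusoidal embeddings depend only on the position and dimension, not on the input tokens, so the whole table can be computed once up to the maximum context length and then reused, replacing per-token sin/cos calls with a table lookup. A hypothetical sketch:

    #include <cmath>
    #include <vector>

    // Precompute the sinusoidal positional embeddings for all positions up
    // to max_positions, so later lookups avoid recomputing sin/cos.
    std::vector<std::vector<float>> precompute_positional_table(int max_positions, int model_dim)
    {
        std::vector<std::vector<float>> table(max_positions, std::vector<float>(model_dim));
        for (int pos = 0; pos < max_positions; ++pos) {
            for (int i = 0; i < model_dim; i += 2) {
                double freq = std::pow(10000.0, (double)i / (double)model_dim);
                table[pos][i] = (float)std::sin(pos / freq);
                if (i + 1 < model_dim) {
                    table[pos][i + 1] = (float)std::cos(pos / freq);
                }
            }
        }
        return table;  // table[pos] is the positional embedding for that position
    }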