
What are Embeddings?

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.

What are Embeddings?

We've spent all this time talking about tokens, and yet, your Transformer doesn't even use them! Your poor starving tensors never get to taste a single token.

The first step of model inference in the Transformer architecture is to convert the input text into a sequence of numbers called tokens. However, these tokens are not used internally by the model, because the next step of Transformer inference immediately converts this sequence of tokens into another internal representation called an "embedding." An embedding is a vector of numbers that represents the information in the token sequence in very complex ways.
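To make that concrete, here is a minimal C++ sketch of the two representations. The type names and the embedding dimension of 4,096 are illustrative choices for this sketch, not anything fixed by the architecture:

    #include <vector>

    // Illustrative only: a token is just an integer ID, whereas an
    // embedding is a dense vector of floats whose length is the
    // model's embedding dimension (4,096 is a hypothetical choice).
    using Token = int;
    using EmbeddingVector = std::vector<float>;

    int main() {
        std::vector<Token> tokens = { 15496, 995 }; // a short token sequence
        EmbeddingVector embedding(4096, 0.0f);      // one 4,096-dimension embedding
        return 0;
    }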

Note that the “embeddings” terminology is unrelated to “embedded” devices such as mobile phones or IoT edge devices. It's simply a different usage of the word.

The mapping from tokens to embeddings is learned during model training. The conversion of a token sequence into a vector of embeddings is a single matrix multiplication using these learned embedding weights, followed by an additional step that adds "positional embeddings." In the Transformer architecture, the vector produced from the matrix of learned embeddings is combined with the heuristic positional embeddings by simple vector addition.
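Here is a simplified C++ sketch of that step, using hypothetical names and nested vectors for clarity; a real engine would use contiguous tensors rather than vectors of vectors. Note that the matrix multiplication collapses to a row lookup, because multiplying the embedding matrix by a one-hot token vector simply selects one row:

    #include <cstddef>
    #include <vector>

    // Hypothetical sketch of the embedding step. Assumed layouts:
    // embed_matrix is [vocab_size][embedding_dim] (learned weights), and
    // pos_embeddings is [max_seq_len][embedding_dim]. Names are illustrative.
    std::vector<std::vector<float>> embed_tokens(
        const std::vector<int>& tokens,
        const std::vector<std::vector<float>>& embed_matrix,
        const std::vector<std::vector<float>>& pos_embeddings)
    {
        std::vector<std::vector<float>> out;
        out.reserve(tokens.size());
        for (std::size_t pos = 0; pos < tokens.size(); ++pos) {
            // Row lookup: equivalent to multiplying a one-hot token vector
            // by the embedding matrix, minus the multiplications by zero.
            std::vector<float> vec = embed_matrix[tokens[pos]];
            // Combine with the positional embedding by element-wise addition.
            for (std::size_t i = 0; i < vec.size(); ++i)
                vec[i] += pos_embeddings[pos][i];
            out.push_back(std::move(vec));
        }
        return out;
    }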

 
