Aussie AI

Funnel Transformer

  • Last Updated 7 December, 2024
  • by David Spuler, Ph.D.

The Funnel Transformer is an inference optimization method that uses dynamic pruning of the embedding vector. At each layer, the effective length of the embedding vector is reduced by detecting elements whose magnitude is small enough to be ignored. This is similar to dynamically reducing the model's internal dimension, except that the removed elements can be at arbitrary positions in the vector. Hence, it is a form of sparsification of the tensor computations along the embedding dimension.
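The idea of skipping small embedding elements can be illustrated with a minimal sketch in plain Python. This is not taken from any particular implementation: the function names (`prune_small_elements`, `sparse_matvec`, `dense_matvec`), the fixed magnitude threshold, and the toy weights are all assumptions chosen for illustration. It prunes a single vector before one matrix-vector product and compares the result with the dense computation.

```python
from typing import List, Tuple


def prune_small_elements(vec: List[float], threshold: float) -> Tuple[List[int], List[float]]:
    """Keep only elements whose magnitude is at least `threshold`.

    Returns the surviving positions and their values (hypothetical helper).
    """
    kept_idx = [i for i, v in enumerate(vec) if abs(v) >= threshold]
    kept_val = [vec[i] for i in kept_idx]
    return kept_idx, kept_val


def sparse_matvec(weights: List[List[float]], kept_idx: List[int],
                  kept_val: List[float]) -> List[float]:
    """Multiply `weights` (shape [out_dim][embed_dim]) by the pruned vector,
    touching only the columns at the kept embedding positions."""
    return [sum(row[i] * v for i, v in zip(kept_idx, kept_val)) for row in weights]


def dense_matvec(weights: List[List[float]], vec: List[float]) -> List[float]:
    """Reference computation over every embedding position."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


# Toy data (assumed values): three of the six elements are near zero.
embedding = [0.9, 0.001, -0.7, 0.0005, 0.3, -0.002]
weights = [
    [1.0, 2.0, 0.5, -1.0, 0.0, 3.0],
    [0.0, -1.0, 1.0, 4.0, 2.0, 1.0],
]

kept_idx, kept_val = prune_small_elements(embedding, threshold=0.01)
approx = sparse_matvec(weights, kept_idx, kept_val)
exact = dense_matvec(weights, embedding)

print("kept positions:", kept_idx)
print("pruned result: ", approx)
print("dense result:  ", exact)
```

In this sketch the kept positions are scattered across the vector, so the saving comes from skipping individual multiply-adds rather than from shrinking the matrix to a smaller dense shape. A practical threshold would have to be tuned against accuracy, which this toy example does not attempt.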

Related areas of LLM inference optimization include:

Research on Funnel Transformer

More Research on Pruning Types
