Aussie AI
Beyond Transformers
-
Book Excerpt from "Generative AI in C++"
-
by David Spuler, Ph.D.
Before considering future breakthroughs beyond Transformers, let's examine their limitations.
- Quadratic computational complexity in the length of the input sequence.
- Static weights that don't change during inference (compare "incremental learning").
- Limitations in mathematical reasoning.
- Attribution and transparency issues.
- Lack of general common sense.
- No real "world model," only a superficial understanding.
Other research directions arise from alternative architectures that outperform Transformers on some types of computation.
- Hybrid RNN-Transformers. The sequence-processing methods of RNNs have some advantages, although Transformers are fairly good at sequences, too.
- Hybrid CNN-Transformers. Combine the CNN's innate image processing abilities with Transformers.
But here's my prediction for what comes after Transformers: more Transformers, by which I mean ensemble architectures that combine multiple models.