Aussie AI
Beyond Transformers
-
Book Excerpt from "Generative AI in C++"
-
by David Spuler, Ph.D.
Before considering future breakthroughs beyond Transformers, let's examine their limitations.
- Quadratic computational complexity in the length of the input sequence.
- Static weights that don't change during inference (compare "incremental learning").
- Limitations in mathematical reasoning.
- Attribution and transparency issues.
- Lack of general common sense.
- No real "world model," only a superficial understanding.
Other research directions arise from alternative architectures that outperform Transformers on some types of computation.
- Hybrid RNN-Transformers. The sequence-processing methods of RNNs have some advantages, although Transformers are fairly good at sequences, too.
- Hybrid CNN-Transformers. Combine the CNN's innate image processing abilities with Transformers.
But here's my prediction for what comes after Transformers: more Transformers, by which I mean ensemble architectures that combine multiple models.