Other Types of Neural Networks
Book Excerpt from "Generative AI in C++"
by David Spuler, Ph.D.
The Transformer was a breakthrough in the evolution of neural networks. One of its main advantages was its capacity to perform calculations in parallel, allowing intelligence to be scaled up through sheer brute-force computation. This enabled a massive increase in model sizes, into the multi-billion-parameter scale of what we now call Large Language Models (LLMs).
Before the Transformer, there were many different neural network architectures. Several of these designs are still being used today in areas where they are stronger than Transformers.
Recurrent Neural Networks (RNNs). An early type of neural network that works iteratively through a sequence. An RNN processes its input one token at a time, creating its output response, and then feeds that output back in as an input to its next step. Hence, it is “recurrent” in processing its own output, an idea also known as “auto-regressive” mode when it appears in Transformers. Transformers have largely displaced RNNs for text processing and generative AI, but there are still research papers attempting to revive RNNs with new advancements, or to create hybrid Transformer-RNN architectures.
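To illustrate the recurrence, here is a minimal C++ sketch of one step of a simple Elman-style RNN. The weight matrices Wx and Wh and the layout are hypothetical illustrations, not code from any particular model; the key point is that each step depends on the previous hidden state, so the sequence must be processed serially, unlike the Transformer's parallelism.

#include <cmath>
#include <vector>

using Vec = std::vector<float>;
using Mat = std::vector<Vec>;

// Matrix-vector multiply: out = M * v
static Vec matvec(const Mat& M, const Vec& v) {
    Vec out(M.size(), 0.0f);
    for (size_t i = 0; i < M.size(); ++i)
        for (size_t j = 0; j < v.size(); ++j)
            out[i] += M[i][j] * v[j];
    return out;
}

// One recurrent step: the new hidden state depends on the current
// input token AND the previous hidden state (the "recurrence").
static Vec rnn_step(const Mat& Wx, const Mat& Wh, const Vec& b,
                    const Vec& x_t, const Vec& h_prev) {
    Vec a = matvec(Wx, x_t);      // contribution from current input
    Vec r = matvec(Wh, h_prev);   // contribution from previous state
    Vec h(a.size());
    for (size_t i = 0; i < h.size(); ++i)
        h[i] = std::tanh(a[i] + r[i] + b[i]);
    return h;
}

// Processing a whole sequence is inherently serial: each step must
// wait for the previous hidden state before it can start.
static Vec run_rnn(const Mat& Wx, const Mat& Wh, const Vec& b,
                   const std::vector<Vec>& inputs, Vec h) {
    for (const Vec& x_t : inputs)
        h = rnn_step(Wx, Wh, b, x_t, h);
    return h;  // final hidden state
}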
Generative Adversarial Networks (GANs). These are image-generating neural networks built from two combined models: one generates candidate images (the “generator”), and the other evaluates them (the “discriminator”). Through a weird kind of “fighting” between the two models, the generator gradually creates better images that fool the discriminator. The results are surprisingly effective, and this technology is still in use today.
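As a rough illustration of the adversarial loop, here is a toy C++ sketch. The Generator and Discriminator classes are hypothetical stand-ins (real GANs use deep networks with gradient-based updates); the point is the alternating structure in which each model is trained against the other.

#include <cstdio>
#include <random>
#include <vector>

struct Image { std::vector<float> pixels; };

// Toy stand-in for sampling a real image from the training set.
static Image sample_real() {
    Image img;
    img.pixels.assign(16, 0.9f);
    return img;
}

// Toy generator: maps random noise to a candidate "image".
struct Generator {
    std::mt19937 rng{42};
    Image generate() {
        std::uniform_real_distribution<float> d(0.0f, 1.0f);
        Image img;
        img.pixels.resize(16);
        for (auto& p : img.pixels) p = d(rng);
        return img;
    }
    void update(float gen_loss) { /* gradient step would go here */ (void)gen_loss; }
};

// Toy discriminator: scores how "real" an image looks (0..1).
struct Discriminator {
    float score(const Image& img) {
        float mean = 0.0f;
        for (float p : img.pixels) mean += p;
        return mean / img.pixels.size();
    }
    void update(float disc_loss) { /* gradient step would go here */ (void)disc_loss; }
};

int main() {
    Generator gen;
    Discriminator disc;
    for (int step = 0; step < 1000; ++step) {
        Image real = sample_real();
        Image fake = gen.generate();
        // Discriminator is trained to score real images high, fakes low.
        float disc_loss = (1.0f - disc.score(real)) + disc.score(fake);
        disc.update(disc_loss);
        // Generator is trained to produce fakes that score high.
        float gen_loss = 1.0f - disc.score(fake);
        gen.update(gen_loss);
    }
    std::printf("Adversarial training loop finished.\n");
    return 0;
}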
Convolutional Neural Networks (CNNs). Whereas RNNs and Transformers are focused on input sequences, CNNs are better at input data with an inherent structure, especially the spatial structure of images. Modern image processing and computer vision technology still uses CNNs, although enhanced Transformer architectures, such as multimodal or vision transformers, can also be used. CNNs are good at splitting an image into separate input “channels” (e.g., red, green, and blue) and then sliding a “filter” (a small matrix of weights) across each channel. Hence, CNNs have been holding their own against Transformers in areas related to image processing.
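For a concrete picture of the channel-and-filter idea, here is a minimal C++ sketch of a 2D convolution applied per channel. The [channel][row][col] layout is a hypothetical choice for illustration; a full CNN layer would also sum results across channels and use many filters.

#include <vector>

// One channel of an image: a 2D grid of pixel values.
using Plane = std::vector<std::vector<float>>;
// A multi-channel image, e.g. 3 planes for red/green/blue.
using ImageData = std::vector<Plane>;

// Slide a KxK filter over one channel ("valid" convolution, no padding).
static Plane convolve_channel(const Plane& in, const Plane& filter) {
    size_t K = filter.size();
    size_t H = in.size(), W = in[0].size();
    Plane out(H - K + 1, std::vector<float>(W - K + 1, 0.0f));
    for (size_t r = 0; r + K <= H; ++r)
        for (size_t c = 0; c + K <= W; ++c)
            for (size_t i = 0; i < K; ++i)
                for (size_t j = 0; j < K; ++j)
                    out[r][c] += in[r + i][c + j] * filter[i][j];
    return out;
}

// Apply the same filter to each channel separately -- the per-channel
// "filter" idea described above.
static ImageData convolve_image(const ImageData& img, const Plane& filter) {
    ImageData result;
    for (const Plane& channel : img)
        result.push_back(convolve_channel(channel, filter));
    return result;
}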
There are various other types of neural networks, all of which still attract some research attention:
- Long short-term memory (LSTM). A type of RNN.
- Spiking neural networks (SNNs)
- Liquid neural networks (LNNs)
- Quantum neural networks (QNNs)
This book is mostly about Transformers, so the interested reader is referred to the research literature for these architectures. As a general rule, so many research papers are being written about AI that there are literally exceptions to everything. But those intrepid researchers are doing a great service to programmers by giving us lots of gritty algorithms to code up.