Aussie AI
Load Balancing
Book Excerpt from "Generative AI in C++"
by David Spuler, Ph.D.
If your goal is to serve a high volume of user requests, then you need to consider higher-end scalable architectures with load balancing and fault tolerance. Some of the technologies to consider for load balancing include:
- Round Robin DNS
- Load Balancer Network Devices
- Apache Kafka
- Apache Load Balancer
- Nginx load balancing
Round robin DNS, or RR DNS, is a simple way to distribute incoming requests to multiple servers, but it isn't true load balancing because it doesn't consider load or availability of the server connections. On the upside, it requires no extra server components and can be done simply by manipulating your domain DNS records.
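The rotation behind round robin can be sketched in a few lines of C++. This is only an in-process illustration of the idea, not how a DNS server is implemented; the class name `RoundRobinPool` and the addresses used with it are hypothetical. Note that `next()` hands out the next address without checking load or availability, which is exactly the limitation described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration: cycles through a fixed list of server
// addresses, similar in spirit to how round robin DNS rotates through
// its address records.
class RoundRobinPool {
public:
    explicit RoundRobinPool(std::vector<std::string> servers)
        : servers_(std::move(servers)) {
        assert(!servers_.empty());  // Rotation needs at least one server
    }

    // Returns the next server in rotation. Nothing here looks at server
    // load or availability before handing out the address.
    const std::string& next() {
        const std::string& chosen = servers_[index_];
        index_ = (index_ + 1) % servers_.size();
        return chosen;
    }

private:
    std::vector<std::string> servers_;
    std::size_t index_ = 0;
};
```

With three addresses in the pool, successive calls to `next()` return the first, second, and third address, and then wrap around to the first again, even if one of those servers is down or overloaded.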
Kafka is a more scalable production tool with advanced features such as clustering. Its main advantage in a large architecture is that it is a pre-built tool, purpose-designed for handling a high-volume incoming event stream. It has a highly efficient distributed architecture, where you send requests to a Kafka cluster, and multiple listeners can be created to process incoming requests. For each input prompt, a Kafka listener would dequeue the request, and then forward the prompt text to its associated AI engine.
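The dequeue-and-forward pattern can be sketched with standard C++ threads. This is an in-process stand-in only, and it does not use the Kafka client API: `RequestQueue`, `run_listeners`, and the `engine` callback are hypothetical names, with the callback standing in for whatever forwards a prompt to an AI engine.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical in-process stand-in for a stream of incoming requests.
// This is NOT the Kafka API; it only mimics the enqueue/dequeue pattern.
class RequestQueue {
public:
    // Producer side: add one prompt and wake a waiting listener.
    void push(std::string prompt) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(prompt));
        }
        ready_.notify_one();
    }

    // Signal that no more prompts will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Listener side: block until a prompt is available. Returns false
    // once the queue has been closed and fully drained.
    bool pop(std::string& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::string> items_;
    bool closed_ = false;
};

// Starts `num_listeners` threads. Each one dequeues prompts and forwards
// them to `engine`, which is called concurrently and so must be
// thread-safe. Blocks until the queue is closed and drained, then
// returns the number of prompts handled.
std::size_t run_listeners(
    RequestQueue& queue, std::size_t num_listeners,
    const std::function<void(const std::string&)>& engine) {
    assert(num_listeners > 0);
    std::atomic<std::size_t> handled{0};
    std::vector<std::thread> listeners;
    for (std::size_t i = 0; i < num_listeners; ++i) {
        listeners.emplace_back([&queue, &engine, &handled] {
            std::string prompt;
            while (queue.pop(prompt)) {
                engine(prompt);  // Forward the prompt text to the engine
                ++handled;
            }
        });
    }
    for (auto& t : listeners) t.join();
    return handled.load();
}
```

Each prompt is handled by exactly one listener, and whichever listener is free takes the next prompt, so work spreads across the listeners. A real Kafka deployment adds what this sketch omits, such as persistence, partitioning, and distribution across machines.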
Apache Load Balancer is a freeware load balancing add-on. For more information, see the mod_proxy and mod_proxy_balancer Apache modules.
Nginx also supports multiple load balancing approaches, such as round robin and least connections. Refer to the Nginx documentation for details.
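To show how least connections differs from simple rotation, here is a minimal C++ sketch of the selection rule. It is not Nginx's implementation; the class name `LeastConnectionsBalancer` and its methods are hypothetical, and servers are identified only by index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration of the "least connections" rule: each new
// request goes to the server that currently has the fewest active
// connections.
class LeastConnectionsBalancer {
public:
    explicit LeastConnectionsBalancer(std::size_t num_servers)
        : active_(num_servers, 0) {
        assert(num_servers > 0);
    }

    // Picks the least-loaded server (ties go to the lowest index),
    // counts the new connection against it, and returns its index.
    std::size_t acquire() {
        auto it = std::min_element(active_.begin(), active_.end());
        ++(*it);
        return static_cast<std::size_t>(it - active_.begin());
    }

    // Call when a request finishes so that the server's count drops.
    void release(std::size_t server) {
        assert(server < active_.size() && active_[server] > 0);
        --active_[server];
    }

    // Current number of active connections on one server.
    std::size_t active(std::size_t server) const {
        return active_.at(server);
    }

private:
    std::vector<std::size_t> active_;
};
```

Unlike a fixed rotation, the choice here depends on current load: a server that finishes its requests quickly is picked again sooner, while a server tied up with slow requests is passed over.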