Aussie AI

Load Balancing

Book Excerpt from "Generative AI in C++"

by David Spuler, Ph.D.

If your goal is to serve a high volume of user requests, you need to consider scalable architectures that provide load balancing and fault tolerance. Some of the technologies to consider for load balancing include:

Round Robin DNS
Load Balancer Network Devices
Apache Kafka
Apache Load Balancer
Nginx load balancing

Round robin DNS, or RR DNS, is a simple way to distribute incoming requests across multiple servers, but it isn't true load balancing because it doesn't take into account the load or availability of each server. On the upside, it requires no extra server components and can be set up simply by adding multiple address records to your domain's DNS.
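The rotation idea behind round robin can be sketched in a few lines of C++. This is only an illustration of the selection policy, not of DNS itself: the class name RoundRobinSelector and the server addresses used with it are hypothetical. Note that, like RR DNS, it never checks whether a server is busy or down.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical round-robin selector: hands out servers in a fixed cycle,
// ignoring load and availability (the weakness described above).
class RoundRobinSelector {
public:
    explicit RoundRobinSelector(std::vector<std::string> servers)
        : servers_(std::move(servers)) {
        assert(!servers_.empty());  // at least one server is required
    }

    // Return the next server address, wrapping around at the end of the list.
    const std::string& next() {
        const std::string& chosen = servers_[index_];
        index_ = (index_ + 1) % servers_.size();
        return chosen;
    }

private:
    std::vector<std::string> servers_;
    std::size_t index_ = 0;
};
```

With three servers in the list, four successive calls to next() return the first, second, and third servers and then the first again.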

Kafka is a more scalable production tool with advanced features such as clustering. Its main advantage in a large architecture is that it is a pre-built tool, purpose-designed for handling a high-volume incoming event stream. It has a highly efficient distributed architecture: you send requests to a Kafka cluster, and multiple listeners (consumers, in Kafka terminology) can be created to process them. For each input prompt, a Kafka listener dequeues the request and then forwards the prompt text to its associated AI engine.
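The dequeue-and-forward pattern can be sketched as follows. This is a minimal illustration under stated assumptions, not a Kafka client: an in-memory, mutex-protected queue stands in for the Kafka cluster, and the names PromptQueue, AiEngine, and run_listener are hypothetical. A real deployment would replace the queue with a Kafka consumer library.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// In-memory stand-in for a Kafka topic (assumption: no real Kafka client is used).
// The mutex lets several listeners share one queue safely.
class PromptQueue {
public:
    void enqueue(std::string prompt) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push(std::move(prompt));
    }

    // Move the oldest pending prompt into 'out'; return false if none is waiting.
    bool try_dequeue(std::string& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (pending_.empty()) return false;
        out = std::move(pending_.front());
        pending_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::queue<std::string> pending_;
};

// Hypothetical engine interface: prompt text in, response text out.
using AiEngine = std::function<std::string(const std::string&)>;

// One listener: dequeue each waiting prompt and forward it to the engine.
std::vector<std::string> run_listener(PromptQueue& queue, const AiEngine& engine) {
    assert(engine != nullptr);  // the listener needs a callable engine
    std::vector<std::string> responses;
    std::string prompt;
    while (queue.try_dequeue(prompt)) {
        responses.push_back(engine(prompt));
    }
    return responses;
}
```

In this sketch the listener returns once the queue is empty; a production listener would instead block and wait for further requests.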

Apache Load Balancer is a freeware load balancing add-on. For more information, see the mod_proxy and mod_proxy_balancer Apache modules. Nginx also supports several load balancing approaches, such as round robin and least connections. Refer to the Nginx documentation for details.
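To show how least connections differs from round robin, here is a rough C++ sketch of the selection rule. It is not Nginx's implementation; the Backend struct and the pick_least_connections function are hypothetical names, and the connection counts are assumed to be maintained elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record of one backend server and its current open connections.
struct Backend {
    std::string address;
    int active_connections;
};

// Return the index of the backend with the fewest active connections.
// Ties go to the earliest backend in the list.
std::size_t pick_least_connections(const std::vector<Backend>& backends) {
    assert(!backends.empty());  // at least one backend is required
    std::size_t best = 0;
    for (std::size_t i = 1; i < backends.size(); ++i) {
        if (backends[i].active_connections < backends[best].active_connections) {
            best = i;
        }
    }
    return best;
}
```

Unlike round robin, this policy responds to load: a server that is stuck on slow requests accumulates connections and stops being chosen until it catches up.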
