Aussie AI

AI Engine Reliability

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.

We want our AI model to be predictable, not irrational. And it should show bravery in the face of adversity, rather than crumble into instability at the first sign of prompt confusion. At a high level, there are various facets to AI engine reliability:

  • Accuracy of model responses
  • Safety issues (e.g., bias, toxicity)
  • Engine basic quality (e.g., not crashing or spinning)
  • Resilience to dubious inputs
  • Scalability to many users

How to make a foundation model that's smart and accurate is a whole discipline in itself. The issues include the various training and other algorithms in the Transformer architecture, along with the general quality of the training dataset. Similarly, safety issues such as bias or toxic responses are an ongoing area of research, and aren't covered in this chapter.

Aspects of the C++ code inside your Transformer engine are important for its basic quality. Writing C++ that doesn't crash or spin (i.e., hang in an endless loop) is a code quality issue that many techniques address. These include internal coding methods such as assertions and self-testing, along with external quality assurance techniques that examine the product from the outside.
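As a minimal sketch of assertions and self-testing, consider a small vector kernel of the kind found inside an engine. The function names `vector_dot_product` and `selftest_vector_dot_product` are hypothetical illustrations, not part of any particular engine. The assertion guards a precondition that only a programming error could violate, and the self-test compares the kernel against results worked out by hand.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical kernel: dot product of two equal-length vectors.
float vector_dot_product(const std::vector<float>& a, const std::vector<float>& b)
{
    // Precondition check: mismatched sizes indicate a caller bug.
    // Note that assert() compiles to nothing when NDEBUG is defined.
    assert(a.size() == b.size());

    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// Self-test: returns true only if every check passes.
bool selftest_vector_dot_product()
{
    const std::vector<float> a{1.0f, 2.0f, 3.0f};
    const std::vector<float> b{4.0f, 5.0f, 6.0f};
    const std::vector<float> empty;

    bool ok = true;
    // 1*4 + 2*5 + 3*6 = 32
    ok = ok && std::fabs(vector_dot_product(a, b) - 32.0f) < 1e-6f;
    // Edge case: empty vectors should give zero.
    ok = ok && vector_dot_product(empty, empty) == 0.0f;
    return ok;
}
```

A self-test like this could be run at startup in debug builds or from a unit test harness. Which option suits a given engine is a design choice that this sketch does not settle.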

Resilience is tolerance of situations that were largely unexpected by programmers. Appropriate handling of questionable inputs is a cross between a coding issue and a model accuracy issue, depending on what type of inputs are causing the problem. Similarly, the engine should be able to cope with resource failures, or at least fail gracefully with a meaningful response to users in such cases. Checking return statuses and handling exceptions are the well-known techniques here.
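The idea of failing gracefully can be sketched as a thin wrapper around the inference call. Everything named here is an assumption made for illustration: `EngineStatus`, `EngineReply`, `handle_request`, and the stand-in `run_inference` (which simulates an out-of-memory failure for one magic prompt) are hypothetical, not the API of any real engine. The wrapper validates the input, catches exceptions, and always returns both a status code and a message suitable for showing to a user.

```cpp
#include <exception>
#include <new>
#include <stdexcept>
#include <string>

// Hypothetical status codes for an engine request.
enum class EngineStatus { Ok, BadInput, ResourceFailure, InternalError };

// Hypothetical reply type: a status plus text that is safe to show to a user.
struct EngineReply {
    EngineStatus status;
    std::string text;
};

// Stand-in for the real inference call (an assumption for this sketch).
// It simulates a memory allocation failure for one special prompt.
std::string run_inference(const std::string& prompt)
{
    if (prompt == "simulate-oom") {
        throw std::bad_alloc();
    }
    return "echo: " + prompt;
}

// Wrapper that never lets an exception escape to the caller.
EngineReply handle_request(const std::string& prompt)
{
    // Defensive input check before doing any expensive work.
    if (prompt.empty()) {
        return {EngineStatus::BadInput, "Please enter a prompt."};
    }
    try {
        return {EngineStatus::Ok, run_inference(prompt)};
    } catch (const std::bad_alloc&) {
        // Resource failure: report it in terms a user can act on.
        return {EngineStatus::ResourceFailure,
                "The service is low on memory. Please try again shortly."};
    } catch (const std::exception&) {
        return {EngineStatus::InternalError,
                "Something went wrong while processing your request."};
    } catch (...) {
        // Catch-all for non-standard exception types.
        return {EngineStatus::InternalError,
                "Something went wrong while processing your request."};
    }
}
```

Callers are then expected to check `status` before using `text`, which is the return-status half of the technique. A production engine would presumably also log the underlying error details, which this sketch omits.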

A system is only as reliable as its worst component. Hence, it's not just the Transformer and LLM to consider, but also the quality of the other components, such as:

  • Backend server software (e.g., web server, request scheduler)
  • RAG components (e.g., retriever and document database)
  • Vector database
  • Application-specific logic (i.e., whatever your “AI thingy” does)
  • Output formatting component
  • User interface

The rest of this chapter is about how to make your C++ code reliable, whether it's in an AI engine or other components. This includes various aspects of “code quality” and also ways to tolerate problems, such as exception handling and defensive programming.

 
