Aussie AI

AI Engine Portability

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.

Ah, yes, I remember portability. Early portability was whether it was a ZX81 or an 8086. Then it was whether it was SunOS, Solaris, AIX, Ultrix, or Irix (I missed a few). And then it was Windows 95 versus Windows NT. And then it was detecting Windows versus Linux. And then it was iOS or Android.

Which brings us up to date. And now portability for AI in C++ is detecting things like:

  • CPU features
  • OS configuration settings
  • Software package versions
  • Virtual machine settings
  • GPU capabilities

Does AI need portability? Portability is an issue that can be ignored in some AI applications. If you have control over your hardware and software tech stack, you only need one platform to work, and you can optimize for exactly that platform. This wouldn't be true if you're trying to write an engine to run on a user's phone or PC, but is often the case for business applications running inference in the data center. Whether self-hosted or cloud-hosted on virtual machines, you can control the underlying platform. So, feel free to skip this entire discussion in such situations!

On the other hand, you do need portability if your users have different platforms. And even if you have your own data center, you might want to change the underlying GPU hardware at some stage. There are also various generic benefits from keeping most of the C++ code standardized and portable, such as being able to unit test most code on developers' boxes (i.e., without a top-end GPU). Another simple reason is that a large AI application isn't just about matrix multiplication; there's a huge amount of ancillary code that doesn't go near the GPU. Good code design generally dictates that the non-portable parts should at least be wrapped and isolated.
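One way to wrap and isolate the non-portable parts is to hide them behind a small interface, with a plain C++ implementation that runs anywhere. The sketch below is only an assumption about how this might look: the names `VectorBackend`, `PortableCpuBackend`, and `score` are hypothetical and not taken from the book. A GPU-specific backend would be a second class implementing the same interface, kept in its own source file.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical interface: engine code talks to this, never to the hardware API.
class VectorBackend {
public:
    virtual ~VectorBackend() = default;
    virtual float dot(const float* a, const float* b, std::size_t n) const = 0;
};

// Portable reference implementation in standard C++.
// Runs on any developer machine, so it can back the unit tests.
class PortableCpuBackend : public VectorBackend {
public:
    float dot(const float* a, const float* b, std::size_t n) const override {
        float sum = 0.0f;
        for (std::size_t i = 0; i < n; ++i) {
            sum += a[i] * b[i];
        }
        return sum;
    }
};

// Example of ancillary engine code that depends only on the interface.
// Uses the shorter of the two vectors to stay within bounds.
float score(const VectorBackend& backend,
            const std::vector<float>& q,
            const std::vector<float>& k) {
    std::size_t n = q.size() < k.size() ? q.size() : k.size();
    return backend.dot(q.data(), k.data(), n);
}
```

With this shape, a unit test can pass a `PortableCpuBackend` to `score` on a laptop, while production code passes whatever hardware-specific backend is available.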

Portability in C++ programming of AI applications involves correctly running on the underlying tech stack, including the operating system, CPU, and GPU capabilities. Conceptually, there are two levels:

    1. Toleration. The first level of portability is “toleration,” where the program must at least work correctly on whatever platform it finds itself.

    2. Exploitation. The second level is “exploitation” of the specific features of a particular tech stack, such as making the most of whatever GPU hardware is available.

These two levels apply to any application, but especially to AI engines. To get it running fast, you'll need a whole boatload of exploitation deep in your C++ kernels.
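The two levels can be sketched with a single kernel. In the example below, `vector_add_portable` is the toleration path that works everywhere, and `vector_add` is the exploitation path that uses AVX intrinsics when the compiler reports them via the `__AVX__` macro (for example, when building with `-mavx` on GCC or Clang). Both function names are hypothetical, and this is a compile-time choice only; a real engine might also dispatch at runtime.

```cpp
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Toleration: plain C++ that gives correct results on any platform.
void vector_add_portable(const float* a, const float* b,
                         float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

// Exploitation: use AVX when compiled in, else fall back to the portable path.
void vector_add(const float* a, const float* b,
                float* out, std::size_t n) {
#if defined(__AVX__)
    std::size_t i = 0;
    // Process 8 floats per iteration using unaligned loads and stores.
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
    // Handle the leftover elements one at a time.
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
#else
    vector_add_portable(a, b, out, n);
#endif
}
```

Keeping the portable version around is useful even when the fast path is always taken, because it doubles as a reference for checking the optimized kernel's output.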

 

