Aussie AI

Detecting GPU Support in C++

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.

Detecting at runtime which GPU capabilities are available is even more problematic in C++ than detecting CPU accelerators or SIMD instructions. The available options for GPU detection include:

  • NVIDIA CUDA C++ compiler (nvcc)
  • AMD ROCm
  • Microsoft DirectML (DirectX)
  • Apple Metal
  • Vulkan API (e.g. vkEnumeratePhysicalDevices, vkGetPhysicalDeviceProperties)
  • Low-level GPU shader APIs

NVIDIA requires CUDA code to be compiled with their nvcc compiler, and the compiler itself has builtin mechanisms for testing the GPU capabilities. The results of those tests can also be used to set #define options within the C++ code. In addition, the compiler comes with some builtin defines.
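As a minimal sketch of using the compiler's builtin defines, the code below branches on __CUDACC__ and __CUDACC_VER_MAJOR__, which nvcc predefines when it compiles a source file. The macro MYAPP_USE_CUDA and the three helper functions are hypothetical names chosen for illustration, not part of CUDA. The same file should also compile with a plain C++ compiler, in which case the non-CUDA branches are taken.

```cpp
// Assumption: MYAPP_USE_CUDA is a hypothetical project macro, not an nvcc builtin.
// __CUDACC__ is predefined by nvcc (and other CUDA-aware compilers) for CUDA sources.
#if defined(__CUDACC__)
  #define MYAPP_USE_CUDA 1
#else
  #define MYAPP_USE_CUDA 0
#endif

// True when this translation unit was built by a CUDA compiler.
constexpr bool built_with_cuda()
{
    return MYAPP_USE_CUDA == 1;
}

// Major version of the CUDA compiler, or 0 when CUDA is not in use.
constexpr int cuda_compiler_major()
{
#if defined(__CUDACC_VER_MAJOR__)
    return __CUDACC_VER_MAJOR__;
#else
    return 0;
#endif
}

// Human-readable summary, e.g. for a startup log message.
constexpr const char* build_description()
{
    return built_with_cuda() ? "compiled by a CUDA compiler"
                             : "compiled by a plain C++ compiler";
}

// Note: __CUDA_ARCH__ is a further builtin define, but it is only set while
// nvcc compiles device code, so it is best tested inside __device__ functions.
```

A caller might print build_description() at startup, or wrap GPU-only code paths in #if MYAPP_USE_CUDA. Options of this kind can also be passed on the compiler command line with -D, which is one way to feed the results of a build-time capability test into the C++ code.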

GPU detection is not just determining if a GPU is available. More detail will typically be required, down to “is feature X available” or “which implementation of feature X is available.” For example, NVIDIA has a “GPU Architecture” and a “GPU Feature List” to test for capabilities.
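The distinction between "is a GPU present", "is feature X available", and "which implementation of feature X" can be sketched in plain C++ as below. This is only an illustration under stated assumptions: the type and function names are hypothetical, and the compute capability thresholds (5.3 for native FP16 arithmetic, 7.0 for Tensor Cores, 8.0 for BF16) reflect my reading of NVIDIA's public documentation and should be checked against the feature list for the toolkit in use. In a real CUDA build, the major and minor numbers would typically come from cudaGetDeviceCount and cudaGetDeviceProperties rather than being passed in by hand.

```cpp
// GPU compute capability, e.g. {8, 6} for a device reported as "8.6".
// Hypothetical type for illustration; CUDA reports these as two integers.
struct ComputeCapability {
    int cc_major;
    int cc_minor;
};

// True if cc is greater than or equal to want_major.want_minor.
constexpr bool at_least(ComputeCapability cc, int want_major, int want_minor)
{
    return cc.cc_major > want_major ||
           (cc.cc_major == want_major && cc.cc_minor >= want_minor);
}

// "Is feature X available?" Thresholds are assumptions to verify.
constexpr bool has_native_fp16(ComputeCapability cc)  { return at_least(cc, 5, 3); }
constexpr bool has_tensor_cores(ComputeCapability cc) { return at_least(cc, 7, 0); }
constexpr bool has_bf16(ComputeCapability cc)         { return at_least(cc, 8, 0); }

// "Which implementation of feature X is available?"
enum class MatmulKernel {
    CpuFallback, Fp32Generic, Fp16Native, Fp16TensorCore, Bf16TensorCore
};

// Pick the most capable matrix-multiply variant the device supports.
constexpr MatmulKernel pick_matmul_kernel(bool gpu_present, ComputeCapability cc)
{
    return !gpu_present         ? MatmulKernel::CpuFallback
         : has_bf16(cc)         ? MatmulKernel::Bf16TensorCore
         : has_tensor_cores(cc) ? MatmulKernel::Fp16TensorCore
         : has_native_fp16(cc)  ? MatmulKernel::Fp16Native
                                : MatmulKernel::Fp32Generic;
}
```

The selection runs once at startup, and the result can then be used to choose a function pointer or kernel launch path, so the per-call cost of the capability test is avoided.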

AI Meta-Compilers. The alternative to trying to test GPU capabilities at runtime in C++ is to write code higher up the chain. There are various ways to write cross-platform GPU code at a higher level than plain C++, such as:

  • OpenCL
  • OpenMP
  • SYCL
  • OpenACC

These methods are all designed to make your code portable to different hardware environments. Typically, you write C++-like code, which is then pre-compiled into an internal form that is managed by the wrapper code, and instantiated on the particular platform on which it is currently running.
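To make the portability idea concrete, here is a small sketch using OpenMP, one of the options listed above. It assumes a compiler with OpenMP 4.0 or later when OpenMP is enabled; the function names saxpy and offload_device_count are hypothetical. The target directive asks the OpenMP runtime to run the loop on an offload device such as a GPU, if one is configured. When OpenMP is disabled, the pragma is ignored and the same loop simply runs on the CPU.

```cpp
#include <cassert>
#include <vector>
#if defined(_OPENMP) && _OPENMP >= 201307   // OpenMP 4.0 or later
  #include <omp.h>
#endif

// Number of offload devices the OpenMP runtime can see (0 without OpenMP).
inline int offload_device_count()
{
#if defined(_OPENMP) && _OPENMP >= 201307
    return omp_get_num_devices();
#else
    return 0;
#endif
}

// Computes y = a * x + y element by element.
// With OpenMP offload enabled the loop may run on a GPU; otherwise it
// falls back to the host CPU without any source changes.
inline void saxpy(float a, const std::vector<float>& x, std::vector<float>& y)
{
    assert(x.size() == y.size());
    const float* xp = x.data();
    float* yp = y.data();
    const long n = static_cast<long>(x.size());

    // map(...) describes which arrays are copied to and from the device.
    #pragma omp target teams distribute parallel for map(to: xp[0:n]) map(tofrom: yp[0:n])
    for (long i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

A caller would fill two vectors of equal length and call saxpy(2.0f, x, y); the source code contains no GPU vendor names and no explicit capability tests, because the decision about where the loop runs is left to the compiler and the OpenMP runtime.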

 
