AI Server Hosting Options

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.

How are you running your AI engines? If you're calling a commercial AI API, at least there's a Somebody Else's Problem (SEP) field wrapped around that issue. But if you're running your own model or an open-source model, then you have various options:

  • Cloud server hosting to rent boxes with GPUs (e.g. AWS, Azure, GCP, OVH, etc.)
  • GPU-specific hosting companies (the big companies and several newer startups)
  • Hourly GPU rental (from the same GPU hosting companies)
  • Model hosting (not just GPUs, and again, a choice between big and small companies)

Note that GPUs are not your only concern. You will need some non-GPU boxes to run the other components of your AI production architecture, such as basic Apache/Nginx servers, backend servers, and application logic servers. The AI request servers might run near your AI servers, or could be on separate boxes.
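
For example, an application logic server on a non-GPU box might forward each inference request to the GPU box over plain HTTP. Below is a minimal C++ sketch using libcurl; the internal hostname, port, and endpoint path are hypothetical placeholders, not a real API.

    // Forward an inference request from a backend application server to an
    // AI server running on a separate GPU box. Build with: g++ -lcurl
    #include <curl/curl.h>
    #include <iostream>
    #include <string>

    // libcurl write callback: append the response body to a std::string.
    static size_t append_response(char* data, size_t size, size_t nmemb, void* userp) {
        static_cast<std::string*>(userp)->append(data, size * nmemb);
        return size * nmemb;
    }

    // POST a JSON request to the AI inference box and return the raw response.
    std::string call_ai_server(const std::string& json_request) {
        std::string response;
        CURL* curl = curl_easy_init();
        if (!curl) return response;

        struct curl_slist* headers = nullptr;
        headers = curl_slist_append(headers, "Content-Type: application/json");

        // Hypothetical internal hostname and endpoint for the GPU inference box.
        curl_easy_setopt(curl, CURLOPT_URL, "http://ai-server.internal:8080/v1/generate");
        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, json_request.c_str());
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, append_response);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);

        CURLcode rc = curl_easy_perform(curl);
        if (rc != CURLE_OK) {
            std::cerr << "AI server request failed: " << curl_easy_strerror(rc) << "\n";
        }
        curl_slist_free_all(headers);
        curl_easy_cleanup(curl);
        return response;
    }

    int main() {
        curl_global_init(CURL_GLOBAL_DEFAULT);
        std::cout << call_ai_server(R"({"prompt": "Hello"})") << "\n";
        curl_global_cleanup();
        return 0;
    }

Keeping the HTTP plumbing on the application server means the GPU boxes only run the inference engine itself, and the two tiers can be sized, located, and scaled independently.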

Various ancillary servers may also be needed for your operations, such as:

  • Testbed servers (GPU and non-GPU)
  • Deployment servers (e.g. marshaling new releases)
  • Static file cache servers
  • DNS servers (if you DIY)

You need a way to test your new production architecture before it goes live. The full “deployment” procedure may need a server to manage it, and to handle a rollback procedure if the release fails. You might also need extra boxes to DIY a cache of static files (e.g. images, scripts), or you can use a commercial Content Delivery Network (CDN) provider.
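
As a simple illustration, a pre-go-live smoke test can probe the newly deployed AI server and refuse to promote the release if the check fails. The sketch below assumes a hypothetical staging hostname and /health endpoint, again using libcurl; a real deployment pipeline would run many such checks before switching traffic.

    // Minimal pre-deployment smoke test: probe the candidate AI server's
    // health endpoint (hypothetical URL) and report pass/fail for promotion.
    #include <curl/curl.h>
    #include <cstdio>

    // Returns true if the candidate server answers HTTP 200 within 5 seconds.
    bool health_check(const char* url) {
        CURL* curl = curl_easy_init();
        if (!curl) return false;

        curl_easy_setopt(curl, CURLOPT_URL, url);
        curl_easy_setopt(curl, CURLOPT_NOBODY, 1L);   // HEAD-style request, no body needed
        curl_easy_setopt(curl, CURLOPT_TIMEOUT, 5L);  // fail fast if the box is down

        long http_code = 0;
        bool ok = (curl_easy_perform(curl) == CURLE_OK);
        if (ok) {
            curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &http_code);
            ok = (http_code == 200);
        }
        curl_easy_cleanup(curl);
        return ok;
    }

    int main() {
        curl_global_init(CURL_GLOBAL_DEFAULT);
        const char* candidate = "http://ai-server-staging.internal:8080/health";
        int status = health_check(candidate) ? 0 : 1;
        std::puts(status == 0 ? "Smoke test passed: promote the release."
                              : "Smoke test failed: roll back.");
        curl_global_cleanup();
        return status;
    }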

 
