Aussie AI
AI Technology Trends
Book Excerpt from "Generative AI in C++"
by David Spuler, Ph.D.
Multi-model AI is here already. We're in the early stages of discovering what can be achieved by putting multiple AI models together. The formal research term for this is “ensemble” AI. For example, GPT-4 is rumored to be an eight-model architecture, and this will spur on many similar projects. As multiple-model approaches achieve greater levels of capability, this will in turn create further demand for AI models and their underlying infrastructure. This will amplify the need for optimizations in the underlying C++ engines.
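One of the simplest ensemble methods is majority voting, which can be sketched in a few lines of C++. This is only an illustration of the general idea, not how GPT-4 or any particular product combines its models. The `Model` type and the `ensemble_vote` function are hypothetical names, and each "model" is just a callable that maps a prompt to an answer.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a model: any callable from prompt to answer.
using Model = std::function<std::string(const std::string&)>;

// Majority vote: ask every model and return the most common answer.
// Ties go to the answer that sorts first, because std::map iterates in
// sorted key order and only a strictly higher count replaces the leader.
// Returns an empty string if there are no models.
std::string ensemble_vote(const std::vector<Model>& models,
                          const std::string& prompt) {
    std::map<std::string, int> votes;
    for (const auto& model : models) {
        ++votes[model(prompt)];
    }
    std::string best;
    int best_count = 0;
    for (const auto& entry : votes) {
        if (entry.second > best_count) {
            best = entry.first;
            best_count = entry.second;
        }
    }
    return best;
}
```

Note that every model runs on every prompt, so the inference cost grows with the number of models, which is one reason multi-model designs increase the pressure on engine efficiency.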
Multimodal Engines. Multimodality is the ability of an AI to understand inputs in both text and images, and also to output the same. Google Gemini is a notable large multimodal model. This area of technology is only at the beginning of its journey.
Longer Contexts. The ability of AI engines to handle longer texts has been improving, both in terms of computational efficiency and better understanding and generation results (called “length generalization”). GPT-2 had a context window of 1024 tokens, GPT-3 had 2048, and GPT-4 originally had versions from 4k up to 32k, but has now advanced to 128k tokens as I write this (November, 2023). A typical novel runs from about 50,000 to 200,000 words, so we're getting to the neighborhood of having AI engines generate a full work from a single prompt, although, at present, the quality is rather lacking compared to professional writers.
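Context windows are measured in tokens while novels are measured in words, so the comparison needs a conversion. Here is a back-of-the-envelope sketch, assuming the commonly quoted rule of thumb of roughly 0.75 English words per token. That ratio is an assumption, not an exact figure, and it varies by tokenizer and by text. The function names are hypothetical.

```cpp
// Assumed rule of thumb for English text: about 0.75 words per token.
// Real ratios depend on the tokenizer and the text being encoded.
constexpr double kAssumedWordsPerToken = 0.75;

// Estimate how many words fit in a context window of the given size.
constexpr long estimated_words(long context_tokens) {
    return static_cast<long>(context_tokens * kAssumedWordsPerToken);
}

// True if a text of `words` words fits in the window under this estimate.
constexpr bool fits_in_context(long words, long context_tokens) {
    return words <= estimated_words(context_tokens);
}
```

Under this estimate, `estimated_words(128000)` comes to 96,000 words, so a 128k window covers a 50,000-word novel but falls well short of a 200,000-word one.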
AI Gadgets. The use of AI in the user interface has made alternative form factors viable. Some of the novel uses of AI in hardware gadgets include the Rabbit R1, Humane Ai Pin, and Rewind Pendant.
Intelligent Autonomous Agents (IAAs). These types of “smart agents” will work on a continual basis, rather than waiting for human requests. The architecture is a combination of an AI engine with a datastore and a scheduler.
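The three-part architecture can be sketched as a small C++ class. This is a minimal illustration under stated assumptions, not a production design: `Engine`, `Datastore`, `Task`, and `Agent` are hypothetical names, the engine is any callable from prompt to response, the datastore is an in-memory map, and time is a simple logical counter rather than a real clock.

```cpp
#include <functional>
#include <queue>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an AI engine: maps a prompt to a response.
using Engine = std::function<std::string(const std::string&)>;

// Hypothetical datastore: a key-value memory that the agent writes to.
using Datastore = std::unordered_map<std::string, std::string>;

// A scheduled task: run `prompt` at logical time `due` and
// store the engine's response under `key`.
struct Task {
    long due;
    std::string key;
    std::string prompt;
};

// Comparator that puts the earliest due time at the top of the queue.
struct LaterFirst {
    bool operator()(const Task& a, const Task& b) const {
        return a.due > b.due;
    }
};

class Agent {
public:
    explicit Agent(Engine engine) : engine_(std::move(engine)) {}

    // Scheduler input: queue a task for a later tick.
    void schedule(Task task) { tasks_.push(std::move(task)); }

    // Run every task due at or before `now`; returns how many ran.
    int tick(long now) {
        int ran = 0;
        while (!tasks_.empty() && tasks_.top().due <= now) {
            Task task = tasks_.top();
            tasks_.pop();
            store_[task.key] = engine_(task.prompt);
            ++ran;
        }
        return ran;
    }

    const Datastore& store() const { return store_; }

private:
    Engine engine_;
    Datastore store_;
    std::priority_queue<Task, std::vector<Task>, LaterFirst> tasks_;
};
```

The point of the design is that `tick()` is driven by a timer loop rather than by a user request, which is what lets the agent work on a continual basis.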
Small Models. Although the mega-size foundation models still capture all the headlines, small and medium-sized models are becoming more common in both open source and commercial settings. They even have their own acronym now: Small Language Models (SLMs). Notably, Microsoft has been doing some work in this area with its Orca and Phi models. Apparently 7B is “small” now.
Specialized Models. High quality, focused training data can obviate the need for a monolithic model. Training a specialized model for a particular task can be effective, and at a much lower cost. Expect to see a lot more of this in medicine, finance, and other industry verticals.
Data Feed Integrations. AI engines cannot answer every question alone. They need to access data from other sources, such as the broad Internet or specific databases such as real estate listings or medical research papers. Third-party data feeds can be integrated using a RAG-style architecture.
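The basic RAG flow is to retrieve the most relevant piece of external data and then prepend it to the user's question before calling the engine. Here is a toy sketch of that flow. As a loudly stated assumption, it scores documents by simple word overlap, whereas real RAG systems typically use vector embeddings and a vector database. All function names are hypothetical.

```cpp
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Split text into a set of lowercase words, ignoring punctuation.
std::set<std::string> words_of(const std::string& text) {
    std::set<std::string> out;
    std::string word;
    for (unsigned char c : text) {
        if (std::isalnum(c)) {
            word += static_cast<char>(std::tolower(c));
        } else if (!word.empty()) {
            out.insert(word);
            word.clear();
        }
    }
    if (!word.empty()) out.insert(word);
    return out;
}

// Count the distinct words that the query and the document share.
int overlap(const std::set<std::string>& query, const std::string& doc) {
    int score = 0;
    for (const auto& w : words_of(doc)) {
        score += static_cast<int>(query.count(w));
    }
    return score;
}

// Return the index of the best-matching document,
// or -1 if no document shares a word with the question.
int retrieve(const std::string& question,
             const std::vector<std::string>& docs) {
    const std::set<std::string> query = words_of(question);
    int best = -1;
    int best_score = 0;
    for (int i = 0; i < static_cast<int>(docs.size()); ++i) {
        const int score = overlap(query, docs[i]);
        if (score > best_score) {
            best_score = score;
            best = i;
        }
    }
    return best;
}

// Prepend the retrieved text to the question to form the engine's prompt.
std::string build_prompt(const std::string& question,
                         const std::vector<std::string>& docs) {
    const int i = retrieve(question, docs);
    const std::string context =
        (i >= 0) ? docs[i] : std::string("(no matching data)");
    return "Context: " + context + "\nQuestion: " + question;
}
```

The engine itself is unchanged in this arrangement. It simply receives a longer prompt that already contains the third-party data.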
Tool Integrations. Answering some types of questions requires integration with various “tools” that the AI engine can use for supplemental processing of user requests. For example, answering “What time is it?” is not possible even after training on the entire Wikipedia corpus, but requires integration with a clock. Implementing an AI engine so that it knows both when and how to access tools is a complex engineering issue.
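The "when" and "how" of tool use can be separated in code, as in the following sketch of the clock example. It rests on a big simplifying assumption: the "when" decision is a naive keyword match, whereas in a real engine the model itself usually decides whether to call a tool. The names `Tool`, `clock_tool`, `pick_tool`, and `answer` are hypothetical.

```cpp
#include <ctime>
#include <functional>
#include <map>
#include <string>

// A tool is any callable that returns text for the engine to use.
using Tool = std::function<std::string()>;

// Hypothetical clock tool: the current local time formatted as HH:MM.
std::string clock_tool() {
    std::time_t now = std::time(nullptr);
    char buf[16];
    std::strftime(buf, sizeof buf, "%H:%M", std::localtime(&now));
    return buf;
}

// Decide WHEN a tool is needed. This keyword trigger is a toy
// assumption; it would also fire on words such as "sometimes".
std::string pick_tool(const std::string& request) {
    if (request.find("time") != std::string::npos) return "clock";
    return "";
}

// Decide HOW to use it: look up the named tool and call it,
// otherwise fall back to the model's own answer (a placeholder here).
std::string answer(const std::string& request,
                   const std::map<std::string, Tool>& tools) {
    const auto it = tools.find(pick_tool(request));
    if (it != tools.end()) return "Tool result: " + it->second();
    return "(answer from the model alone)";
}
```

Registering the real clock is a one-liner, `tools["clock"] = clock_tool;`, and keeping tools behind a name lookup means a test can substitute a fixed-time fake.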
The Need for Speed. The prevailing problem at the moment is that AI engines are too inefficient, requiring too much computation and too many GPU cycles. Enter C++ as the savior? Well, yes and no. C++ is already in every AI stack, so the solution will be better use of C++ combined with research into better algorithms and optimization techniques, and increasingly powerful hardware underneath the C++ code.