Aussie AI
Data Structure Double Initialization
Book Excerpt from "Generative AI in C++" by David Spuler, Ph.D.
If you have an initialization routine that does a lot of work, it sometimes becomes a slug by accident. I'm not talking about a single variable initialization, but the initialization of a large program data structure at startup, like a precomputed lookup-table or a perfect hashing algorithm. In the design patterns vocabulary, such a situation is a “singleton” data structure, where only a single object ever exists in the program. It's easy to lose track of whether its initialization routine has been called, and then it gets called twice (or more!).
An example would be some of the precomputation methods whereby a large lookup-table is initialized at program startup. For example, a 24-bit lookup table has been used elsewhere in this book to optimize AI activation functions such as GELU.
The way to avoid the slug of double-initialization is simply to track calls to the initialization routine. The idiom that I use is a local static variable of type bool at the start of the initialization function:
    static bool s_once = false;
    if (s_once) {
        yassert(!s_once);  // Should be once only
        return;  // Avoid double initialization!
    }
    s_once = true;
Another way is to actually count the calls with an integer, which is a generalization that works for additional scenarios:
    static int s_calls = 0;
    ++s_calls;
    if (s_calls > 1) {
        yassert(s_calls <= 1);  // Should be once only
        return;  // Avoid double initialization!
    }
Note that I've shown how to wrap these multiple lines of code up into a single “yassert_once” macro in Chapter 41, if you want a simpler method.
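To see the flag idiom in context, here is a minimal sketch of a guarded startup routine that precomputes a lookup table. The names init_sqrt_table, g_sqrt_table and g_times_built are hypothetical, chosen only for this illustration, and the book's yassert macro is replaced by a plain early return so that the example compiles on its own. It also assumes single-threaded startup, since a plain static bool flag is not a thread-safe guard.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical table and counter, named only for this illustration.
static std::vector<float> g_sqrt_table;
static int g_times_built = 0;  // how often the expensive work really ran

// Expensive startup routine protected by the local static flag idiom.
void init_sqrt_table()
{
    static bool s_once = false;
    if (s_once) {
        // The book's version also calls yassert(!s_once) here to report the bug.
        return;  // Avoid double initialization
    }
    s_once = true;

    const int n = 1 << 16;  // 65,536 entries (assumed size for the sketch)
    g_sqrt_table.resize(n);
    for (int i = 0; i < n; ++i) {
        g_sqrt_table[i] = std::sqrt(static_cast<float>(i));
    }
    ++g_times_built;
}

// Usage sketch: the second call costs only a test and a return.
void demo_sqrt_table()
{
    init_sqrt_table();
    init_sqrt_table();
    assert(g_times_built == 1);
}
```

The counter exists only to make the effect observable; in production code the guard itself is all that is needed.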
Singleton global objects.
If you've done the hard yards to declare a big data structure like this as its own class, then you can simply instantiate only one object (i.e. as a global). The C++ class infrastructure does well in ensuring that a constructor is only called once. Even so, it may be worthwhile to declare a static data member and use similar logic to ensure that initialization on this object isn't ever done twice.
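One possible shape for the class-based version is sketched below, using the counting form of the guard in a static data member. The class name SquareTable, its members, and the global g_square_table are all hypothetical, and the sketch assumes that exactly one object of the class is ever created.

```cpp
#include <cassert>
#include <vector>

// Hypothetical singleton-style class, named only for this illustration.
class SquareTable {
public:
    SquareTable() { init(); }

    // May be called again by mistake; repeat calls do no work.
    void init()
    {
        ++s_init_calls;
        if (s_init_calls > 1) {
            return;  // Avoid double initialization
        }
        m_table.resize(1024);
        for (int i = 0; i < 1024; ++i) {
            m_table[i] = i * i;
        }
        ++s_builds;
    }

    int lookup(int i) const { return m_table[i]; }
    static int init_calls() { return s_init_calls; }
    static int builds() { return s_builds; }

private:
    std::vector<int> m_table;
    static int s_init_calls;  // counts every call to init()
    static int s_builds;      // counts how often the table was really built
};

int SquareTable::s_init_calls = 0;
int SquareTable::s_builds = 0;

SquareTable g_square_table;  // the single global instance

// Usage sketch: an accidental extra init() call does not rebuild the table.
void demo_square_table()
{
    g_square_table.init();
    assert(SquareTable::builds() == 1);
}
```

Because a static data member is shared by every instance, a second SquareTable object would skip its own initialization entirely. That is acceptable only under the one-object assumption stated above.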
In any of these situations, it's a worthwhile investment of a couple of CPU instructions, an increment and a test, to avoid accidentally running the whole routine again.
Since the code is virtually identical for all cases, to avoid copy-paste typos, you could even hide these few statements behind a standard C++ preprocessor macro with a name of your choosing. Or you could even use an inline function with the “return” statement changed to throwing an exception.
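Both options might look something like the following sketch. The macro RETURN_IF_ALREADY_CALLED and the function throw_if_already_called are hypothetical names, not the book's yassert_once macro from Chapter 41. Note one design difference: a static local inside a shared inline function would be a single flag for all callers, so the function version takes the caller's own flag by reference.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical macro; usable only inside functions that return void.
// Each expansion gets its own static flag in its own block scope.
#define RETURN_IF_ALREADY_CALLED()      \
    do {                                \
        static bool s_once = false;     \
        if (s_once) return;             \
        s_once = true;                  \
    } while (0)

// Hypothetical inline helper: throws instead of returning quietly.
inline void throw_if_already_called(bool& called, const char* name)
{
    if (called) {
        throw std::logic_error(std::string(name) + ": double initialization");
    }
    called = true;
}

static int g_a_builds = 0;  // stand-ins for the expensive work
static int g_b_builds = 0;

void init_a()
{
    RETURN_IF_ALREADY_CALLED();
    ++g_a_builds;
}

void init_b()
{
    static bool s_called = false;
    throw_if_already_called(s_called, "init_b");
    ++g_b_builds;
}

// Usage sketch: the macro version silently ignores the repeat call.
void demo_guards()
{
    init_a();
    init_a();
    assert(g_a_builds == 1);
}
```

The macro keeps the call site to one line but hides a return statement, which some teams dislike. The exception version makes the double call impossible to ignore, at the cost of requiring callers to handle or deliberately avoid the throw.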