Aussie AI
Reducing Data Size
-
Book Excerpt from "Generative AI in C++"
-
by David Spuler, Ph.D.
Reducing Data Size
There are many techniques for reducing the size of program data. These techniques apply to all three types of memory — static, stack and heap storage. In some cases, a method may increase the memory storage in one area to decrease the memory usage in another, which is valid only if the total storage requirements decrease.
Use char
arrays not std::string
.
The use of std::string
is very convenient, but if your program has
many strings, the extra storage used by the string
objects can add up.
Consider managing your own raw char
arrays as C-style strings if you really need the space.
Avoid max-size arrays or buffers. When using an array data structure or buffer, there is temptation to be lazy and just make it bigger than it will need to be. Avoid this temptation and optimize the memory usage properly. Change an oversize array into a dynamically allocated array, if size can be determined easily at runtime.
Smart buffers or smart array classes.
An alternative to using an oversize array or buffer
is to create “smart” classes that manage this,
by automatically extending the array or buffer if more elements are needed.
The std::vector
class is a good way to do this.
Bit vectors. These can be used where information can be reduced to a single Boolean value, such as bit flags or masks. The use of bit vectors is very compact in terms of space, and there are standard C++ libraries to implement these efficiently.
Unions.
When using a lot of structures, space can be reduced by overlaying the data fields. This
can only be done if the fields to be overlayed are mutually exclusive (i.e. they never have
active data in them at the same time). There is a special C++ data type for this purpose: the
union
.
Linearize multi-dimensional dynamic arrays. Use the simpler and smaller size of a one-dimensional array, with the two-dimensional structure mapped onto it with index calculations. This adds more runtime cost, but saves space over multiple levels of dynamic array allocations.
Reusing space.
One way to conserve memory is to reuse the space used by a variable. The union
data
type is an example of this general idea, and another is reusing variables for different
purposes. For example, rather than letting several functions each have a local temporary
buffer, they could all use the same global variable (although this is a very dangerous
practice). As another example, if a program uses two similar arrays, examine whether the
two arrays can share the same storage (possibly as a union
).
Note that I don't recommend any of these approaches: too dangerous!
Small data types: short
, char
.
Instead of using arrays of int
, use arrays of short
, char
or unsigned char
.
There is no problem with this method, provided large integer values are not being stored
(e.g. larger than 127 for char
, or larger than 255 for unsigned char
). This technique
is also worthwhile when applied to int
fields in objects although alignment restrictions may limit the improvement — use the sizeof
operator to determine if the size of
the object has been reduced. Smaller local variables could also be declared as a
smaller type, but this may increase the executable size due to type conversions. Note that
speed can be compromised by using smaller data types because of the type conversions
that often result. Similarly, use float
instead of double
, where the greater precision
of results is not important (e.g., an AI model).
Bit-fields in objects.
When storing small integers in objects or structures, there is a way to specify exactly the number of
bits required. These types are called “bit-fields” and can only be used for fields inside
objects, structures or unions. You cannot declare a local variable with a bit-field type. When using bit-fields, small integers or Boolean flags are automatically packed into a struct
or union
. This reduces storage requirements significantly, but reduces speed because it is necessary to pack and unpack bits.
Parallel arrays versus arrays of objects or structures.
Because of alignment restrictions, an object or structure may have unusable extra padding bytes.
The number of padding bytes can be determined by using the sizeof
operator, and subtracting the sizes of each individual field from the size of the object. If there are padding
bytes, replacing an array of struct
with a number of “parallel” arrays removes the need
for this padding.
Packing.
When dealing with large arrays of small integers, it can be more efficient to pack them
together (i.e. more than one value per word), particularly when the information is binary
(true or false), because only one bit per value is needed.
The easiest way in C++ is to use std::bitset
.
Note that bit-fields are a form of packing provided by the compiler that can support more than one bit.
They are also much
easier to use than coding it yourself.
Packing object arrays with #pragma pack
.
Microsoft compilers support the “#pragma pack
” preprocessor directive, which can specify the
packing/alignment characteristics of an object.
This can allow arrays of these objects to be packed more closely into storage.
Reordering fields in objects and structures.
Because of the word alignment on some machines, the order of fields in an object or structure can
change the size of the object. This only applies to objects containing different size
fields. A general rule for minimizing the space is to order the fields from largest to
smallest. This heuristic may not give the best ordering — examine the size of a few
different orderings using the sizeof
operator, if space is crucial.
This is a machine-dependent optimization, and may not work well on some machines.
Store integer codes instead of string names.
If you're storing a string to represent some particular type
or a limited set of names, or something with a finite set,
then you can use an enum instead.
If you need to generate the actual string name,
use an array lookup or a switch
statement to return the equivalent string constant.
For example, when dealing with AI word tokens, which are indeed fixed and finite, use the integer token code
without storing the word as a string,
while maintaining a single copy of the vocabulary strings
(which you need anyway for the tokenizing algorithm).
• Next: • Up: Table of Contents |
The new AI programming book by Aussie AI co-founders:
Get your copy from Amazon: Generative AI in C++ |