Aussie AI Blog

Memory-Safe C++ Library Functions

  • October 30th August, 2024
  • by David Spuler, Ph.D.

Memory-Functions

Compiler vendors provide a variety of useful library functions to help with memory safety. Some of these are defined in the C and C++ standards, whereas others are platform-specific. The main classes of functions include:

  • Heap memory management
  • Stack memory management
  • Text string buffer management

If we want to manage memory safely, we first need to examine all the different ways that a C++ program can get some memory.

Memory Allocation Management Functions

The main long-standing heap management functions are:

  • malloc
  • calloc
  • free

And in C++ there are the basic operators:

  • new — object or primitive type allocation.
  • delete — de-allocation operator.
  • new[] — array allocation version.
  • delete[] — array de-allocation.

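And there are also some less commonly used functions, not all of which are standard (realloc is standard C, whereas alloca, mmap, and sbrk are platform-specific):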

  • realloc — resize a heap block, possibly moving it.
  • alloca — dynamically allocate a stack block of memory.
  • mmap — memory-mapped blocks.
  • sbrk — low-level allocation.
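
As a minimal sketch of how realloc can "possibly move" a block, the wrapper below keeps the original pointer alive if the resize fails. The names resize_block and demo_resize are hypothetical helpers of my own, not standard functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: resize a heap block without losing the original
// pointer when realloc fails. Returns true on success and updates 'block';
// on failure returns false and 'block' still points at valid memory.
bool resize_block(void*& block, std::size_t new_size) {
    if (new_size == 0) {
        return false;  // realloc with size 0 is not portable, so refuse it
    }
    void* tmp = std::realloc(block, new_size);
    if (tmp == nullptr) {
        return false;  // original block is untouched and must still be freed
    }
    block = tmp;  // realloc may have moved the block to a new address
    return true;
}

// Small demonstration: the old contents survive the resize.
bool demo_resize() {
    void* p = std::malloc(16);
    if (p == nullptr) return false;
    std::memcpy(p, "hello", 6);  // 5 characters plus the null terminator
    bool ok = resize_block(p, 1024) &&
              std::strcmp(static_cast<char*>(p), "hello") == 0;
    std::free(p);  // blocks from malloc/realloc are released with free
    return ok;
}
```

The common bug this avoids is writing p = realloc(p, n), which leaks the original block whenever realloc returns a null pointer.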

Stack Memory Management

It is less common, but possible, to dynamically allocate stack memory. Functions include:

  • alloca — the main stack memory allocation function (<alloca.h>).
  • _malloca — stack memory allocation (Microsoft CRT)
  • _freea — free memory on the stack or heap (Microsoft CRT)
  • __builtin_alloca_with_align (GCC version with alignment)

Note that "de-allocation" of a stack block is technically not required, because the memory is reclaimed when the function returns and the stack is unwound.
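
Here is a rough sketch of stack allocation for a small scratch buffer, with a heap fallback for large requests (similar in spirit to what _malloca does). The STACK_ALLOC macro, the kMaxStackBytes cap, and the function names are my own assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>
#if defined(_MSC_VER)
#include <malloc.h>              // _alloca on the Microsoft CRT
#define STACK_ALLOC(n) _alloca(n)
#else
#include <alloca.h>              // alloca on Linux and macOS
#define STACK_ALLOC(n) alloca(n)
#endif

// Hypothetical cap: very large stack requests can overflow the stack.
constexpr std::size_t kMaxStackBytes = 1024;

// Writes the reverse of 'text' into 'scratch' and compares the two.
static bool equals_reversed(const char* text, char* scratch, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        scratch[i] = text[len - 1 - i];
    }
    return std::memcmp(text, scratch, len) == 0;
}

bool is_palindrome(const char* text) {
    const std::size_t len = std::strlen(text);
    if (len == 0) return true;
    if (len > kMaxStackBytes) {
        std::vector<char> heap(len);  // too big for the stack: use the heap
        return equals_reversed(text, heap.data(), len);
    }
    // The stack block is reclaimed automatically when this function returns.
    char* scratch = static_cast<char*>(STACK_ALLOC(len));
    return equals_reversed(text, scratch, len);
}
```

Note that there is no matching "free" call for the stack block, and that the pointer must never be returned to the caller.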

Platform-Specific Memory Management

There are also a variety of platform-specific and newer functions. The main header files are:

  • <stdlib.h> — standard memory allocation functions.
  • <malloc.h> — Linux or Windows.
  • <crtdbg.h> — C Run-Time (CRT) debug (Microsoft MSVS).
  • <Strsafe.h> — Microsoft MSVS.

The platform-specific or newer memory-related functions include:

  • _expand (MSVS) — lengthen a heap block, in place, without moving it.
  • _malloc_dbg (MSVS) — debug versions of basic memory primitives in <crtdbg.h>.
  • reallocarray — array version of realloc.
  • free_sized (C23)
  • set_new_handler (C++98) and get_new_handler (C++11)
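
As an illustration of the last item, here is a small sketch of installing a new-handler with std::set_new_handler. The handler name and counter are hypothetical, and the demo assumes that an absurdly large request fails in the allocator rather than being rejected earlier by a debugging tool.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <new>

static int g_handler_calls = 0;  // hypothetical counter for the demo

// Called by operator new when an allocation fails. A real handler might
// release a reserve pool and return; this one just gives up by throwing.
void out_of_memory_handler() {
    ++g_handler_calls;
    throw std::bad_alloc();
}

bool demo_new_handler() {
    std::new_handler previous = std::set_new_handler(out_of_memory_handler);
    bool installed = (std::get_new_handler() == out_of_memory_handler);

    bool caught = false;
    // volatile stops the compiler from reasoning about the size at compile time
    volatile std::size_t huge = std::numeric_limits<std::size_t>::max() / 2;
    try {
        void* p = ::operator new(huge);  // assumed to fail on any real machine
        ::operator delete(p);
    } catch (const std::bad_alloc&) {
        caught = true;
    }

    std::set_new_handler(previous);  // restore whatever was installed before
    return installed && caught && g_handler_calls >= 1;
}
```

A handler that returns normally causes operator new to retry the allocation, so it must either free some memory, install a different handler, or throw.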

One of the main problems with memory primitives was handling of alignment. Some other functions and language features include:

  • alignas specifier
  • __declspec(align(N))
  • aligned_alloc (C11/C++17)
  • _alloca (aligned version).
  • free_aligned_sized (C23)
  • _aligned_malloc (Microsoft CRT)
  • _aligned_realloc (Microsoft CRT)
  • _aligned_free (Microsoft CRT)
  • _aligned_msize (Microsoft CRT)
  • _aligned_offset_malloc (Microsoft CRT)
  • _aligned_offset_realloc (Microsoft CRT)
  • posix_memalign (POSIX)
  • std::aligned_storage (deprecated in C++23)
  • std::aligned_union (deprecated in C++23)
  • std::alignment_of
  • _Alignas (C11)
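
To make a couple of these concrete, here is a minimal sketch using alignas and an aligned heap allocation. The wrapper names alloc_aligned and free_aligned are hypothetical. I'm assuming std::aligned_alloc (C++17) on most platforms and the Microsoft _aligned_malloc family on MSVC, which does not provide aligned_alloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#if defined(_MSC_VER)
#include <malloc.h>  // _aligned_malloc and _aligned_free
#endif

// alignas on a type: every CacheLine object starts on a 64-byte boundary.
struct alignas(64) CacheLine {
    char data[64];
};
static_assert(alignof(CacheLine) == 64, "alignas should set the alignment");

// Hypothetical wrapper: 'alignment' is assumed to be a power of two.
void* alloc_aligned(std::size_t bytes, std::size_t alignment) {
    if (bytes == 0) bytes = 1;  // avoid implementation-defined zero-size requests
#if defined(_MSC_VER)
    return _aligned_malloc(bytes, alignment);
#else
    // aligned_alloc traditionally wants the size to be a multiple of alignment
    std::size_t rounded = (bytes + alignment - 1) / alignment * alignment;
    return std::aligned_alloc(alignment, rounded);
#endif
}

void free_aligned(void* p) {
#if defined(_MSC_VER)
    _aligned_free(p);  // must match _aligned_malloc
#else
    std::free(p);      // aligned_alloc blocks are released with free
#endif
}

bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

bool demo_aligned() {
    void* p = alloc_aligned(100, 64);
    bool ok = (p != nullptr) && is_aligned(p, 64);
    free_aligned(p);
    return ok;
}
```

The pairing matters for safety: a block from _aligned_malloc must go to _aligned_free, not free.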

Unix and Linux Memory Management

There are a variety of Linux memory management primitives available with GCC and glibc, mostly defined in <malloc.h>:

  • malloc_usable_size — size of an allocated memory block.
  • mallinfo, mallinfo2 — get allocated memory block information.
  • malloc_info — exports XML info about the heap state.
  • malloc_stats — allocation statistics.
  • mallopt — set memory allocation options (e.g., can control how glibc handles a double-free error).
  • getrlimit and setrlimit — manage resource limits, including the heap.

Some other non-standard memory functions, provided by the jemalloc allocator (the default allocator on FreeBSD), are extra versions with bit flag controls:

  • mallocx
  • rallocx
  • xallocx
  • sallocx
  • dallocx
  • sdallocx
  • nallocx

There are also "memory allocation control" and other non-standard memory allocation primitives in jemalloc:

  • mallctl
  • mallctlnametomib
  • mallctlbymib
  • malloc_stats_print
  • malloc_usable_size
  • malloc_message

Windows Memory Management

Windows has a variety of additional useful functions, some in <Strsafe.h> and others in the C Runtime (CRT) and its debug versions in <crtdbg.h>:

  • _malloc_dbg and other "debugging heap" versions (<crtdbg.h>).
  • _CrtCheckMemory — check heap for integrity.
  • _CrtSetDbgFlag — control debug flags.
  • _CrtMemState memory block structure in crtdbg.h
  • _heapmin — reclaim some heap memory (heap minimize).
  • _heapadd — increase heap size.
  • _heapchk — self-test heap for consistency.
  • _heapset — fill all unallocated heap memory with a canary byte!
  • _heapwalk — traverse through the heap blocks.

Windows has a feature that I especially like: callbacks for memory allocation operations! Here are the details:

  • _CrtSetAllocHook — set a "hook" (callback) whenever allocation occurs.
  • _CrtGetAllocHook

There are also various other calls about memory addresses:

  • _CrtIsMemoryBlock — check addresses.
  • _CrtIsValidHeapPointer — check heap addresses.
  • _CrtIsValidPointer
  • _CrtReportBlockType

Size of an Allocated Heap Block

There's no standardized way to take the address of a heap block and return its size. This is unfortunate, because that would be helpful in several ways for memory safety. Hence, there are platform-specific versions:

  • _msize — Windows MSVS version.
  • malloc_usable_size — Linux (glibc) version.
  • malloc_size — macOS version.

Note that the size of the memory block returned from these functions is not necessarily the same as the original size of the request. It shouldn't be less, but it can often be larger, because the system memory allocator has padded out the allocated block for alignment or other optimization reasons. When you run simple tests, it will probably appear to always be the correct size, but after a longer execution with a lot of allocations and deallocations, the algorithm for the memory allocator can get trickier, and it may vary significantly.

If a platform-specific block size function does return a larger value for the block size, it's not easy to know this has occurred. Hence, don't assume that this size value will point exactly to your redzone area, or whatever other tricks you're doing with the end of your allocated memory blocks.
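
A portable wrapper over these three functions might look like the sketch below. The name heap_block_size is hypothetical, and the preprocessor tests are my assumption about how to tell the platforms apart.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#if defined(_MSC_VER)
#include <malloc.h>          // _msize
#elif defined(__APPLE__)
#include <malloc/malloc.h>   // malloc_size
#else
#include <malloc.h>          // malloc_usable_size (glibc)
#endif

// Hypothetical wrapper: usable size of a heap block.
// 'p' must be the start address of a live block from malloc/calloc/realloc.
std::size_t heap_block_size(void* p) {
#if defined(_MSC_VER)
    return _msize(p);
#elif defined(__APPLE__)
    return malloc_size(p);
#else
    return malloc_usable_size(p);
#endif
}

bool demo_block_size() {
    const std::size_t requested = 100;
    void* p = std::malloc(requested);
    if (p == nullptr) return false;
    std::size_t actual = heap_block_size(p);
    std::free(p);
    // The reported size may be larger than the request, but never smaller.
    return actual >= requested;
}
```

Because the reported size can exceed the requested size, a debug library that needs the exact request size still has to record it separately.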

As a further warning, note that _msize on Windows is a little fragile, because it throws a runtime exception if the address is either:

    (a) a non-heap address, or

    (b) not the start of a heap block.

Hence, it's not that useful in testing whether a random address is a heap block or not. Maybe it needs to be combined with _CrtIsValidHeapPointer.

C++ Own Goals

Personally, I think that C++ made some "own goals" by unnecessarily causing memory safety issues. As computers have gotten faster, the runtime cost of addressing these issues has become low relative to the expense of tracking down memory bugs. Some of the areas where the language and library should probably just be safer and tolerate issues include:

  • malloc and new should zero memory (like calloc).
  • alloca should also zero memory.
  • realloc should zero any extra allocated memory areas.
  • new/delete should be interchangeable with malloc/free (e.g., free on a new block should work).
  • new/delete should also be interchangeable with the new[]/delete[] array versions.
  • Stack variables should be zeroed when a function starts (like global variables).
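
The first and third items can be approximated today with wrappers. This is only a sketch: zeroed_malloc and zeroed_realloc are hypothetical names, and the realloc wrapper assumes the caller tracks the old size, since there is no standard way to ask the heap for it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// malloc replacement that always returns zeroed memory (via calloc).
void* zeroed_malloc(std::size_t bytes) {
    return std::calloc(1, bytes);
}

// realloc replacement that zeroes any newly added bytes.
// On failure it returns nullptr and the original block remains valid.
void* zeroed_realloc(void* block, std::size_t old_bytes, std::size_t new_bytes) {
    if (new_bytes == 0) {
        return nullptr;  // zero-size realloc is not portable; leave block alone
    }
    void* tmp = std::realloc(block, new_bytes);
    if (tmp != nullptr && new_bytes > old_bytes) {
        // Only the tail beyond the old size is new, so only that part is cleared.
        std::memset(static_cast<char*>(tmp) + old_bytes, 0, new_bytes - old_bytes);
    }
    return tmp;
}

bool demo_zeroed() {
    unsigned char* p = static_cast<unsigned char*>(zeroed_malloc(8));
    if (p == nullptr) return false;
    bool ok = true;
    for (std::size_t i = 0; i < 8; ++i) ok = ok && (p[i] == 0);

    std::memset(p, 0xFF, 8);  // mark the old contents
    unsigned char* q = static_cast<unsigned char*>(zeroed_realloc(p, 8, 64));
    if (q == nullptr) {
        std::free(p);
        return false;
    }
    for (std::size_t i = 0; i < 8; ++i) ok = ok && (q[i] == 0xFF);  // preserved
    for (std::size_t i = 8; i < 64; ++i) ok = ok && (q[i] == 0);    // zeroed
    std::free(q);
    return ok;
}
```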

There are a lot of these little "undefined" areas that are glitches in the standard C++ library, which probably should be detected and tolerated, or at least warned about, by the library functions instead:

  • std::list can crash if the current element is erased during an iterator scan.
  • fflush(stdin) should be detected and tolerated.
  • Mismatched fread/fwrite on a file without intervening fseek would be easy to detect.
  • strncpy should have a warning when it truncates and leaves the string without a null.
  • cos or sin of a number larger than two pi probably means the caller has confused radians and degrees.
  • strlen(NULL) should not crash.
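
For the std::list item, the usual workaround is to continue from the iterator that erase returns, rather than incrementing the erased one. A minimal sketch, with remove_even as a hypothetical example function:

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Removes every even value from the list while scanning it.
// Returns the number of elements removed.
std::size_t remove_even(std::list<int>& values) {
    std::size_t removed = 0;
    for (auto it = values.begin(); it != values.end(); /* no ++ here */) {
        if (*it % 2 == 0) {
            // erase invalidates 'it' but returns the next valid iterator
            it = values.erase(it);
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

The crash happens when the loop does values.erase(it) and then ++it, because the increment reads through a node that has already been freed.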

We can "fix" some of these issues by defining our own intercepted versions of these functions, either via using our own wrapper function names instead, or via automatic preprocessor macro intercepts or link-time changes.
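
As a rough sketch of the wrapper approach, here are tolerant versions of strlen and strncpy. The names safe_strlen and safe_strncpy are hypothetical, and the choice to warn on stderr is my own assumption about what "tolerate" should mean.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// strlen that tolerates a null pointer: warns and reports length zero.
std::size_t safe_strlen(const char* s) {
    if (s == nullptr) {
        std::fprintf(stderr, "safe_strlen: null pointer\n");
        return 0;
    }
    return std::strlen(s);
}

// strncpy-like copy that always null-terminates the destination.
// Returns true if the whole source fit, false if it was truncated
// or an argument was invalid.
bool safe_strncpy(char* dest, const char* src, std::size_t dest_size) {
    if (dest == nullptr || dest_size == 0) {
        return false;  // nowhere to write, not even a terminator
    }
    if (src == nullptr) {
        dest[0] = '\0';
        return false;
    }
    const std::size_t len = std::strlen(src);
    const bool truncated = (len >= dest_size);
    const std::size_t n = truncated ? dest_size - 1 : len;
    std::memcpy(dest, src, n);
    dest[n] = '\0';  // unlike strncpy, the result is always terminated
    if (truncated) {
        std::fprintf(stderr, "safe_strncpy: truncated to %zu bytes\n", n);
    }
    return !truncated;
}
```

A macro intercept such as #define strlen safe_strlen, placed after the standard headers, is one way to route existing calls through the wrapper, although redefining standard library names is formally outside what the C++ standard permits, so treat it as a debugging trick.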
