Aussie AI

Fast memory block copying with memcpy

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.

The fast way to copy an entire memory block is with memcpy. This applies to copying vectors and to matrices and tensors that are linearized into a contiguous array. Rather than copy each element of an array, one at a time, in a loop, the memcpy standard library function can be used to copy the entire block in one statement:

    memcpy(destarr, srcarr, sizeof(srcarr)); 

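Note that this is a bitwise copy of the array, intended for simple data types. For example, it won't run copy constructors if applied to an array of objects. Also note that sizeof(srcarr) gives the full byte size only when srcarr is declared as an array in the current scope; if it is a pointer or a function parameter, sizeof returns the size of the pointer, and the byte count must be computed as the number of elements times the element size.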
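As a minimal sketch of these two caveats, the hypothetical helper below (copy_block is not a standard function, and neither is the demo function) takes an element count, converts it to a byte count, and uses a compile-time check to reject types that are not safe to copy bitwise:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copy n elements (not bytes) from src to dest.
// The two blocks must not overlap.
template <typename T>
void copy_block(T* dest, const T* src, std::size_t n)
{
    // Reject types with non-trivial copy semantics at compile time.
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    assert(dest != nullptr && src != nullptr);
    std::memcpy(dest, src, n * sizeof(T));  // third argument is a byte count
}

// Hypothetical demo: returns true if the copied block matches the source.
bool demo_copy_block()
{
    float src[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    float dest[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
    copy_block(dest, src, 4);
    return std::memcmp(dest, src, sizeof(src)) == 0;
}
```

Passing the element count explicitly avoids relying on sizeof inside a function, where an array argument has already decayed to a pointer.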

The memcpy function does a very fast memory block copy. It is like strcpy in that the destination is the first parameter, but memcpy does not stop at a null byte: it copies everything, including null bytes and hidden padding bytes, and always copies exactly the number of bytes requested. memcpy is a super-fast byte copy, but is unsafe, because its behavior is undefined if the source and destination blocks overlap.
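The difference from strcpy can be illustrated with a small sketch. The function name demo_embedded_null and the buffer contents are made up for this example; it copies a six-byte buffer containing an embedded null byte both ways and checks what arrived:

```cpp
#include <cstring>

// Hypothetical demo: returns true if memcpy copied past the embedded
// null byte while strcpy stopped at it.
bool demo_embedded_null()
{
    const char src[6] = { 'a', 'b', '\0', 'c', 'd', '\0' };
    char via_strcpy[6] = { 'x', 'x', 'x', 'x', 'x', 'x' };
    char via_memcpy[6] = { 'x', 'x', 'x', 'x', 'x', 'x' };

    std::strcpy(via_strcpy, src);               // copies 'a', 'b', '\0' only
    std::memcpy(via_memcpy, src, sizeof(src));  // copies all 6 bytes

    // Index 3 is untouched by strcpy but overwritten by memcpy.
    return via_strcpy[3] == 'x' && via_memcpy[3] == 'c';
}
```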

memcpy overlapping blocks error: A major pitfall with memcpy is that it can fail when the source and destination blocks overlap. If you are shuffling arrays up or down one element using memcpy, you have to be careful, because the results on overlapping ranges are undefined. Here's a buggy example of using memcpy to remove the first character of a string in place:

    memcpy(s, s+1, strlen(s+1)+1);  // Bug

The problem is that the blocks starting at “s” and “s+1” are overlapping. This is undefined behavior: the code may appear to work on one platform and corrupt the string on another. The fix is simply to use memmove, which always works correctly for overlaps:

    memmove(s, s+1, strlen(s+1)+1);  // Correct

The memmove function is a safer version of memcpy that also works correctly if the memory blocks overlap. If the source and destination blocks don't overlap, it gives the same result as memcpy, except probably slightly slower. If they do overlap, then memmove behaves as if it copied the source to a temporary area, and then copied that to the destination block.
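Here is a minimal sketch of in-place shuffling with memmove, covering both directions. The helper names remove_first_char and shift_up_one are hypothetical, chosen only for illustration:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: delete the first character of a C string in place.
void remove_first_char(char* s)
{
    if (s == nullptr || s[0] == '\0') return;        // nothing to remove
    std::memmove(s, s + 1, std::strlen(s + 1) + 1);  // +1 copies the null byte
}

// Hypothetical helper: shift n floats up one slot, so that the new
// arr[i+1] holds the old arr[i]. The array must have room for n+1
// elements; arr[0] is left unchanged.
void shift_up_one(float* arr, std::size_t n)
{
    std::memmove(arr + 1, arr, n * sizeof(float));
}
```

For example, calling remove_first_char on a buffer holding "hello" leaves "ello", and calling shift_up_one(arr, 3) on { 1, 2, 3, 0 } leaves { 1, 1, 2, 3 }. Both calls have overlapping source and destination blocks, so memcpy would not be a valid substitute here.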

 
