Aussie AI

Pointers vs References

  • Book Excerpt from "Generative AI in C++"
  • by David Spuler, Ph.D.

Pointers are both a good and a bad feature of C++. They are low-level variables that allow efficient processing of memory addresses, so we can code some very fast methods with pointers. They allow us to get very close to the machine.

On the downside, there are pointer pitfalls. Pointers trip up novices and experienced programmers alike. There is an immense list of common faults with pointer manipulation, and problems with pointers and memory management are probably behind at least half of all C++ bugs. Some tools help detect pointer problems (e.g. Valgrind on Linux), but it is a never-ending battle.

Pointers and arrays are implemented very similarly, a design that dates back to the original C language. Basically, arrays are treated as a specific type of pointer, with various differences depending on whether they are variables or function parameters.
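One of those differences can be checked directly. The following is a minimal sketch (the function names are made up for illustration): an array declared as a variable keeps its full array type and size, whereas an array declared as a function parameter is quietly adjusted by the compiler to a plain pointer.

```cpp
#include <cassert>
#include <type_traits>

// An array parameter is adjusted by the compiler to a plain pointer.
bool parameter_is_pointer(float arr[8])
{
    assert(arr != nullptr);  // It can be tested against nullptr like any pointer
    return std::is_same<decltype(arr), float *>::value;
}

// An array variable keeps its full array type and size.
bool variable_is_array()
{
    float arr[8] = {};
    bool has_array_type = std::is_same<decltype(arr), float[8]>::value;
    bool has_array_size = (sizeof(arr) == 8 * sizeof(float));
    // Passing arr to a function converts it to a pointer to its first element
    return has_array_type && has_array_size && parameter_is_pointer(arr);
}
```

Both functions return true, which is why code that uses sizeof to count array elements stops working once the array has been passed into a function.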

Then came C++ to the rescue. References arrived with the new-fangled programming language (cleverly named “C++”) and were thoughtfully designed as a type of safe pointer that cannot be null, but is just as efficient as a pointer because the constraints on references are enforced at compile-time.

C++ allows two ways to indirectly refer to an object without needing to create a whole new copy: pointers and references. Declarations use “*” for a pointer and “&” for a reference.

    MyVector *myptr = &mv;  // Pointer to mv object
    MyVector &myref = mv;   // Reference to mv object
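As a small sketch of how these two declarations behave, the code below writes through both the pointer and the reference and then checks that the original object changed. The MyVector definition here is only a hypothetical stand-in, since the real class is not shown in this section, and alias_demo is a made-up name.

```cpp
#include <cassert>

// Hypothetical stand-in for the MyVector class used in the text.
struct MyVector {
    float data[4] = {0.0f, 0.0f, 0.0f, 0.0f};
};

// Writes through a pointer and a reference to the same object,
// then reports whether both writes landed in the original.
bool alias_demo()
{
    MyVector mv;
    MyVector *myptr = &mv;  // Pointer to mv object
    MyVector &myref = mv;   // Reference to mv object

    assert(myptr != nullptr);  // A pointer may need this check; a reference does not
    myptr->data[0] = 1.0f;     // Pointer syntax: "->" (or "(*myptr).")
    myref.data[1] = 2.0f;      // Reference syntax: same as using the object itself

    return mv.data[0] == 1.0f && mv.data[1] == 2.0f;
}
```

Neither myptr nor myref made a copy of mv, so both writes are visible in the original object.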

Pointers and references are more efficient than spinning up a whole new copy of the object, especially when the underlying object is complicated. And when you have a function call, you should definitely avoid passing a whole object by value.

    void processit(MyVector v)  // Slow
    {
        // ....
    }

This is inefficient because the whole MyVector object gets copied, via whatever copy constructor you have defined, which is slow. And if you haven't defined a copy constructor, the compiler generates a default one that copies the object member by member. That is still slow for a large object, and for a class that holds raw pointers it is a shallow copy, which is rarely what you want and often a bug.
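The "often a bug" part can be illustrated with a minimal sketch. RawBuffer below is a hypothetical class (not from the book's code) that holds a raw pointer and defines no copy constructor, so the compiler-generated copy duplicates the pointer rather than the data it points to.

```cpp
#include <cassert>

// Hypothetical class with a raw pointer member and no user-defined copy constructor.
struct RawBuffer {
    float *data;
};

// Copies a RawBuffer with the compiler-generated copy constructor and
// reports whether the copy still shares storage with the original.
bool default_copy_shares_storage()
{
    float storage[3] = {1.0f, 2.0f, 3.0f};
    RawBuffer a{storage};
    RawBuffer b = a;            // Compiler-generated copy: copies the pointer, not the floats

    assert(a.data == b.data);   // Both objects point at the same floats
    b.data[0] = 99.0f;          // Writing through the copy...
    return a.data[0] == 99.0f;  // ...changes what the original sees
}
```

If such a class also freed its pointer in a destructor, the two objects would both try to free the same memory, which is the classic double-free bug.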

The faster version of processit takes a “const” reference (or a non-const reference if the function modifies the object):

    void processit(const MyVector & v) // Reference argument
    {
        // ....
    }
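To make the cost visible, here is a rough sketch that counts copy constructor calls for the two calling styles. The instrumented MyVector, its copies counter, and the function names process_by_value and process_by_ref are all assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical MyVector that counts how often it is copied.
struct MyVector {
    std::vector<float> data;
    static int copies;  // Number of copy constructions so far

    explicit MyVector(std::size_t n) : data(n, 0.0f) {}
    MyVector(const MyVector &other) : data(other.data) { ++copies; }
};
int MyVector::copies = 0;

std::size_t process_by_value(MyVector v)       // Slow: copies the whole object
{
    return v.data.size();
}

std::size_t process_by_ref(const MyVector &v)  // Fast: no copy is made
{
    return v.data.size();
}

// Returns the number of copies made by one by-value call.
int copies_made_by_value()
{
    MyVector mv(1000);
    const int before = MyVector::copies;
    std::size_t n = process_by_value(mv);
    assert(n == 1000);
    return MyVector::copies - before;
}

// Returns the number of copies made by one by-reference call.
int copies_made_by_ref()
{
    MyVector mv(1000);
    const int before = MyVector::copies;
    std::size_t n = process_by_ref(mv);
    assert(n == 1000);
    return MyVector::copies - before;
}
```

The by-value call copies all 1000 floats once per call, while the by-reference call makes no copy at all.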

The pointer version is:

    void processit(MyVector * v)  // Pointer argument
    {
        // ....
    }

Which is faster in C++: pointers or references? The short answer of “no real difference” is the general view, because references are typically implemented as pointers by the compiler behind the scenes. The two functions above are not going to be significantly different in terms of speed.

The slightly longer answer is that references can be faster because there's no null case. A reference must always be referring to an object for the duration of its scope. The C++ compiler ensures that references cannot occur without an object:

    MyVector &v;          // Cannot do this
    MyVector &v = NULL;   // Nor this
    MyVector &v = 0;      // Nor this

A reference must be initialized from an object, and you cannot set a reference equal to a pointer. Instead, you have to de-reference the pointer with the “*” operator, which is undefined behavior (typically a crash) if the pointer is null:

    MyVector &v = myptr;  // Disallowed
    MyVector &v = *myptr; // Works if non-null

There's no valid way in C++ to get a zero value into a reference variable (we hope); the only route is undefined behavior, such as de-referencing a null pointer. For example, the address-of operator (&) applied to a reference variable returns the address of the referenced object, not the memory location of the reference itself. Hence, in a correct program, references are always referring to something and they cannot be equivalent to the null pointer.
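The address-of behavior is easy to confirm. This is a small sketch under the same assumption as before, namely that MyVector is a hypothetical stand-in for the real class; the function name is also made up.

```cpp
#include <cassert>

// Hypothetical stand-in for the MyVector class used in the text.
struct MyVector {
    float data[4] = {0.0f, 0.0f, 0.0f, 0.0f};
};

// Binds a reference through a pointer and reports whether the address
// of the reference is the address of the underlying object.
bool reference_address_is_object_address()
{
    MyVector mv;
    MyVector *myptr = &mv;

    assert(myptr != nullptr);  // Check before de-referencing
    MyVector &v = *myptr;      // Works if non-null

    // &v yields the address of mv, not of some separate "reference variable"
    return &v == &mv && &v == myptr;
}
```

Note that the null test happens once, at the point where the pointer is converted to a reference; code that receives the reference afterwards has nothing left to check.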

References are slightly faster: The guarantee of an object for a reference removes a whole class of null pointer core dumps, and also relieves the programmer of the burden of testing for null pointers. The compiler enforces this guarantee at compile-time, so there's no hidden null check being done at run-time, making it efficient. So, there's a minor speed improvement from using references, by not having to add safety checks for “ptr!=NULL” throughout the function call hierarchy.

Pointers can be better than references if you need a “null” situation to occur. For example, you're processing an object that may or may not exist, and you need the pointer to be allowed to be “NULL” if there's no object. This should occur rarely, and references should be preferred in many cases.
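The trade-off can be sketched with three versions of one function. The names sum, sum_if_present, and sum_required are hypothetical, and MyVector is again a simplified stand-in: the reference version needs no null test, the "may or may not exist" version treats null as a legal input, and a pointer version that requires an object has to guard itself.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the MyVector class used in the text.
struct MyVector {
    std::vector<float> data;
};

// Reference version: an object is guaranteed, so there is no null test.
float sum(const MyVector &v)
{
    float total = 0.0f;
    for (float x : v.data) {
        total += x;
    }
    return total;
}

// Pointer version where "no object" is a legal input.
float sum_if_present(const MyVector *v)
{
    if (v == nullptr) {
        return 0.0f;  // Nothing to process
    }
    return sum(*v);
}

// Pointer version where null is a caller bug, so it needs a safety check.
float sum_required(const MyVector *v)
{
    assert(v != nullptr);  // The kind of check that references make unnecessary
    return sum(*v);
}
```

Only sum_if_present genuinely needs a pointer; sum_required is the case where a reference parameter would be the cleaner choice.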

And finally, references aren't very useful when you're trying to scan through the data in vectors, matrices, or tensors in an AI engine. You can't do pointer arithmetic on a reference in C++.
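As a minimal sketch of that kind of scan, the hypothetical function below sums an array of floats by stepping a pointer from the first element to one past the last. There is no equivalent way to step a reference along an array, because a reference cannot be made to refer to a different object after it is initialized.

```cpp
#include <cassert>
#include <cstddef>

// Sums n floats by stepping a pointer through the array.
float sum_array(const float *v, std::size_t n)
{
    assert(v != nullptr || n == 0);  // A null array is only acceptable when empty

    float total = 0.0f;
    const float *end = v + n;        // One past the last element
    for (const float *p = v; p != end; ++p) {
        total += *p;                 // Read the current element, then advance
    }
    return total;
}
```

For example, calling sum_array on a four-element array holding 1, 2, 3, and 4 returns 10.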

 
