Aussie AI
C++ Function Call Bugs
Bonus Material for "Generative AI in C++"
by David Spuler, Ph.D.
Function Call Bugs
Objects pass by value, arrays by reference
Parameter passing can be very confusing in C++. Simple variable types like int or float are passed by value: when a function is called, the arguments are copied and the function works on those copies. Assigning to a parameter inside the function therefore has no effect on the caller's variables.
All class objects or struct variables follow the same pass-by-value semantics. You cannot change a passed-in object from inside the function, unless you explicitly use "reference parameters" (i.e. add the & symbol).
Arrays are different and behave as if passed by reference, which confuses novices. When an array is passed as an argument, the function can change its elements, and the caller sees those changes.
The reason for this exception is that an array argument is converted to a pointer to the first element of the array, so the function receives the address of the caller's array rather than a copy. This distinction can cause errors when a function modifies its parameters (e.g. to use them as working variables) on the assumption that it has a private copy.
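The three cases described above can be put side by side in a short sketch. The Point struct and the function names here are hypothetical, chosen only for illustration.

```cpp
#include <cstddef>

struct Point { int x; int y; };

// Receives a copy: the caller's object is left untouched.
void reset_by_value(Point p) { p.x = 0; p.y = 0; }

// Receives a reference: the caller's object is modified.
void reset_by_reference(Point &p) { p.x = 0; p.y = 0; }

// An array parameter is really a pointer to the first element,
// so these writes land in the caller's array.
void reset_array(int arr[], std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        arr[i] = 0;
}
```

Calling reset_by_value on a Point leaves the caller's fields as they were, whereas reset_by_reference and reset_array both change the caller's data.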
Function return inconsistencies
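There are a number of problems with the use of a return statement. A simple example of such an error is a non-void function with no return statement at all. Formally, this is undefined behavior rather than a mandatory compilation error, but compilers usually diagnose it: GCC and Clang issue a warning, and MSVC typically reports it as an error by default.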
Another situation where a function return is incorrect is when an int function uses the return statement:
return; // return NO value
This is only correct if the function is declared with return type "void". In any other function, the C++ compiler should produce an error message.
The same problem can occur in less obvious ways when a return statement is not found on some execution paths. For example, the function below returns a value only if the condition in the if statement is true:
bool positive(int x)
{
    if (x > 0)
        return true;
}
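Most C++ compilers only warn about this (e.g. "not all control paths return a value"), so the code may still compile. If the function exits at the closing right brace, the behavior is undefined. In practice the caller receives a garbage value, just as it did in C.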
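A minimal sketch of a corrected version, in which every execution path returns a value, might look like this:

```cpp
// Every execution path returns a value, including x <= 0.
bool positive(int x) {
    return x > 0;
}
```

With GCC or Clang, enabling -Wall (which includes -Wreturn-type) or promoting the check with -Werror=return-type is one way to catch the missing-return case at build time.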
Returning the address of a local variable
If a function returns the address of a local variable, this is a major error. Because the local variable is stored on the stack, when the function terminates that value is no longer defined. Any pointer that holds the address has become a dangling reference.
This error is reasonably common when writing a function to return a string. Consider a function, label, to allocate a new string label ten characters long. The implementation shown below generates a unique label and stores it in a local string variable, returning the address of the string:
char *label(void)
{
    static int count = 0;
    char s[10];
    sprintf(s, "XXX%d", count++);
    return s;  /* ERROR - returning address */
}
The behavior of this function is undefined. It may appear to work if, by coincidence, the string stored on the stack has not yet been overwritten. However, it is more likely to fail in some way, resulting in the production of a very strange label.
Returning the address of a static local variable
The obvious method of resolving the problem of returning a local variable's address is to declare the local string array as static. However, this creates a different problem. The second time the function is called, the same pointer value will be returned, and the first label will be overwritten. All pointers will point to the same label. Changing one label will change them all.
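To make the overwriting concrete, here is a minimal sketch of the static variant. The name static_label, the 16-byte buffer, and the use of snprintf are assumptions made for this illustration.

```cpp
#include <cstdio>

// Hypothetical variant of label() that uses one static buffer.
// Every call returns the same address and overwrites the old text.
char *static_label() {
    static int count = 0;
    static char s[16];
    std::snprintf(s, sizeof s, "XXX%d", count++);
    return s;
}
```

If a caller saves the pointer from a first call and then calls static_label again, both pointers are equal and the text seen through the first pointer has changed to the second label.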
One "hack" solution to this problem is to use a small fixed number of different addresses. In this way there can be a small number of labels returned from the function active at the same time without affecting each other. This method isn't a general or reliable solution and is likely to introduce obscure bugs if more than the maximum number of labels is used, but it has occasional uses.
#define MAX_LABELS 5   /* how many distinct labels */
#define STR_LEN 10     /* length of string label */

char *label(void)
{
    static int count = 0;
    static int sp = 0;
    static char names[MAX_LABELS][STR_LEN];

    sp = (sp + 1) % MAX_LABELS;
    sprintf(&names[sp][0], "XXX%d", count++);
    return &names[sp][0];
}
One apparent solution to this problem is to use malloc to allocate some dynamic memory each time. However, this imposes the problem of how to deallocate the memory once it is no longer needed. The label function cannot do this, so the rest of the program must remember to deallocate the labels when they are no longer needed.
Note that many library functions use this method of returning the address of internal static storage, and this is satisfactory because the program using these library functions can make explicit copies of the values, rather than accessing them through the address returned. Therefore, returning the address of a static variable may be the best solution to the particular problem.
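In C++ code, another option besides malloc and static storage is to return a std::string by value, so that each caller owns its own copy and no manual deallocation is needed. This is a minimal sketch of that alternative, not part of the original label function; the name make_label is hypothetical.

```cpp
#include <string>

// Returns the label by value: each caller gets an independent string,
// so there is no dangling pointer and no shared buffer to overwrite.
std::string make_label() {
    static int count = 0;  // only the counter is shared between calls
    return "XXX" + std::to_string(count++);
}
```

Two successive calls yield two distinct strings, and changing one has no effect on the other. The trade-off is a possible allocation per call, which may matter in very hot code paths.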
Aliasing and reference parameters
Aliasing is possible when using reference variables, or more commonly when using reference parameters for a function. A typical problem is illustrated by the code to compute the maximum of two values and return it in a third:
void max3(const double &x, const double &y, double &max)
{
    max = x;
    if (y > max)
        max = y;
}
This code will work well for all instances except a few involving aliasing. Consider the code sequence:
double a = 1.0, b = 2.0;
max3(a, b, b);
For this set of arguments, the reference parameters y and max are aliases that both refer to b. The first assignment to max also changes y. Therefore, the code is equivalent to:
b = a;
if (b > b)
    b = b;
For this call, the function will always store the value of a as the maximum of the two values, regardless of the value of b. Note that even the const-ness of the reference parameters does not prevent this aliasing problem.
One solution to the problem is to avoid aliasing by making x and y value parameters (possibly losing some efficiency advantage). Alternatively, the function can be rewritten so that it works correctly even if aliasing exists:
void max3(const double &x, const double &y, double &max)
{
    if (x > y)
        max = x;
    else
        max = y;
}
The aliasing can still occur, but it's harmless in this sequence.
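The other option mentioned above, value parameters, can be sketched as follows. The names max3_by_value and max2 are hypothetical, and the second function goes one step further by returning the result instead of writing through an output parameter.

```cpp
// Value parameters: x and y are private copies, so writing to max
// cannot disturb them even when max aliases one of the arguments.
void max3_by_value(double x, double y, double &max) {
    max = x;
    if (y > max)
        max = y;
}

// Returning the result avoids the output parameter altogether,
// so there is nothing left to alias.
double max2(double x, double y) {
    return (y > x) ? y : x;
}
```

With a = 1.0 and b = 2.0, the call max3_by_value(a, b, b) leaves b equal to 2.0, because y was copied before max was assigned.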
A particular case of aliasing of reference parameters in C++ classes involves the overloaded = operator when an object is assigned to itself. This common error is discussed in a separate section.
The main function should return a value
A particular example of problems with function return values concerns the main function. Technically, it needs to be declared with int return type, and a void return type is an error in standard C++. The main function can either have no parameters, or have an argc/argv set of parameters. A third parameter for the environment pointer is sometimes used. However, many implementations are permissive and will still run various incarnations, because the main function has often been misused in the past. I tested a variety of declarations of main using MSVS, and all of these compiled and ran in a non-strict mode:
void main()
int main()
int main(void)
int main(int argc, char *argv[])
int main(int argc, char *argv[], char *envp[])
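The following simple program omits the return statement in main. In standard C++ (and in C from C99 onward) this is permitted, and reaching the end of main is equivalent to return 0. In older C and some pre-standard compilers, however, the status returned to the operating system was undefined.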
#include <cstdio>

int main()
{
    printf("Hello world\n");
}
The integer return value of the main function is passed back to the operating system as a status code. Although the program itself is rarely directly concerned with the value returned by main, the operating system may examine this value. It is important for a program always to return a value, as some operating system processes may fail if a strange random value is returned. Although the above program will work in most environments, there are a few for which it may fail.
The most common methods of returning a value from main are to use the exit library function or simply a return statement. The idiom is commonly exit(0) or return(0) for success, and a 1 value for a failure, but the standard macros EXIT_SUCCESS and EXIT_FAILURE, declared in <cstdlib>, state the intent more clearly and are portable.
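As a small sketch of the status-code idiom, the program logic can live in a helper that returns the status, with main simply forwarding to it. The helper name run and the one-argument usage check are assumptions made for this example.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical helper holding the program logic. A real program would
// forward to it from main:
//   int main(int argc, char *argv[]) { return run(argc, argv); }
int run(int argc, char *argv[]) {
    if (argc < 2) {
        std::fprintf(stderr, "usage: %s <name>\n",
                     argc > 0 ? argv[0] : "prog");
        return EXIT_FAILURE;  // explicit failure status
    }
    std::printf("Hello %s\n", argv[1]);
    return EXIT_SUCCESS;      // explicit success status
}
```

Every path through run ends in an explicit return, so the status passed back to the operating system never depends on compiler defaults.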