Aussie AI
C++ Variable Bugs
Bonus Material for "Generative AI in C++"
by David Spuler, Ph.D.
Variable Bugs
Non-initialized local variables
This problem occurs when a variable is used before it has been assigned a value, either by explicit initialization or an assignment statement. For example, the function below is incorrect because the local variable "sum" is not set with an initial value of zero:
int sum(int n)
{
    int sum;  // Bug
    for (int i = 1; i <= n; i++)
        sum += i;
    return sum;
}
C++ does not require local variables of built-in types to be initialized to zero, although some compilers do so in certain build modes. If not explicitly initialized, the variable's value is indeterminate at the start of the function, and reading it is undefined behavior. Some compilers and linting tools will detect this problem and warn about "used before set" variables.
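A minimal sketch of a corrected version follows, assuming the intent is to add the integers from 1 to n. The names sum_to and total are illustrative choices, not from the original, picked so that the local variable does not reuse the function's name.

```cpp
// Sketch of a corrected version; the names sum_to and total are hypothetical.
int sum_to(int n)
{
    int total = 0;  // explicit initialization fixes the bug
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}
```

With this version, a call such as sum_to(10) returns 55 regardless of what happened to be in memory beforehand.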
Single letter variables
Novice programmers learning their first language may not immediately appreciate the significance of the single quotes around character constants. An occasional source of errors is a test such as:
if (A <= ch && ch <= Z)
And you'll find that there are declarations given for the variables A and Z, as a faulty fix, because the compiler complained that they were not declared! The correct expression is:
if ('A' <= ch && ch <= 'Z')
Programmers using such expressions should be encouraged to use the library functions in <cctype> (the C++ form of <ctype.h>) instead. The expression "isupper(ch)" is more portable than the hand-coded test above, which assumes that the upper-case letters are contiguous in the character set. It is also more readable, and is likely to be at least as efficient.
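As a hedged sketch, the library test might be wrapped as shown below. The helper name is_upper_letter is hypothetical. The cast to unsigned char is there because the <cctype> functions expect a value representable as unsigned char (or EOF), and a plain char can be negative on some platforms.

```cpp
#include <cctype>

// Hypothetical helper: wraps std::isupper with the cast to unsigned char
// that keeps negative char values from being passed to the library.
bool is_upper_letter(char ch)
{
    return std::isupper(static_cast<unsigned char>(ch)) != 0;
}
```

In the default "C" locale this accepts 'A' through 'Z' and rejects everything else, with no variables named A or Z in sight.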
Missing structure initializers
There is a very neat syntax for the initialization of arrays and structures, using a comma-separated list of expressions inside a pair of braces. Unfortunately, the compiler will rarely warn about having too few initializers, and it is a common error to accidentally omit an expression. The code below is an example of this situation:
struct { int x,y,z; } a = { 10, 20 };  // Bug?

Note that the initializer for "z" is missing. The uninitialized fields are set to zero, but this is not adequate if another value was intended to be stored.
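The sketch below contrasts the two cases, under the assumption that a third value of 30 was intended. The type name Point3 and both function names are hypothetical, introduced only so the struct can be returned from a function.

```cpp
struct Point3 { int x, y, z; };  // hypothetical named version of the struct

// Too few initializers: z receives no explicit value.
Point3 make_partial()
{
    Point3 a = { 10, 20 };      // z is implicitly set to zero
    return a;
}

// Every field spelled out, so nothing is left to the default.
Point3 make_full()
{
    Point3 a = { 10, 20, 30 };  // 30 is an assumed intended value
    return a;
}
```

Both functions compile cleanly, which is the point: the language treats the short list as valid, so the omission has to be caught by review or by an optional compiler warning.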
Scopes and inner redeclaration
Variables can be declared in inner scopes with the same names as variables in outer scopes, without even a compilation warning. This is often called "inner redeclaration" or "shadowing." It is a very convenient feature but can occasionally lead to confusion.
Redeclaration of global variables
An example of inner redeclaration is a local variable having the same name as a global variable:
int x = 0;  // Global variable

int main()
{
    int x = 0;  // Local variable
    ...
}
Although problems are quite rare, it is possible that a programmer will use the name x intending the global variable to be accessed, but instead the local variable is accessed. With the above declarations, the plain name x inside main always refers to the local variable, since the global variable is hidden by it.
Various styles can prevent this error. One is to declare all global variables with a prefix "g" or "g_" in their name. Another is that the "::" (double colon) operator can be used to access global variables:
::x = 3; // Access global x
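A small sketch of the two names side by side is given below. The function name set_both and the particular values are hypothetical.

```cpp
int x = 0;  // global variable

// Hypothetical function: assigns to both variables named x.
int set_both()
{
    int x = 1;  // local variable hides the global
    ::x = 3;    // scope resolution operator reaches the global
    x = 5;      // plain name refers to the local
    return x;   // returns the local value
}
```

After a call to set_both(), the global x holds 3 and the function has returned the local value 5.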
Redeclaration of local variables
Redeclaration of a variable is also possible in different blocks in a program. Every pair of braces introduces a new scope in C++, and variable declarations are permitted in every scope. These variables can use the same names as outer scope variables. Such variables are local to the innermost enclosing block and are often called "block-local" declarations. The following code is legal in C++:
int main()
{
    int i;
    for (i = 1; i <= 10; i++) {
        int i;  // declare block-local variable
        printf("%d\n", i);  // Uses block-local variable
    }
}
In this code the "outer" loop variable i is hidden by the declaration of the "inner" block-local variable. Although the above code is obviously erroneous, there are valid uses for hiding a name by declaring a variable in a new scope, such as to introduce a temporary variable.
As a special case, the common for loop idiom declares "i" to have scope to the closing brace of the loop body:
for (int i = 1; i <= 10; i++) {
    // ...
}
Although hidden global variables can be accessed in C++ using the :: operator, this is not possible for hidden local variables. Hidden local variables cannot be accessed at all while another declaration hides the name.
A more confusing situation arises when i is also declared in an outer block: the loop's declaration of i hides the outer declaration for the duration of the loop, and a reader may misjudge which variable is accessed afterwards. This is the case below, where the if statement introduces an extra scope level between the two declarations of i:

int main(void)
{
    int i = 0;
    if (some_test()) {
        for (int i = 1; i < 10; i++) {
            do_something(i);
        }
        std::cout << "i = " << i << "\n";  // confusing: which i?
    }
    std::cout << "i = " << i << "\n";  // safe
}

In standard C++, the first output statement prints 0, not 10, because the i declared in the for loop goes out of scope at the closing brace of the loop body, so the name refers to the outer local variable again. Some pre-standard compilers instead extended the loop variable's scope to the end of the enclosing block, in which case the first statement printed 10. In either case, once the enclosing scope (from the if statement) closes, using i refers to the outer local variable.
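To make the standard C++ scoping rule concrete, here is a minimal sketch that reports which i is visible after a loop. The function name loop_scope_demo and the variable total are hypothetical.

```cpp
#include <utility>

// Hypothetical demonstration of for-loop scope in standard C++.
// Returns {value of the outer i after the loop, sum of the loop's own i values}.
std::pair<int, int> loop_scope_demo()
{
    int i = 0;                      // outer variable
    int total = 0;
    for (int i = 1; i < 10; i++) {  // this i hides the outer i inside the loop
        total += i;                 // 1 + 2 + ... + 9
    }
    return { i, total };            // the loop's i is gone; this is the outer i
}
```

The first element of the result is 0, since the outer variable is never modified, and the second is 45.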
It is also possible to run into problems with nested loops:
for (int i = 0; i < N; i++) {
    for (int i = 0; i < M; i++) {  // Bug (should be j)
        // ...
    }
}
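A corrected nested loop might look like the sketch below, where the loop body needs both indices and so would go wrong if the inner loop reused the name i. The function name sum_flat_indices and the row-major index calculation are assumptions made for illustration.

```cpp
// Hypothetical helper: sums the flattened (row-major) indices of an
// N-by-M grid, which requires both loop variables in the inner body.
int sum_flat_indices(int N, int M)
{
    int total = 0;
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < M; j++) {  // distinct name keeps the outer i usable
            total += i * M + j;
        }
    }
    return total;
}
```

For a 2-by-3 grid the indices are 0 through 5, so the result is 15.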
Shadowing function parameter names
Another type of scope error is local variables that "shadow" a function parameter name.
int fn(int x, int y)
{
    int x = 0;  // Error: hides parameter name
    // ...
}
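In the example above the intent is for the local variable to be used in place of the parameter x. As written, the example gets a compile error in C++, because both variables are in the same scope: a parameter name cannot be redeclared in the outermost block of the function body. However, if the second declaration were nested deeper in braces, it would no longer be an error, and the parameter would be silently hidden inside that block.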
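The nested case can be sketched as follows. This is an illustrative variant of fn, not the author's code, and the arithmetic exists only to show which x each line reads.

```cpp
// Hypothetical variant of fn: the parameter x is hidden inside a nested block.
int fn(int x, int y)
{
    int result = x + y;  // uses the parameter x
    {
        int x = 0;       // legal: a new scope, so this hides the parameter
        result += x;     // adds 0, not the parameter's value
    }
    return result + x;   // the parameter x is visible again here
}
```

A call such as fn(2, 3) returns 7: the sum 5, plus 0 from the block-local x, plus 2 from the parameter.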
Initialization squeezing out the null byte
The initialization of character arrays with string constants requires enough room for the null byte. If the number of letters in the string is exactly equal to the specified size of the array, the C++ compiler should give an error:
char s[3] = "YES"; // Error
The null byte is needed at the end of the string, so this declaration is wrong. A better style is to let the compiler count the characters:
char s[] = "YES"; // Correct
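A short sketch can confirm the size the compiler chooses. The helper name yes_size is hypothetical, and the array is declared const here only because it is never modified.

```cpp
#include <cstddef>

const char s[] = "YES";  // the compiler sizes the array, null byte included

// Compile-time check: three letters plus the terminating null byte.
static_assert(sizeof(s) == 4, "three letters plus the null byte");

// Hypothetical helper returning the array size in bytes.
std::size_t yes_size()
{
    return sizeof(s);
}
```

The array occupies four bytes, and the last element s[3] is the null byte that the explicit size of 3 left no room for.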