Aussie AI
Control Flow Bugs
-
Bonus Material for "Generative AI in C++"
-
by David Spuler, Ph.D.
Control Flow Bugs
There are a number of common errors in the use of the C++ flow of control statements. Some of these errors are particularly hard to track down because control can pass through a function along many different paths and the error may only occur on one particular path. Some common types of errors include:
- Semicolon is a statement
- Empty loops
- Infinite loops
- Never-executed loops
- Conditional if test is always true or always false
- Conditional test logic reversed (e.g. missing "!")
- Loop test is always true or always false
- Case fallthrough (missing break)
- Function not returning on all paths
- Unreachable code
Accidental empty loop
A common novice error with loops is to place a semicolon just after the header of a for or while loop. Syntactically, this is correct, so the compiler gives no error message. However, it changes the meaning of the loop. For example, consider the code:
for (i = 1; i <= 10; i++); /* Extra semicolon */ { ... // body of loop }
This is interpreted as:
for (i = 1; i <= 10; i++) ; // empty loop { ... // body of loop executed only once }
Semicolons are statements in C++. The effect of this is that the body of the loop is assumed to be an empty loop by the compiler. The block after the loop header (the real loop body) is only executed after the loop has finished, and is executed only once. Worse still, the accidental empty loop may cause an infinite loop if the condition is not being changed in the header.
Infinite Loops
There are many ways for a C++ loop to go infinite:
- Forgotten incrementer (you forgot "i++")
- Wrong incrementer (should be "j++" not "i++")
- Forgotten break (your intentionally infinite loop has no break statement)
- Looping down to zero with "i++" (should be "i--")
- Missing incrementer on one path
- Loop test is always true.
Unreachable statements
It is possible to have code in a program that cannot be reached under any circumstances. This type of code occasionally indicates the presence of a bug, but more commonly indicates that unnecessary code has not been removed. Good compilers should produce a warning message — the Unix lint utility finds many instances of the error.
Mismatched else clauses
The rule that an else always matches the closest if is usually satisfactory. Howev er, there are occasions where "dangling else" errors can arise in nested if statements such as:
if (y < 0) if (x < 0) x = 0; else // Bug! y = 0;
Based on the indentation used by the programmer, the else clause is presumably intended to match the first if. However, the compiler matches the else with the second (closest) if, and compiles the code as if it were written as:
if (y < 0) { if (x < 0) x = 0; else y = 0; }
The method of avoiding this error is to always use braces around the inner if statement when using nested if statements.
if (y < 0) { // Correct if (x < 0) x = 0; } else y = 0;
Dead assignments
An assignment statement that has no effect because a later assignment always overwrites the value is sometimes called a "dead assignment" or "dead definition." In many cases a dead assignment is merely indicative of old obsolete code statements that have not been removed. However, occasionally it can arise because of a programming error. An example of this situation is:
if (*s == '\0') empty = true; empty = false;
The assignment of true to empty is "dead" because it will always be overwritten by the second assignment, setting it to false. In this case the error is that the source code line containing "else" has been accidentally lost (as suggested by the indentation).
Although dead assignments are rarely caught by the compiler, I've seen one example where the optimization phase of compilation warned that it was removing an assignment because it had no effect.
goto bypasses local variable initializations
A rare compilation error involving the goto statement occurs when a goto jumps to a label in the middle of a compound statement. This jump will bypass any initializations of variables at the start of the block. The C++ language prohibits programs that use goto in this manner, by requiring a compilation error to be emitted when an initialization is bypassed.
goto middle; // Error if (... ) { int j = 1; // initialization skipped middle: ... }
Looping Down Mismatch #1
The code to initialize a vector:
for (int i = 0; i < n; i++) { // Correct v[i] = 0.0f; }
The reversed vector again is:
for (int i = n - 1; i >= 0; i++) { // Bug v[i] = 0.0f; }
The problem is that we forgot to change "i++" to "i--". This happens all the time in code maintenance. The result is an infinite loop.
Looping Down Mismatch #2
Here's the reversed vector loop again:
for (int i = n - 1; i < 0; i--) { // Bug v[i] = 0.0f; }
Here's we've forgotten to change "i < 0" to "i >= 0", because we've changed "n" to "0", but not fixed the "<" operator. In this bug, the loop condition fails immediately, and the loop body never gets executed. Nothing gets changed in the vector.
Looping Down Array Bound Error
The code to initialize a vector in a normal forwards loop:
for (int i = 0; i < n; i++) { // Correct v[i] = 0.0f; }
So, what if we just reverse it (e.g. for data locality):
for (int i = n; i >= 0; i--) { // Bug v[i] = 0.0f; }
Note that we have correctly used an "i--" iterator and the condition test has changed to "i>=0". But there's still a bug. The first iteration of the loop will begin at "i==n" which then tries to set "v[n]". But arrays in C++ go from 0 to n-1, not from 1 to n, so "v[n]" is an array out-of-bounds error. We need to change the start to be "i=n-1" not "i=n", giving the better code:
for (int i = n - 1; i >= 0; i--) { // Correct v[i] = 0.0f; }
Looping Down Off-by-One Error
Consider the reversed loop:
for (int i = n - 1; i > 0; i--) { // Bug v[i] = 0.0f; }
This code has correctly changed the start to be "n-1" and "i++" changed to "i--", and we also changed the loop condition "i < n" to "i > 0". Is that correct?
Nope. The loop test "i > 0" is actually wrong. It misses one case (i==0), and the loop stops one iteration too early. The vector element "v[0]" will never get initialized. This is a type of "off-by-one" error. The corrected code has to do "i >= 0" so that it still executes when "i==0":
for (int i = n - 1; i >= 0; i--) { // Correct v[i] = 0.0f; }
Looping Down to Zero with size_t
Nothing looks wrong with this code for printing the numbers 0 to 10, using "size_t" rather than "int" for the type. This is technically correct, and also helps the compiler optimize the code as well. The code looks right:
for (size_t i = 0; i <= 10; i++) // Correct cout << "i = " << i << "\n";
But what if we reverse the loop to count down? We've carefully changes both "<=" to ">=" and "i++" to "i--", so it all looks ready to run:
for (size_t i = 10; i >= 0; i--) // Bug! cout << "i = " << i << "\n";
See the bug? The problem is that "size_t" is an unsigned type on most platforms. You should hopefully get a compiler warning because the test "size_t >= 0" will always be true, so this is an infinite loop. When "i" is zero, the "i--" will underflow the unsigned number back up to positive infinity (about 4 billion if size_t type is 32-bits). You cannot use "size_t" variables to loop down to zero.
Looping Down Error Summary
If we are changing a C++ for loop to run the loop in reverse ending at zero (e.g. we're doing the "looping to zero" or "loop reversal" optimizations), we need to make sure to change several things:
- "i++" changes to "i--" (otherwise an infinite loop)
- i starts at "n-1" not "n" (otherwise an array bound error)
- i tests with "0" not "n" (otherwise it's not reversed)
- i test is changed from "<" (otherwise the loop never executes)
- i tests with 0 using ">=" not ">" (otherwise it skips i==0)
- i is "int" type not "unsigned" (infinite loop)
- i is not "size_t" or other hidden unsigned type (looping for a while)
Mismatched Assertion Condition
Can you spot the bug in this code?
float vector_sum_assert_example3(float v[], int n) { if (v != NULL) { yassert(v != NULL); return 0.0; } float sum = 0; for (int i = 0; i < n; i++) { sum += v[i]; } return sum; }
The idea is to test for "v" being NULL, and then trigger an assertion, but also tolerate "v==NULL". However, the cut-pasted code has the wrong condition, with "v!=NULL" that should be "v==NULL" or "!v". As coded above, this function will always return zero (and silently). The fixed code is:
if (v == NULL) { // Fixed! yassert(v != NULL); return 0.0; }
The reverse bug can also occur in this idiom, where the "if" condition is correct, but the assertion is wrong:
if (v == NULL) { yassert(v == NULL); // Wrong! return 0.0; }
This is another cut-paste coding error. The code as written will simply tolerate "v==NULL" without crashing, but will never emit any assertion failure.