Aussie AI
Loop Distribution
-
Book Excerpt from "Generative AI in C++"
-
by David Spuler, Ph.D.
Loop Distribution
Loop distribution is type of loop code motion that creates two loops
from a single loop that contain an “if
” statement.
The hoisted code is a conditional test.
Some early papers in the 1990s called it “loop unswitching.”
Some papers use the term “loop distribution” with the different meaning
of splitting a loop into two loops, which we call “loop fission.”
The goal of loop distribution is to move an “if
” test out of the loop body,
by creating two loops,
and ends up creating two separate loops on two pathways.
This sounds similar to loop fission, but loop distribution is a more general optimization that doesn't
require parallelization to get a speed improvement (whereas loop fission does).
Instead, loop distribution gets a benefit in ordinary sequential execution
because it moves the if
-test computation out of the loop body
to a once-only pre-initialization test (i.e. “hoisted”).
Note that only one of the two loops is executed
each time, and these two loops are never executed in parallel,
so this technique is not really a type of loop fission.
Example: Loop Distribution: Here's a dummy example of implementing an “add-or-subtract” function using a passed-in Boolean flag.
void aussie_vector_addition_slow( float v[], int n, bool do_add, float scalar) { for (int i = 0; i < n; i++) { if (do_add) v[i] += scalar; // Add else v[i] -= scalar; // Subtract } }
The problem is that the test “if(do_add)
” is computed for every loop iteration,
and yet “do_add
” is a loop-invariant flag variable.
The faster version is to use loop distribution
to move the if
-test into the loop initialization, and then split the two pathways inside the loop to instead have two separate loops.
Here's the faster version:
void aussie_vector_addition_loop_distribution( float v[], int n, bool do_add, float scalar) { if (do_add) { // Add scalar for (int i = 0; i < n; i++) { v[i] += scalar; // Add } } else { // Subtract scalar for (int i = 0; i < n; i++) { v[i] -= scalar; // Subtract } } }
This example is still far from optimal. For starters, it should be using pointer arithmetic rather than array indices.
• Next: • Up: Table of Contents |
The new AI programming book by Aussie AI co-founders:
Get your copy from Amazon: Generative AI in C++ |