Aussie AI
Timing C++ Code
Book Excerpt from "Generative AI in C++"
by David Spuler, Ph.D.
There are several reasons to time the execution of a program. Timing C++ code can show which statements should be optimized, whereas profilers may only indicate which functions are consuming time. Timing can also reveal the relative efficiency of various operations and give you valuable information about writing code for your machine (e.g. is shifting faster than integer multiplication?).
The time Command.
If the full execution time for a program is all that is needed, the Linux time command can be used to measure the time required by a program. There are two versions: a stand-alone utility (in /bin or /usr/bin, depending on the system) and a command built into shells such as csh and bash. The command to run is usually:
    time ./a.out
A different executable name could also be used and command line arguments can also be specified.
Code Instrumentation. If a more detailed speed analysis is needed, it is possible to add C++ self-instrumentation code to your program to monitor its own performance. The basic idea is to use the standard library functions to monitor the time before and after an action.
The most useful function is the “clock” function, which counts the number of clock ticks since the program began executing. The “time” function, which keeps track of the real calendar time, could also be used, but it is not a true indication of processor time on a large multi-user system. The clock function is correct for both single-user and multi-user systems.
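As a minimal sketch of the difference, the helper below times the same loop with both functions. The names `TimingPair` and `measure_busy_loop` are made up for illustration and are not from the book. On a busy machine the `time` figure can exceed the `clock` figure, and it only resolves to whole seconds.

```cpp
#include <ctime>

// Hypothetical result type for this sketch: the same work measured two ways.
struct TimingPair {
    double cpu_seconds;   // from clock(): processor time used by this program
    double wall_seconds;  // from time(): elapsed calendar time, whole seconds
};

// Run a simple busy loop and measure it with both clock() and time().
TimingPair measure_busy_loop(long iterations) {
    volatile unsigned long sink = 0;  // volatile so the loop is not optimized away
    std::clock_t c0 = std::clock();
    std::time_t t0 = std::time(nullptr);
    for (long i = 0; i < iterations; i++)
        sink = sink + (unsigned long)i;
    std::clock_t c1 = std::clock();
    std::time_t t1 = std::time(nullptr);

    TimingPair result;
    result.cpu_seconds = (double)(c1 - c0) / CLOCKS_PER_SEC;
    result.wall_seconds = std::difftime(t1, t0);
    return result;
}
```

Calling `measure_busy_loop(100000000)` and printing both fields is one way to compare the two clocks on your own machine.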
The clock function returns a value of type clock_t (typically long or int) that counts the number of clock ticks. This value can be converted to seconds by dividing by the constant CLOCKS_PER_SEC, also declared in <time.h>.
The basic idea of timing C++ code blocks is to call the clock function before and after an operation and examine the difference between the two tick counts. The code below examines the relative speed of shift and multiplication operations on int operands.
    #include <stdio.h>
    #include <time.h>

    void profile_shifts()
    {
        const int MILLION = 1000000;
        const int ITERATIONS = 100 * MILLION;
        // volatile stops the optimizer from removing the loops as dead code
        volatile int x = 1, y = 2, z = 3;

        clock_t before = clock();
        for (int i = 0; i < ITERATIONS; i++)
            x = y << z;
        printf("%d Shifts took %f seconds\n", ITERATIONS,
            (double)(clock() - before) / CLOCKS_PER_SEC);

        before = clock();
        for (int i = 0; i < ITERATIONS; i++)
            x = y * z;
        printf("%d Multiplications took %f seconds\n", ITERATIONS,
            (double)(clock() - before) / CLOCKS_PER_SEC);
    }
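The before-and-after pattern can be wrapped in a small reusable helper. The following is only a sketch: `cpu_seconds_for` and `shift_loop` are hypothetical names introduced here, not part of the book's code.

```cpp
#include <ctime>

// Hypothetical helper: run `work` once and return the processor time it
// used, in seconds, as the difference between two clock() calls.
template <typename Work>
double cpu_seconds_for(Work work) {
    std::clock_t before = std::clock();
    work();
    std::clock_t after = std::clock();
    return (double)(after - before) / CLOCKS_PER_SEC;
}

// Example workload: repeated shifts on volatile operands, so the
// optimizer cannot remove the loop.
void shift_loop() {
    volatile int x = 1, y = 2, z = 3;
    for (int i = 0; i < 1000000; i++)
        x = y << z;
}
```

A call such as `cpu_seconds_for(shift_loop)` returns the elapsed processor seconds for one run. A lambda can be passed in the same way.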
clock Portability Pitfall.
Note that some implementations on older Unix versions don’t conform to the C++ standard and return the number of clock ticks since the first call to the clock function. This means that a single call to clock at the end of the program would always return zero. Hence, it is more portable to measure the number of clock ticks between two calls to clock, one at the start and one at the end.
Obviously, you can also put the first call to “clock” at the start of the “main” function to avoid this rare glitch.
Note that on implementations that are correct, a call at the start of “main” may be non-zero due to the overhead of global and static C++ object instantiations (i.e. constructors for global objects), which occur before entering main.
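One way to make the two-call approach routine is to capture a baseline tick count once and always report differences from it. The class below is a sketch under that assumption. `TickBaseline` is a hypothetical name, not something from the book or the standard library.

```cpp
#include <ctime>

// Hypothetical helper: remembers a baseline tick count and reports elapsed
// processor time relative to it, so it never relies on clock() starting
// from zero at program launch.
class TickBaseline {
public:
    TickBaseline() : start_(std::clock()) {}

    // Ticks used since construction (difference of two clock() calls).
    std::clock_t elapsed_ticks() const { return std::clock() - start_; }

    // The same interval converted to seconds.
    double elapsed_seconds() const {
        return (double)elapsed_ticks() / CLOCKS_PER_SEC;
    }

private:
    std::clock_t start_;
};
```

Constructing a `TickBaseline` as the first statement in main gives a starting point that works on both conforming and non-conforming implementations.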
Clock Tick Integer Division Pitfall.
Note that the clock_t type and the CLOCKS_PER_SEC constant are both integers on most implementations.
Hence, here's a bug:
    clock_t diff = clock() - before;
    double seconds = diff / CLOCKS_PER_SEC; // Bug!
The problem is that it's integer division, so the result is truncated to a whole number of seconds before it is stored in the double.
You need a typecast to float or double on either side of the division operator.
    clock_t diff = clock() - before;
    double seconds = diff / (double)CLOCKS_PER_SEC; // Correct
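To see the truncation with concrete numbers, the sketch below uses plain long values in place of clock_t and CLOCKS_PER_SEC. The function names and the figures used with them (1500 ticks at 1000 ticks per second) are made up for illustration.

```cpp
// Integer division: the fractional part of the result is discarded.
long truncated_seconds(long ticks, long ticks_per_sec) {
    return ticks / ticks_per_sec;
}

// Casting one operand to double first forces floating-point division.
double exact_seconds(long ticks, long ticks_per_sec) {
    return (double)ticks / ticks_per_sec;
}
```

With 1500 ticks at 1000 ticks per second, `truncated_seconds` returns 1 while `exact_seconds` returns 1.5. Any interval shorter than one second truncates to 0.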
Clock Tick Overflow Pitfall.
The clock function also has a problem with wraparound on some implementations. Because of its high resolution, the number of clock ticks can quickly overflow the maximum value that can be stored by the type clock_t. On one system the clock function will wrap around after only 36 minutes. If the program being timed runs for longer than this period, the use of clock can be misleading. One solution is to use the “time” function rather than “clock” when executions are longer, but this usually only has resolution to the nearest second.
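The 36-minute figure is consistent with a signed 32-bit clock_t and a CLOCKS_PER_SEC of 1,000,000. Both are assumptions here, since the text does not name the system. The arithmetic can be sketched as follows, with hypothetical function names:

```cpp
#include <cstdint>

// Seconds until a tick counter with the given maximum value wraps around.
double seconds_until_wrap(double max_ticks, double ticks_per_sec) {
    return max_ticks / ticks_per_sec;
}

// Assumed configuration: signed 32-bit clock_t, one million ticks per second.
// 2147483647 / 1000000 is about 2147 seconds, or just under 36 minutes.
double wrap_minutes_32bit_microsecond_ticks() {
    return seconds_until_wrap((double)INT32_MAX, 1000000.0) / 60.0;
}
```

On systems where clock_t is 64 bits wide, the same calculation gives a wraparound time far longer than any realistic program run.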