Timing ArrayFire Code {#timing}
============

In performance-sensitive applications, it is vital to profile and measure the
execution time of operations. ArrayFire provides mechanisms to achieve this.

ArrayFire employs an asynchronous evaluation model for all of its
functions. This means that operations are queued to execute but do not
necessarily complete prior to function return. Hence, directly measuring the
time taken for an ArrayFire function could be misleading. To accurately
measure time, one must ensure the operations are evaluated and synchronize the
ArrayFire stream.

ArrayFire also employs a lazy evaluation model for its elementwise arithmetic
operations. This means that these operations are not queued for execution until
their result is needed by a downstream operation, which then blocks until they
are complete.

The following describes how to time ArrayFire code using the eval and sync
functions along with the timer and timeit functions. A final note on kernel
caching also provides helpful details about ArrayFire runtimes.

## Using ArrayFire eval and sync functions

ArrayFire provides functions to force the evaluation of lazy functions and to
block until all asynchronous operations complete.

1. The [eval](\ref af::eval) function:

Forces the evaluation of an ArrayFire array. It ensures the execution of
operations queued up for a specific array.

It is only required for timing purposes if elementwise arithmetic functions
are called on the array, since these are handled by the ArrayFire JIT.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp}
af::array A = af::randu(1000, 1000);
af::array B = A + A; // Elementwise arithmetic operation.
B.eval();            // Forces evaluation of B.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The function initiates the evaluation of the JIT tree for that array and
may return prior to the completion of those operations. To ensure proper
timing, combine with a [sync](\ref af::sync) function.

2. The [sync](\ref af::sync) function:

Synchronizes the ArrayFire stream. It waits for all the previous operations
in the stream to finish. It is often used after [eval](\ref af::eval) to
ensure that operations have indeed been completed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp}
af::sync(); // Waits for all previous operations to complete.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
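
The evaluate-then-synchronize pattern described above can be wrapped in a small helper. The following is a minimal, library-free sketch rather than ArrayFire code: `ToyQueue` and `time_synced` are hypothetical names introduced only for illustration, and `ToyQueue` merely imitates a queue whose work does not run until it is synchronized. It shows why a measurement taken without synchronization under-reports the cost of queued work.

```cpp
#include <chrono>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an asynchronous device queue (not ArrayFire):
// enqueue() returns immediately; recorded work only runs inside sync().
struct ToyQueue {
    std::vector<std::function<void()>> pending;

    void enqueue(std::function<void()> work) { pending.push_back(std::move(work)); }

    void sync() {
        for (auto& work : pending) work();
        pending.clear();
    }
};

// Returns the seconds taken by `section`. `sync` is called before the clock
// starts, so earlier work is not measured, and again before the clock stops,
// so the section's own work has finished.
template <typename Section, typename Sync>
double time_synced(Section&& section, Sync&& sync) {
    using clock = std::chrono::steady_clock;
    sync();
    const auto t0 = clock::now();
    section();
    sync();
    const auto t1 = clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```

With ArrayFire, the `sync` argument would presumably be a lambda that calls `af::sync()`, and `section` would end with a call such as `B.eval()` so that lazily evaluated operations are actually queued. Passing an empty lambda as `sync` reproduces the misleading measurement: only the time needed to enqueue the work is recorded.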

## Using ArrayFire timer and timeit functions

ArrayFire provides simple timer functions that return the current time in
seconds.

1. The [timer](\ref af::timer) function:

timer() : A platform-independent timer with microsecond accuracy:
* [timer::start()](\ref af::timer::start) starts a timer

* [timer::stop()](\ref af::timer::stop) returns the seconds since the last \ref
  af::timer::start "start"

* \ref af::timer::stop(af::timer start) "timer::stop(timer start)" returns the
  seconds since the given `start`

Example: single timer

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp}
// start timer
// - be sure to use the eval and sync functions so that previous code
//   does not get timed as part of the execution segment being measured
timer::start();
// run a code segment
// - be sure to use the eval and sync functions to ensure the code
//   segment operations have been completed
// stop timer
printf("elapsed seconds: %g\n", timer::stop());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Example: multiple timers

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp}
// start timers
// - be sure to use the eval and sync functions so that previous code
//   does not get timed as part of the execution segment being measured
timer start1 = timer::start();
timer start2 = timer::start();
// run a code segment
// - be sure to use the eval and sync functions to ensure the code
//   segment operations have been completed
// stop timer1
printf("elapsed seconds: %g\n", timer::stop(start1));
// run another code segment
// - be sure to use the eval and sync functions to ensure the code
//   segment operations have been completed
// stop timer2
printf("elapsed seconds: %g\n", timer::stop(start2));
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Accurate and reliable measurement of performance involves several factors:
* Executing enough iterations to achieve peak performance.
* Executing enough repetitions to amortize any overhead from system timers.
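
The two factors listed above can be sketched by hand as follows. This is a minimal illustration in standard C++ and not a description of how ArrayFire measures anything internally. The helper name `mean_seconds` and its `warmup` and `repetitions` parameters are assumptions made for this example.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not part of ArrayFire): mean seconds per call of `fn`.
// `warmup` untimed calls run first; the clock is then read only twice around
// `repetitions` calls, so timer overhead is spread across all of them.
template <typename Fn>
double mean_seconds(Fn&& fn, int warmup, int repetitions) {
    assert(repetitions > 0);
    using clock = std::chrono::steady_clock;
    for (int i = 0; i < warmup; ++i) fn();
    const auto t0 = clock::now();
    for (int i = 0; i < repetitions; ++i) fn();
    const auto t1 = clock::now();
    return std::chrono::duration<double>(t1 - t0).count() / repetitions;
}
```

When `fn` wraps asynchronous code, it would need to finish with a synchronization call (for ArrayFire, `af::sync()`); otherwise the loop only measures how quickly work can be queued.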

2. The [timeit](\ref af::timeit) function:

To take care of much of this boilerplate, [timeit](\ref af::timeit) provides
accurate and reliable estimates for both CPU and GPU code.

Here is a stripped-down example of [Monte-Carlo estimation of PI](\ref
benchmarks/pi.cpp) making use of [timeit](\ref af::timeit). Notice how it
expects a `void` function pointer.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp}
#include <stdio.h>
#include <arrayfire.h>
using namespace af;
void pi_function() {
    int n = 20e6; // 20 million random samples
    array x = randu(n, f32), y = randu(n, f32);
    // how many fell inside unit circle?
    float pi = 4.0 * sum<float>(sqrt(x*x + y*y) < 1) / n;
}
int main() {
    printf("pi_function took %g seconds\n", timeit(pi_function));
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This produces:

    pi_function took 0.007252 seconds
    (test machine: Core i7 920 @ 2.67GHz with a Tesla C2070)

## A note on kernel caching

The first run of ArrayFire code exercises any JIT compilation in the
application, automatically saving a cache of the compilation to
disk. Subsequent runs load the cache from disk, executing without
compilation. Therefore, it is typically best to "warm up" the code with one
run to populate the application's kernel cache. Afterwards, subsequent runs do
not include the compile time and tend to be faster than the first run.

Averaging the time taken over several runs is generally the most reliable
approach, which is one reason why the [timeit](\ref af::timeit) function is
helpful.
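
To see how much of a first run is one-time setup, the first call can be reported separately from the later ones. The sketch below is plain standard C++ under stated assumptions: `seconds_for` and `cold_and_warm_seconds` are hypothetical helpers, not ArrayFire functions, and the callable passed in is expected to do its own synchronization before returning.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Seconds taken by a single call of `fn`.
template <typename Fn>
double seconds_for(Fn&& fn) {
    using clock = std::chrono::steady_clock;
    const auto t0 = clock::now();
    fn();
    return std::chrono::duration<double>(clock::now() - t0).count();
}

// Returns {seconds of the first call, mean seconds of `runs` later calls}.
// The first call absorbs any one-time cost such as compilation or cache loading.
template <typename Fn>
std::pair<double, double> cold_and_warm_seconds(Fn&& fn, int runs) {
    assert(runs > 0);
    const double cold = seconds_for(fn);
    double total = 0.0;
    for (int i = 0; i < runs; ++i) total += seconds_for(fn);
    return {cold, total / runs};
}
```

A large gap between the two returned values suggests that the first call was dominated by setup rather than by the computation itself, which is the case a warm-up run is meant to exclude.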