profiler.h – performance profiling
Timer based on the cycle counter
void timeit_start(timeit_t t)
void timeit_stop(timeit_t t)
Gives wall-clock and CPU (user) time, which is useful for parallel programming.
Example usage:
    timeit_t t0;

    // ...

    timeit_start(t0);
    // do stuff, take some time
    timeit_stop(t0);

    flint_printf("cpu = %wd ms wall = %wd ms\n", t0->cpu, t0->wall);
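To illustrate why the two figures are reported separately, here is a minimal sketch in standard C11 that records both processor time and wall-clock time around a region. It is not FLINT's implementation: the type `sketch_timer` and the functions `sketch_timer_start` and `sketch_timer_stop` are hypothetical names, and the sketch assumes `clock()` and `timespec_get()` are adequate clocks on the platform.

```c
#include <time.h>

/* Hypothetical stand-in for a timer object; times are in milliseconds. */
typedef struct {
    clock_t cpu_begin;           /* processor time at start */
    struct timespec wall_begin;  /* calendar time at start */
    long cpu_ms;                 /* processor time used between start and stop */
    long wall_ms;                /* real time elapsed between start and stop */
} sketch_timer;

void sketch_timer_start(sketch_timer *t)
{
    t->cpu_begin = clock();
    timespec_get(&t->wall_begin, TIME_UTC);
}

void sketch_timer_stop(sketch_timer *t)
{
    struct timespec now;
    clock_t cpu_end = clock();
    double wall;

    timespec_get(&now, TIME_UTC);

    /* clock() counts processor time for the whole process. */
    t->cpu_ms = (long) ((double) (cpu_end - t->cpu_begin) * 1000.0
                        / CLOCKS_PER_SEC);

    /* Wall time: seconds and nanoseconds combined, then converted to ms. */
    wall = (double) (now.tv_sec - t->wall_begin.tv_sec) * 1000.0
         + (double) (now.tv_nsec - t->wall_begin.tv_nsec) / 1.0e6;
    t->wall_ms = (long) wall;
}
```

In a program where several threads are busy at once, a process-wide CPU clock can advance faster than the wall clock, so comparing the two numbers gives a rough picture of how well the work was parallelised.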
void start_clock(int n)
void stop_clock(int n)
double get_clock(int n)
Gives time based on the cycle counter.
First, one must ensure that the processor speed in cycles per second is set correctly in profiler.h, in the macro definition #define FLINT_CLOCKSPEED.
One can access the cycle counter directly with get_cycle_counter(), which returns the current cycle counter as a double.
A sample usage of clocks is:
    init_all_clocks();

    start_clock(n);
    // do something
    stop_clock(n);

    flint_printf("Time in seconds is %.3f\n", get_clock(n));
where n is a clock number (from 0 to 19 by default). The number of clocks can be changed by altering FLINT_NUM_CLOCKS. One can also initialise an individual clock with init_clock(n).
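The reason the clock speed must be set correctly can be seen in a small sketch of cycle-based clocks. This is not FLINT's code: every `sketch_` and `SKETCH_` name is hypothetical, the clock speed value is purely illustrative, and cycle counts are passed in explicitly instead of being read from the hardware so that the arithmetic is easy to follow. The sketch also assumes that a clock accumulates over repeated start/stop pairs.

```c
/* Assumed clock speed in cycles per second (illustrative value only). */
#define SKETCH_CLOCKSPEED 3.0e9

/* Number of available clocks, numbered 0 .. SKETCH_NUM_CLOCKS - 1. */
#define SKETCH_NUM_CLOCKS 20

static double sketch_accum[SKETCH_NUM_CLOCKS];  /* cycles accumulated so far */
static double sketch_last[SKETCH_NUM_CLOCKS];   /* cycle count at last start */

/* Reset clock n to zero accumulated cycles. */
void sketch_init_clock(int n)
{
    sketch_accum[n] = 0.0;
}

/* Remember the cycle count at which clock n was started. */
void sketch_start_clock(int n, double now_cycles)
{
    sketch_last[n] = now_cycles;
}

/* Add the cycles elapsed since the matching start to clock n. */
void sketch_stop_clock(int n, double now_cycles)
{
    sketch_accum[n] += now_cycles - sketch_last[n];
}

/* Convert accumulated cycles to seconds using the assumed clock speed. */
double sketch_get_clock(int n)
{
    return sketch_accum[n] / SKETCH_CLOCKSPEED;
}
```

Because the conversion is a plain division by the configured speed, a value that is off by some factor makes every reported time wrong by the same factor.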
Framework for repeatedly sampling a single target
void prof_repeat(double *min, double *max, profile_target_t target, void *arg)
Allows one to automatically time a given function. Here is a sample usage:
Suppose one has a function one wishes to profile:
    void myfunc(ulong a, ulong b);
One creates a struct for passing arguments to the function:
    typedef struct { ulong a, b; } myfunc_t;
and a sample function that calls it count times:
    void sample_myfunc(void * arg, ulong count)
    {
        myfunc_t * params = (myfunc_t *) arg;
        ulong a = params->a;
        ulong b = params->b;

        for (ulong i = 0; i < count; i++)
        {
            prof_start();
            myfunc(a, b);
            prof_stop();
        }
    }
Then one runs the profile:
    double min, max;
    myfunc_t params;

    params.a = 3;
    params.b = 4;

    prof_repeat(&min, &max, sample_myfunc, &params);

    flint_printf("Min time is %.3fs, max time is %.3fs\n", min, max);
If either of the first two parameters to prof_repeat is NULL, that value is not stored.
One may set the minimum time in microseconds for a timing run by adjusting DURATION_THRESHOLD, and one may set a target duration in microseconds by adjusting DURATION_TARGET, both in profiler.h.
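The general idea of a repeat-sampling harness with a minimum run duration can be sketched in standard C as follows. This is only an illustration under stated assumptions, not the algorithm used by prof_repeat: all `sketch_` and `SKETCH_` names are hypothetical, the harness times the whole call to the target with `clock()`, the threshold and sample count are arbitrary, and results are reported in microseconds per call.

```c
#include <stddef.h>
#include <time.h>

/* Same shape as a sampling target: run the work `count` times. */
typedef void (*sketch_target_t)(void *arg, unsigned long count);

#define SKETCH_THRESHOLD_US 5000.0  /* shortest run accepted as a sample */
#define SKETCH_SAMPLES 5            /* number of samples to compare */

/* Time one call of target(arg, count), in microseconds of CPU time. */
static double sketch_run_us(sketch_target_t target, void *arg,
                            unsigned long count)
{
    clock_t begin = clock();
    target(arg, count);
    return (double) (clock() - begin) * 1.0e6 / CLOCKS_PER_SEC;
}

void sketch_repeat(double *min, double *max, sketch_target_t target, void *arg)
{
    unsigned long count = 1;
    double lo = 0.0, hi = 0.0;
    int i;

    /* Double the repetition count until one run is long enough to trust. */
    while (count < (1UL << 30)
           && sketch_run_us(target, arg, count) < SKETCH_THRESHOLD_US)
        count *= 2;

    /* Take several samples and track the fastest and slowest per-call time. */
    for (i = 0; i < SKETCH_SAMPLES; i++)
    {
        double per_call = sketch_run_us(target, arg, count) / (double) count;
        if (i == 0 || per_call < lo) lo = per_call;
        if (i == 0 || per_call > hi) hi = per_call;
    }

    /* A NULL output pointer means the caller does not want that value. */
    if (min != NULL) *min = lo;
    if (max != NULL) *max = hi;
}

/* Example target: a busy loop standing in for the function under test. */
void sketch_spin(void *arg, unsigned long count)
{
    volatile unsigned long sink = 0;
    unsigned long n = *(unsigned long *) arg;
    unsigned long i;

    for (i = 0; i < count * n; i++)
        sink += i;
}
```

A call such as `sketch_repeat(&lo, &hi, sketch_spin, &n)` with `unsigned long n = 100` fills in both values, while passing NULL for either pointer skips it. The minimum is usually the more stable figure, since the maximum tends to absorb interference from other processes.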
Memory usage
void get_memory_usage(meminfo_t meminfo)
Obtains information about the memory usage of the current process. The meminfo object contains the slots size (virtual memory size), peak (peak virtual memory size), rss (resident set size) and hwm (peak resident set size). The values are stored in kilobytes (1024 bytes). This function currently only works on Linux.
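On Linux, the four quantities above correspond to the VmSize, VmPeak, VmRSS and VmHWM fields of /proc/self/status, which the kernel reports in kB. The following sketch shows one way to collect them. It assumes that file as the data source and is not taken from FLINT: `sketch_meminfo`, `sketch_parse_line` and `sketch_get_memory_usage` are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical counterpart of the meminfo slots; all values in kB. */
typedef struct {
    unsigned long size;  /* virtual memory size      (VmSize) */
    unsigned long peak;  /* peak virtual memory size (VmPeak) */
    unsigned long rss;   /* resident set size        (VmRSS)  */
    unsigned long hwm;   /* peak resident set size   (VmHWM)  */
} sketch_meminfo;

/* Update m from one line of the status file; return 1 if the line matched. */
int sketch_parse_line(sketch_meminfo *m, const char *line)
{
    unsigned long v;

    if (sscanf(line, "VmSize: %lu kB", &v) == 1) { m->size = v; return 1; }
    if (sscanf(line, "VmPeak: %lu kB", &v) == 1) { m->peak = v; return 1; }
    if (sscanf(line, "VmRSS: %lu kB", &v) == 1)  { m->rss = v;  return 1; }
    if (sscanf(line, "VmHWM: %lu kB", &v) == 1)  { m->hwm = v;  return 1; }
    return 0;
}

/* Linux only: returns 1 on success, 0 if the status file cannot be opened. */
int sketch_get_memory_usage(sketch_meminfo *m)
{
    char line[256];
    FILE *f = fopen("/proc/self/status", "r");

    memset(m, 0, sizeof(*m));
    if (f == NULL)
        return 0;

    while (fgets(line, sizeof(line), f) != NULL)
        sketch_parse_line(m, line);

    fclose(f);
    return 1;
}
```

On systems without /proc, `sketch_get_memory_usage` returns 0 and leaves every slot at zero, which is one way to express the Linux-only limitation.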
Simple profiling macros
macro TIMEIT_REPEAT(timer, reps)
macro TIMEIT_END_REPEAT(timer, reps)
Repeatedly runs the code between the TIMEIT_REPEAT and the TIMEIT_END_REPEAT markers, automatically increasing the number of repetitions until the elapsed time exceeds the timer resolution. The macro takes as input a predefined timeit_t object and an integer variable to hold the number of repetitions.
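A pair of markers like this can be built from two macros that open and close the same loop. The sketch below shows the pattern in standard C. It is an assumption about how such macros might be structured, not FLINT's definition: `SKETCH_REPEAT`, `SKETCH_END_REPEAT`, `SKETCH_MIN_SECONDS` and `sketch_average_increment` are hypothetical, a plain double holds the elapsed time in place of a timer object, and the repetition count is doubled each round.

```c
#include <time.h>

/* Assumed stand-in for "the timer resolution": stop once a round is this long. */
#define SKETCH_MIN_SECONDS 0.01

/* Opening marker: starts an outer retry loop and an inner repetition loop. */
#define SKETCH_REPEAT(elapsed, reps)                                   \
    do {                                                               \
        long sketch_i_;                                                \
        (reps) = 1;                                                    \
        for (;;) {                                                     \
            clock_t sketch_begin_ = clock();                           \
            for (sketch_i_ = 0; sketch_i_ < (reps); sketch_i_++) {

/* Closing marker: ends both loops, doubling reps if the round was too short. */
#define SKETCH_END_REPEAT(elapsed, reps)                               \
            }                                                          \
            (elapsed) = (double) (clock() - sketch_begin_)             \
                        / CLOCKS_PER_SEC;                              \
            if ((elapsed) >= SKETCH_MIN_SECONDS)                       \
                break;                                                 \
            (reps) *= 2;                                               \
        }                                                              \
    } while (0)

/* Example: average CPU time of one volatile increment, in seconds. */
double sketch_average_increment(void)
{
    volatile unsigned long sink = 0;
    double elapsed;
    long reps;

    SKETCH_REPEAT(elapsed, reps)
        sink += 1;
    SKETCH_END_REPEAT(elapsed, reps);

    /* elapsed covers the final round of reps repetitions only. */
    return elapsed / (double) reps;
}
```

Dividing the elapsed time of the final round by the repetition count gives a per-repetition average, which is the kind of figure a printing variant of the macros would report.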
macro TIMEIT_START
macro TIMEIT_STOP()
Repeatedly runs the code between the TIMEIT_START and the TIMEIT_STOP markers, automatically increasing the number of repetitions until the elapsed time exceeds the timer resolution, and then prints the average elapsed cpu and wall time for a single repetition.
macro TIMEIT_ONCE_START
macro TIMEIT_ONCE_STOP()
Runs the code between the TIMEIT_ONCE_START and the TIMEIT_ONCE_STOP markers exactly once and then prints the elapsed cpu and wall time. This does not give a precise measurement if the elapsed time is short compared to the timer resolution.
macro SHOW_MEMORY_USAGE()
Retrieves memory usage information via get_memory_usage and prints the results.