How do I measure the execution time of a method in ECLiPSe CLP? Currently, I have this:
    measure_traditional(Difficulty, Selection, Choice) :-
        statistics(runtime, _),
        time(solve_traditional(Difficulty, Selection, Choice, _)),
        time(solve_traditional(Difficulty, Selection, Choice, _)),
        time(solve_traditional(Difficulty, Selection, Choice, _)),
        time(solve_traditional(Difficulty, Selection, Choice, _)),
        time(solve_traditional(Difficulty, Selection, Choice, _)),
        time(solve_traditional(Difficulty, Selection, Choice, _)),
        time(solve_traditional(Difficulty, Selection, Choice, _)),
        time(solve_traditional(Difficulty, Selection, Choice, _)),
        statistics(runtime, [_|T]),   % T = [TimeSinceLastCall], in milliseconds
        write(T).
I need to measure the time it takes to run solve_traditional(...) and write that time to a text file. However, this is not precise enough: time will sometimes print 0.015 or 0.016 seconds for the given method, but usually it prints 0.0 seconds.
Figuring the method completes too fast, I decided to make use of statistics(runtime, ...) to measure the time it takes between two runtime calls. I could then measure for example the time it takes to complete 20 method calls and divide the measured time T by 20.
The only problem is that with 20 calls, T equals either 0, 16, 32 or 48 milliseconds. Apparently it measures the time for each method call separately and sums the execution times (each of which is often just 0.0 s). This defeats the whole purpose of measuring the runtime of N method calls and dividing the time T by N.
In short: the current methods I'm using for execution time measurements are inadequate. Is there a way to make it more precise (9 decimals for example)?
Benchmarking is a tricky business in any programming language, and particularly so in CLP. Especially if you plan to publish your results, you should be extremely thorough and make absolutely sure you are measuring what you claim to measure.
See the different timers offered by the statistics/2 primitive. There is a real-time high-resolution timer that can be accessed via statistics(hr_time,T).
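As a minimal sketch of using that timer (measure/1 is a hypothetical helper name, not part of ECLiPSe): statistics(hr_time, T) returns the current real time in seconds as a float, so you can bracket a goal with two readings and take the difference.

```prolog
% Sketch: time a single goal with the high-resolution real-time timer.
measure(Goal) :-
    statistics(hr_time, T0),
    once(Goal),                 % run the goal once
    statistics(hr_time, T1),
    Seconds is T1 - T0,
    write(Seconds), nl.
```

Note this measures real (wall-clock) time, which is why the advice below about running on a quiet machine matters.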
If your benchmark runtime is too short, you have to run it multiple times.
Run enough repetitions, run on a quiet machine, make sure your results are reproducible.
In your code, you run the benchmark successively in a conjunction. This is not recommended because variable instantiations, delayed goals or garbage can survive from previous runs and slow down or speed up subsequent runs. As suggested above, you could use the pattern
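The pattern itself did not survive in this text; based on the description that follows (once/1 plus fail, repeated N times), it is a failure-driven loop along these lines, here written with ECLiPSe's for/3 do-loop, where N and Goal stand for the repetition count and the benchmarked goal:

```prolog
% Sketch: run Goal N times, undoing its computation after each run.
% Backtracking out of ( once(Goal), fail ; true ) into the true branch
% undoes Goal's bindings, so each iteration starts from a similar state.
run_n_times(N, Goal) :-
    ( for(_, 1, N), param(Goal) do
        ( once(Goal), fail ; true )
    ).
```

You would then take statistics readings before and after the whole loop and divide the elapsed time by N.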
which is essentially a way of repeating N times the sequence

    once(Goal), fail
The point of this is that the combination of once/1 and fail undoes all of Goal's computation, so that the next iteration starts as much as possible from a similar machine state. Unfortunately, this undo-process itself adds extra runtime, which distorts the measurement... You either have to make sure that the overhead is negligible, or you have to measure the overhead (e.g. by running the test framework with a dummy benchmark) and subtract it, for example:
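A sketch of that subtraction (bench/3 and the choice of true as the dummy benchmark are my own, for illustration):

```prolog
% Sketch: time N repetitions of Goal, then time N repetitions of a
% dummy goal to estimate the framework overhead, and subtract it.
bench(N, Goal, SecondsPerCall) :-
    statistics(hr_time, T0),
    ( for(_, 1, N), param(Goal) do ( once(Goal), fail ; true ) ),
    statistics(hr_time, T1),
    statistics(hr_time, T2),
    ( for(_, 1, N) do ( once(true), fail ; true ) ),   % dummy benchmark
    statistics(hr_time, T3),
    SecondsPerCall is ((T1 - T0) - (T3 - T2)) / N.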
A paper that discusses these things specifically is: Mark Wallace, Joachim Schimpf, Kish Shen and Warwick Harvey. On Benchmarking Constraint Logic Programming Platforms. CONSTRAINTS Journal (ed. E.C. Freuder), 9(1), pp. 5-34, Kluwer, 2004.