Benchmark code - dividing by the number of iterations

Posted 2019-07-03 22:06

Question:

I had an interesting discussion with my friend about benchmarking C/C++ code (or code in general). We wrote a simple function that uses getrusage to measure CPU time for a given piece of code (i.e. how much CPU time it took to run a specific function). Let me give you an example:

const int iterations = 409600;
double s = measureCPU();
for (int j = 0; j < iterations; j++)
    function(args);
double e = measureCPU();
std::cout << (e - s) / iterations << " s \n";
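
For context, a minimal sketch of what measureCPU() might look like, assuming it returns the process's user CPU time in seconds via getrusage (the post doesn't show the actual implementation, so this is an assumption):

#include <sys/resource.h>

// Assumed implementation: user CPU time of the calling process, in seconds.
// Only ru_utime is counted here; add ru_stime if system time should count too.
double measureCPU() {
    struct rusage usage;
    getrusage(RUSAGE_SELF, &usage);
    return usage.ru_utime.tv_sec + usage.ru_utime.tv_usec / 1e6;
}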

We argued about whether we should divide (e-s) by the number of iterations or not. When we don't divide, the result is in a readable form (e.g. 3.0 s), but when we do divide, it gives us results like 2.34385e-07 s ...

So here are my questions:

  1. Should we divide (e-s) by the number of iterations? If so, why?
  2. How can we print 2.34385e-07 s in a more human-readable form (let's say, 0.00000003 s)?
  3. Should we first call the function once, without measuring it, and only then measure the CPU time over the iterations, something like this:

    // first function call, don't measure it at all
    function(args);
    // real benchmarking
    const int iterations = 409600;
    double s = measureCPU();
    for (int j = 0; j < iterations; j++)
        function(args);
    double e = measureCPU();
    std::cout << (e - s) / iterations << " s \n";
    

Answer 1:

  1. If you divide the time by the number of iterations, you get an iteration-independent comparison of the run time of one function call; the more iterations, the more precise the result. EDIT: it is the average run time over n iterations.
  2. You can multiply the per-iteration time by 1e6 to get microseconds per iteration (I assume that measureCPU returns seconds); a fuller formatting sketch follows at the end of this answer:

    std::cout << 1e6 * (e - s) / iterations << " us \n";
    
  3. As @ogni42 stated, the for loop adds overhead to your measured time, so you could unroll the loop a bit to lower the measurement error: do 8 to 16 calls per loop iteration, and try different call counts to see how the measured time changes. Remember to then divide the total time by the total number of calls (iterations times the unroll factor):

    for( j = 0; j < iterations; j++ ) {
        function(args);
        function(args);
        function(args);
        function(args);
        ...
    }
    
  4. What you basically get is a lower-is-better number. If you want a higher-is-better score, you could measure different variations of the function, take the time of the fastest one, and let that one score, say, 10 points (see the sketch after this answer):

    score_for_actual_function = 10.0 * fastest_time / time_of_actual_function
    

This scoring is more or less time-independent, so you can compare different function variations directly; a slow variation can score less than one point... and beware of division by zero :)
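
To address question 2 and the scoring idea above, here is a minimal, self-contained sketch. It is an illustration, not the original poster's code: printTime and score are hypothetical helpers, and the unit thresholds are an arbitrary choice.

    #include <cstdio>

    // Hypothetical helper: print a duration given in seconds using a
    // human-readable unit (s, ms, us or ns) instead of scientific notation.
    void printTime(double seconds) {
        if (seconds >= 1.0)
            std::printf("%.3f s\n", seconds);
        else if (seconds >= 1e-3)
            std::printf("%.3f ms\n", seconds * 1e3);
        else if (seconds >= 1e-6)
            std::printf("%.3f us\n", seconds * 1e6);
        else
            std::printf("%.3f ns\n", seconds * 1e9);
    }

    // Hypothetical helper: higher-is-better score, the fastest variation gets 10.
    double score(double fastest_time, double time_of_actual_function) {
        return 10.0 * fastest_time / time_of_actual_function;  // time must be > 0
    }

    int main() {
        printTime(2.34385e-07);                                      // prints "234.385 ns"
        std::printf("%.2f points\n", score(2.0e-07, 2.34385e-07));   // roughly 8.53 points
        return 0;
    }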