I have noticed a case in Python where a block of code nested in a loop runs much faster when the loop spins continuously than when it is paced with a .sleep() interval.
I would like to understand the reason and find a possible solution.
My guess is that it is related to the CPU cache or to some mechanism of the CPython VM.
'''
Created on Aug 22, 2015
@author: doge
'''
import numpy as np
import time
import gc

gc.disable()

t = np.arange(100000)

for i in xrange(100):
    #np.sum(t)
    time.sleep(1)  # --> if you comment this line, the following lines will be much faster
    st = time.time()
    np.sum(t)
    print (time.time() - st)*1e6
result:
without the sleep in the loop, each np.sum(t) takes about 50 us
with the sleep in the loop, it takes more than 150 us
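One way to probe the CPU-cache guess would be to touch the whole array after the sleep but before starting the timer; if the slowdown comes from t being evicted from the cache while the process sleeps, this warm-up read should bring the measurement back towards the no-sleep figure. A minimal, untested sketch of that variation (not part of the measurements above):

import numpy as np
import time

t = np.arange(100000)

for i in range(100):
    time.sleep(1)
    warm = np.sum(t)  # hypothetical warm-up read: touch the whole array before timing (result unused)
    st = time.time()
    np.sum(t)
    print((time.time() - st) * 1e6)

If the timings drop back towards 50 us, cache eviction during the sleep would be a plausible cause; if they stay near 150 us, the cause is probably elsewhere.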
One disadvantage of .sleep() is that it releases the CPU, so here is a version that creates the delay with a busy counter instead, in exactly the same way as the C code further below:
'''
Created on Aug 22, 2015
@author: doge
'''
import numpy as np
import time
import gc

gc.disable()

t = np.arange(100000)
count = 0

while True:
    count += 1
    if count % 1000000 != 0:
        continue
    # --> these three lines make the following lines much slower
    st = time.time()
    np.sum(t)
    print (time.time() - st)*1e6
Another experiment (the loop is removed and the timed block is simply repeated):
st = time.time()
np.sum(t)
print (time.time() - st)*1e6
st = time.time()
np.sum(t)
print (time.time() - st)*1e6
st = time.time()
np.sum(t)
print (time.time() - st)*1e6
...
st = time.time()
np.sum(t)
print (time.time() - st)*1e6
result:
the execution time decreased gradually from 150 us to 50 us
and then stayed stable at 50 us.
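The repeated copy-paste above can also be written as a short loop that records the successive timings, which makes the gradual 150 us -> 50 us decay easier to inspect. An equivalent sketch (not the original script):

import numpy as np
import time

t = np.arange(100000)

timings = []
for i in range(20):
    st = time.time()
    np.sum(t)
    timings.append((time.time() - st) * 1e6)

# with the behaviour described above, the first entries should be the largest
print(timings)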
To find out whether this is a CPU-cache problem, I wrote a C++ counterpart and found that this kind of phenomenon does not happen there.
#include <iostream>
#include <sys/time.h>

#define num 100000

using namespace std;

long gus()
{
    struct timeval tm;
    gettimeofday(&tm, NULL);
    return ((tm.tv_sec % 86400 + 28800) % 86400) * 1000000 + tm.tv_usec;
}

double vec_sum(double *v, int n)
{
    double result = 0;
    for (int i = 0; i < n; ++i) {
        result += v[i];
    }
    return result;
}

int main()
{
    double a[num];
    for (int i = 0; i < num; ++i) {
        a[i] = (double)i;
    }

    //for(int i = 0; i < 1000; ++i){
    //    cout << a[i] << "\n";
    //}

    int count = 0;
    long st;

    while (1) {
        ++count;
        if (count % 100000000 != 0) {  // ---> I use this line to create a delay; we can do the same in Python, the result is the same
        //if (count % 1 != 0) {
            continue;
        }

        st = gus();
        vec_sum(a, num);
        cout << gus() - st << endl;
    }

    return 0;
}
result:
the time stays stable at about 250 us, no matter whether the delay uses "count % 100000000" or "count % 1"
(Not an answer, but too long to post as a comment.)

I did some experimentation and ran something slightly simpler through timeit. The first line of the output is just there to check that the sleep function uses about n_timeit * n_loop * sleep_sec seconds, so if this value is small, that should be OK. But as you can see, your finding remains: the loop with the sleep function (after subtracting the time the sleep itself uses) takes more time than the loop without sleep...

I don't think Python optimizes away the loop without sleep (a C compiler might; the variable s is never used).
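For reference, the comparison described above can be set up roughly like this (a reconstruction under my own assumptions: the exact snippet and numbers from the experiment are not reproduced here, and n_timeit, n_loop and sleep_sec are simply the names mentioned in the text):

import timeit

setup = "import time; import numpy as np; t = np.arange(100000)"

n_timeit = 10      # timeit repetitions
n_loop = 100       # iterations inside each timed statement
sleep_sec = 0.001  # sleep per iteration

stmt_sleep = (
    "for _ in range(%d):\n"
    "    time.sleep(%f)\n"
    "    s = np.sum(t)  # s is assigned but never used\n"
) % (n_loop, sleep_sec)

stmt_nosleep = (
    "for _ in range(%d):\n"
    "    s = np.sum(t)  # s is assigned but never used\n"
) % n_loop

times_sleep = timeit.repeat(stmt_sleep, setup=setup, repeat=n_timeit, number=1)
times_nosleep = timeit.repeat(stmt_nosleep, setup=setup, repeat=n_timeit, number=1)

# sanity check: the sleeps alone should account for about n_timeit * n_loop * sleep_sec seconds
print(sum(times_sleep))
# time spent outside of sleep, to compare against the loop without sleep
print(sum(times_sleep) - n_timeit * n_loop * sleep_sec)
print(sum(times_nosleep))

The second and third printed numbers are the ones to compare; according to the observation above, the loop with sleep would still come out slower even after subtracting the sleep time.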