Why does IDLE 3.4 take so long on this program?

Published 2020-08-26 03:32

Question:

EDIT: I'm redoing the question entirely. The issue has nothing to do with time.time()

Here's a program:

import time
start=time.time()
a=9<<(1<<26)             # The line that makes it take a while
print(time.time()-start)

This program, when saved as a file and run with IDLE in Python 3.4, takes about 10 seconds to finish, even though the elapsed time it prints is 0.0. The issue is very clearly with IDLE, because when run from the command line this program takes almost no time at all.

Another program that has the same effect, as found by senshin, is:

def f(): 
    a = 9<<(1<<26)

I have confirmed that this same program, when run in Python 2.7 IDLE or from the command line on python 2.7 or 3.4, is near instantaneous.

So what is Python 3.4 IDLE doing that makes it take so long? I understand that calculating this number and holding it in memory is expensive, but what I'd like to know is why Python 3.4 IDLE performs this computation when Python 2.7 IDLE and command-line Python presumably do not.

Answer 1:

I would look at that line and pick it apart. You have:

9 << (1 << 26)

(1 << 26) is evaluated first, and it produces a fairly large number: shifting 1 left by 26 places gives 2 ** 26, i.e. 67,108,864. That part is not the problem, however. You then shift 9 left by 2 ** 26 places, which produces a number around 20 million decimal digits long (roughly 8 MiB of raw bits). Be careful in the future: shifts by what seem to be small amounts grow very fast, and if the shift were much larger your program might not run at all. Mathematically, your expression evaluates to 9 * 2 ** (2 ** 26), if you were curious.
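For the record, the digit count can be computed without ever materializing the number, using logarithms (a quick sketch):

```python
import math

# Number of decimal digits of 9 * 2**(2**26), from logarithms alone:
# digits = floor(log10(9) + 2**26 * log10(2)) + 1
digits = math.floor(math.log10(9) + (1 << 26) * math.log10(2)) + 1
print(digits)  # about 20.2 million decimal digits
```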

The behavior discussed in the comments most likely comes down to how this huge chunk of memory is handled by Python itself under the hood, not by IDLE.

EDIT 1:

I think what is happening is that a constant expression is evaluated to its answer at compile time, even when it sits inside a function that has not been called yet, as long as the expression is self-sufficient. If a variable appears in the expression, it is left untouched in the bytecode and not evaluated until actual execution. When the function body is compiled, the value is actually computed, which would account for the slower times. I am not certain of this, but I strongly suspect it is the root cause. Even if it is not, you have to admit that 9<<(1<<26) kicks the computer in the behind; there is not much optimization to be done there.

In[73]: def create_number():
            return 9<<(1<<26)
In[74]: #Note that this seems instantaneous, but try calling the function!
In[75]: %timeit create_number()
#Python environment crashes because task is too hard
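The claim that self-contained constant expressions are evaluated at compile time is easy to check on a small example (small enough that every CPython version folds it; note that recent CPython versions deliberately refuse to fold shifts whose result would be enormous):

```python
import dis

def folded():
    return 9 << 4        # both operands are literals: folded at compile time

def not_folded(x):
    return 9 << x        # depends on a variable: left as real bytecode

# The folded result, 144, is baked into the code object as a constant.
print(144 in folded.__code__.co_consts)       # True
print(144 in not_folded.__code__.co_consts)   # False
dis.dis(folded)  # the disassembly just loads/returns the constant 144
```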

This kind of testing is slightly deceptive, however. When trying this with the regular timeit, I got:

In[3]: from timeit import timeit
In[4]: timeit(setup = 'from __main__ import create_number', stmt = 'create_number()', number = 1)
Out[4]: 0.004942887388800443

Also keep in mind that printing the value is not doable (converting roughly 20 million digits to decimal takes far longer than the shift itself), so something like:

In[102]: 9<<(1<<26)

should not even be attempted.
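A safer way to inspect such a number is bit_length(), which just reads the integer's size, whereas printing forces a binary-to-decimal conversion that is quadratic in the number of digits. A sketch with a smaller exponent so it finishes instantly:

```python
# Smaller stand-in for 9 << (1 << 26).
n = 9 << (1 << 20)

# bit_length() is cheap: no decimal conversion happens.
print(n.bit_length())  # 1048580 (9 contributes 4 bits, the shift adds 2**20)
```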

For even more added support:

I felt like a rebel, so I decided to see what would happen if I timeit the raw execution of the equation:

In[107]: %timeit 9<<(1<<26)
10000000 loops, best of 3: 22.8 ns per loop

In[108]: def empty(): pass
In[109]: %timeit empty()
10000000 loops, best of 3: 96.3 ns per loop

This is really fishy, because it appears the calculation happens faster than the time it takes Python to call an empty function, which cannot literally be true. The expression is not being recomputed on each loop; an already calculated object is being retrieved from memory and reused instead of redoing the shift. Anyway, nice question.
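One way to see the "already calculated object" effect is to compare loading a precomputed big integer against redoing the shift on every call. A sketch using a smaller exponent (recent CPython refuses to constant-fold shifts with huge results, so the literal below really is re-evaluated each time):

```python
import timeit

big = 9 << (1 << 20)  # precompute once

# Loading an existing object from the supplied globals:
t_load = timeit.timeit("big", globals={"big": big}, number=10_000)

# Redoing the shift (allocates and fills ~128 KiB every call):
t_shift = timeit.timeit("9 << (1 << 20)", number=10_000)

print(t_load, t_shift)  # t_load is far smaller than t_shift
```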



Answer 2:

I am really puzzled. Here are more results with 3.4.1. Running either of the first two lines from the editor in either 3.4.1 or 3.3.5 gives the same contrast.

>>> a = 1 << 26; b = 9 << a  # fast, about 0.5 sec
>>> c = 9 << (1 << 26)  # slow, about 3 sec
>>> b == c  # fast
True
>>> exec('d=9<<(1<<26)', globals())  # fast
>>> c == d  # fast
True

The difference between normal execution and Idle's is that Idle execs code in an exec call like the one above, except that the 'globals' passed to exec is not globals() but a dict configured to look like globals(). I do not know of any 2.7-to-3.4 difference in Idle in this respect, apart from exec changing from a statement to a function. How can exec'ing an exec be faster than a single exec? How can adding an intermediate binding be faster?
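A minimal sketch of the mechanism described above — exec'ing compiled code into a plain dict that merely looks like globals(). This is a simplification, not IDLE's actual implementation:

```python
import builtins

# Compile once, then exec into a globals-like dict, roughly the way
# IDLE's run module executes user code in its subprocess.
code = compile("c = 9 << (1 << 20)", "<idle-sim>", "exec")

fake_globals = {"__name__": "__main__", "__builtins__": builtins.__dict__}
exec(code, fake_globals)

print(fake_globals["c"] == 9 << (1 << 20))  # True
```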