I found that creating a class is way slower than instantiating a class.
>>> from timeit import Timer as T
>>> def calc(n):
...     return T("class Haha(object): pass").timeit(n)
<<After several of these 'calc' runs, at least one of them was given a big number, e.g. 100000>>
>>> calc(9000)
15.947055101394653
>>> calc(9000)
17.39099097251892
>>> calc(9000)
18.824054956436157
>>> calc(9000)
20.33335590362549
Yeah, creating 9000 classes took 16 secs, and it becomes even slower in subsequent calls.
And this:
>>> T("type('Haha', b, d)", "b = (object, ); d = {}").timeit(9000)
gives similar results.
But instantiation doesn't suffer:
>>> T("Haha()", "class Haha(object): pass").timeit(5000000)
0.8786070346832275
5000000 instances in less than one sec.
What makes class creation this expensive?
And why does the creation process become slower?
EDIT:
How to reproduce:
Start a fresh Python process; the initial several calc(10000) calls give a number around 0.5 on my machine. Then try a bigger value, calc(100000): it doesn't finish even in 10 secs. Interrupt it, and calc(10000) now gives about 15 secs.
EDIT:
Additional fact:
If you gc.collect() after 'calc' becomes slow, you get the 'normal' speed back at first, but the timing increases again in subsequent calls:
>>> from a import calc
>>> calc(10000)
0.4673938751220703
>>> calc(10000)
0.4300072193145752
>>> calc(10000)
0.4270968437194824
>>> calc(10000)
0.42754602432250977
>>> calc(10000)
0.4344758987426758
>>> calc(100000)
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "a.py", line 3, in calc
    return T("class Haha(object): pass").timeit(n)
  File "/usr/lib/python2.7/timeit.py", line 194, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
KeyboardInterrupt
>>> import gc
>>> gc.collect()
234204
>>> calc(10000)
0.4237039089202881
>>> calc(10000)
1.5998330116271973
>>> calc(10000)
4.136359930038452
>>> calc(10000)
6.625348806381226
Ahahaha! Gotcha!
Was this perchance done on a Python version without this patch? (HINT: IT WAS)
Check the line numbers if you want proof.
Marcin was right: when the results look screwy, you've probably got a screwy benchmark. Run gc.disable() and the results reproduce themselves. It just shows that when you disable garbage collection you get garbage results!
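For instance, here is a minimal reproduction sketch (Python 2 to match the question; exact numbers will differ per machine) that triggers the same slowdown by disabling garbage collection by hand instead of via an interrupted timeit run:

import gc
from timeit import Timer as T

gc.disable()  # simulate the state the interrupted timeit run left behind
for _ in range(4):
    # dead classes from each run pile up, so every repetition gets slower
    print(T("class Haha(object): pass").timeit(9000))
gc.enable()   # restore normal collection afterwards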
To be more clear, the reason running the long benchmark broke things is that:
1. timeit disables garbage collection, so overly large benchmarks take much (exponentially) longer;
2. timeit wasn't restoring garbage collection on exceptions;
3. you quit the long-running process with an asynchronous exception, which left garbage collection switched off.
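The flip side, and the workaround the timeit documentation itself suggests: if you want collection to run during the timed loop, enable it in the setup string. A sketch (timings are machine-dependent):

from timeit import Timer as T

# The setup statement runs inside timeit's harness, after timeit has
# disabled collection, so this switches gc back on for the timed loop
# and dead classes get collected as the loop goes.
print(T("class Haha(object): pass", "import gc; gc.enable()").timeit(10000))

With collection enabled, repeated runs stay at roughly the same speed instead of creeping upward.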
This might give you the intuition:
>>> class Haha(object): pass
...
>>> sys.getsizeof(Haha)
904
>>> sys.getsizeof(Haha())
64
A class object is a much more complex and expensive structure than an instance of that class.
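A rough sketch of what the class object drags around that a fresh instance does not (only standard attributes are used here):

class Haha(object):
    pass

print(Haha.__mro__)           # method resolution order tuple; note it contains Haha itself
print(sorted(Haha.__dict__))  # a per-class namespace dict built by every class statement
print(Haha().__dict__)        # a new instance starts out with an empty dict

The self-reference through __mro__ also means a dead class sits in a reference cycle, so it is reclaimed by the cyclic collector rather than by plain reference counting, which ties back to the gc behaviour in the accepted answer.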
A quick dis of the following functions:
def a():
    class Haha(object):
        pass

def b():
    Haha()
gives:
  2           0 LOAD_CONST               1 ('Haha')
              3 LOAD_GLOBAL              0 (object)
              6 BUILD_TUPLE              1
              9 LOAD_CONST               2 (<code object Haha at 0x7ff3e468bab0, file "<stdin>", line 2>)
             12 MAKE_FUNCTION            0
             15 CALL_FUNCTION            0
             18 BUILD_CLASS
             19 STORE_FAST               0 (Haha)
             22 LOAD_CONST               0 (None)
             25 RETURN_VALUE
and
  2           0 LOAD_GLOBAL              0 (Haha)
              3 CALL_FUNCTION            0
              6 POP_TOP
              7 LOAD_CONST               0 (None)
             10 RETURN_VALUE
respectively.
By the looks of it, the interpreter simply does more stuff when creating a class: it has to build the class object, bind it in the enclosing scope, and so on, while Haha() just calls the class like a function.
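If you want to reproduce the disassembly yourself, a minimal sketch (the listings above are CPython 2.x bytecode; Python 3 compiles the class statement differently, using LOAD_BUILD_CLASS):

import dis

def a():
    class Haha(object):
        pass

def b():
    Haha()  # assumes some Haha is defined by the time b() actually runs

dis.dis(a)  # the class-creation bytecode
dis.dis(b)  # the plain-call bytecode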
As you noticed, running garbage collection when it gets too slow speeds things up again, so Marcin is right in saying that it's probably a memory fragmentation issue.
It isn't: Only your contrived tests show slow class creation. In fact, as @Veedrac shows in his answer, this result is an artifact of timeit disabling garbage collection.
Downvoters: Show me a non-contrived example where class creation is slow.
In any case, your timings are affected by the load on your system at the time. They are really only useful for comparisons performed at pretty much the same time. I get about 0.5s for 9000 class creations. In fact, it's about 0.3s on ideone, even when performed repeatedly: http://ideone.com/Du859. There isn't even an upward trend.
So, in summary, it is much slower on your computer than others, and there is no upwards trend on other computers for repeated tests (as per your original claim). Testing massive numbers of instantiations does show slowing down, presumably because the process consumes a lot of memory. You have shown that allocating a huge amount of memory slows a process down. Well done.
That ideone code in full:
from timeit import Timer as T

def calc(n):
    return T("class Haha(object): pass").timeit(n)

for i in xrange(30):
    print calc(9000)