Using exponentiation **0.5 less efficient than mat

2020-03-01 10:14发布

问题:

A quote from "Python Programming: An Introduction to Computer Science"

We could have taken the square root using exponentiation **. Using math.sqrt is somewhat more efficient.

"Somewhat", but to what extent, and how?

回答1:

Theoretically, hammar's answer and duffymo's answer are good guesses. But in practice, on my machine, it's not more efficient:

>>> import timeit
>>> timeit.timeit(stmt='[n ** 0.5 for n in range(100)]', setup='import math', number=10000)
0.15518403053283691
>>> timeit.timeit(stmt='[math.sqrt(n) for n in range(100)]', setup='import math', number=10000)
0.17707490921020508

Part of the problem is the . operation. If you import sqrt directly into the namespace, you get a slight improvement.

>>> timeit.timeit(stmt='[sqrt(n) for n in range(100)]', setup='from math import sqrt', number=10000)
0.15312695503234863

Key word there: slight.

Further testing indicates that as the number gets larger, the benefit you get from using sqrt increases. But still not by a lot!

>>> timeit.timeit(stmt='[n ** 0.5 for n in range(1000000)]', setup='import math', number=1)
0.18888211250305176
>>> timeit.timeit(stmt='[math.sqrt(n) for n in range(1000000)]', setup='import math', number=1)
0.18425297737121582
>>> timeit.timeit(stmt='[sqrt(n) for n in range(1000000)]', setup='from math import sqrt', number=1)
0.1571958065032959


回答2:

No need to guess the implementation, we can read the code!

math.sqrt is a thin wrapper about sqrt from the standard C library: see mathmodule.c, line 956

The ** operator has multiple implementations depending on the types of the arguments, but in the case of a floating-point exponent, it eventually dispatches to pow from the standard C library (see floatobject.c line 783).

Modern CPUs often have special square root instructions which general exponentiation routines don't use (compare and contrast the implementations of pow and sqrt in glibc for x86-64, for example). But once all the interpreter overhead is added (byte codes, type checking, method dispatch etc), the difference in raw speed doesn't matter all that much, and can be dominated by issues like whether you call sqrt directly or look it up via the math module (as shown by the timings in other answers).



回答3:

** has to support any exponent while math.sqrt knows it's always 0.5. math.sqrt can therefore use a more specialized (and therefore probably more efficient) algorithm.



回答4:

My guess is that math.sqrt uses Newton's method, which converges quadratically, and exponentiation uses something else that is slower.



回答5:

Here's a slightly different approach. We want an int just bigger than the square root. Two ways (which disagree for square numbers but that's OK):

>>>timeit.timeit(stmt='[int(n**0.5)+1 for n in range(1000000)]', setup='', number=1)  
0.481772899628  
>>>timeit.timeit(stmt='[ceil(sqrt(n)) for n in range(1000000)]', setup='from math import sqrt, ceil', number=1)  
0.293844938278  
>>>timeit.timeit(stmt='[int(ceil(sqrt(n))) for n in range(1000000)]', setup='from math import sqrt, ceil', number=1)  
0.511347055435

So the math functions are faster...until you convert the float to int. (I need to do a lot of comparisons with the value, and while I haven't tested it, comparing integers should be cheaper than comparing floats.)

But hey, it's Python. You're on top of too many abstractions to try to optimize performance with this level of granularity.