“Least Astonishment” and the Mutable Default Argum

2020-01-22 10:09发布

Anyone tinkering with Python long enough has been bitten (or torn to pieces) by the following issue:

def foo(a=[]):
    a.append(5)
    return a

Python novices would expect this function to always return a list with only one element: [5]. The result is instead very different, and very astonishing (for a novice):

>>> foo()
[5]
>>> foo()
[5, 5]
>>> foo()
[5, 5, 5]
>>> foo()
[5, 5, 5, 5]
>>> foo()

A manager of mine once had his first encounter with this feature, and called it "a dramatic design flaw" of the language. I replied that the behavior had an underlying explanation, and it is indeed very puzzling and unexpected if you don't understand the internals. However, I was not able to answer (to myself) the following question: what is the reason for binding the default argument at function definition, and not at function execution? I doubt the experienced behavior has a practical use (who really used static variables in C, without breeding bugs?)

Edit:

Baczek made an interesting example. Together with most of your comments and Utaal's in particular, I elaborated further:

>>> def a():
...     print("a executed")
...     return []
... 
>>>            
>>> def b(x=a()):
...     x.append(5)
...     print(x)
... 
a executed
>>> b()
[5]
>>> b()
[5, 5]

To me, it seems that the design decision was relative to where to put the scope of parameters: inside the function or "together" with it?

Doing the binding inside the function would mean that x is effectively bound to the specified default when the function is called, not defined, something that would present a deep flaw: the def line would be "hybrid" in the sense that part of the binding (of the function object) would happen at definition, and part (assignment of default parameters) at function invocation time.

The actual behavior is more consistent: everything of that line gets evaluated when that line is executed, meaning at function definition.

30条回答
你好瞎i
2楼-- · 2020-01-22 10:33

Already busy topic, but from what I read here, the following helped me realizing how it's working internally:

def bar(a=[]):
     print id(a)
     a = a + [1]
     print id(a)
     return a

>>> bar()
4484370232
4484524224
[1]
>>> bar()
4484370232
4484524152
[1]
>>> bar()
4484370232 # Never change, this is 'class property' of the function
4484523720 # Always a new object 
[1]
>>> id(bar.func_defaults[0])
4484370232
查看更多
欢心
3楼-- · 2020-01-22 10:34

1) The so-called problem of "Mutable Default Argument" is in general a special example demonstrating that:
"All functions with this problem suffer also from similar side effect problem on the actual parameter,"
That is against the rules of functional programming, usually undesiderable and should be fixed both together.

Example:

def foo(a=[]):                 # the same problematic function
    a.append(5)
    return a

>>> somevar = [1, 2]           # an example without a default parameter
>>> foo(somevar)
[1, 2, 5]
>>> somevar
[1, 2, 5]                      # usually expected [1, 2]

Solution: a copy
An absolutely safe solution is to copy or deepcopy the input object first and then to do whatever with the copy.

def foo(a=[]):
    a = a[:]     # a copy
    a.append(5)
    return a     # or everything safe by one line: "return a + [5]"

Many builtin mutable types have a copy method like some_dict.copy() or some_set.copy() or can be copied easy like somelist[:] or list(some_list). Every object can be also copied by copy.copy(any_object) or more thorough by copy.deepcopy() (the latter useful if the mutable object is composed from mutable objects). Some objects are fundamentally based on side effects like "file" object and can not be meaningfully reproduced by copy. copying

Example problem for a similar SO question

class Test(object):            # the original problematic class
  def __init__(self, var1=[]):
    self._var1 = var1

somevar = [1, 2]               # an example without a default parameter
t1 = Test(somevar)
t2 = Test(somevar)
t1._var1.append([1])
print somevar                  # [1, 2, [1]] but usually expected [1, 2]
print t2._var1                 # [1, 2, [1]] but usually expected [1, 2]

It shouldn't be neither saved in any public attribute of an instance returned by this function. (Assuming that private attributes of instance should not be modified from outside of this class or subclasses by convention. i.e. _var1 is a private attribute )

Conclusion:
Input parameters objects shouldn't be modified in place (mutated) nor they should not be binded into an object returned by the function. (If we prefere programming without side effects which is strongly recommended. see Wiki about "side effect" (The first two paragraphs are relevent in this context.) .)

2)
Only if the side effect on the actual parameter is required but unwanted on the default parameter then the useful solution is def ...(var1=None): if var1 is None: var1 = [] More..

3) In some cases is the mutable behavior of default parameters useful.

查看更多
萌系小妹纸
4楼-- · 2020-01-22 10:34

You can get round this by replacing the object (and therefore the tie with the scope):

def foo(a=[]):
    a = list(a)
    a.append(5)
    return a

Ugly, but it works.

查看更多
Viruses.
5楼-- · 2020-01-22 10:37

I sometimes exploit this behavior as an alternative to the following pattern:

singleton = None

def use_singleton():
    global singleton

    if singleton is None:
        singleton = _make_singleton()

    return singleton.use_me()

If singleton is only used by use_singleton, I like the following pattern as a replacement:

# _make_singleton() is called only once when the def is executed
def use_singleton(singleton=_make_singleton()):
    return singleton.use_me()

I've used this for instantiating client classes that access external resources, and also for creating dicts or lists for memoization.

Since I don't think this pattern is well known, I do put a short comment in to guard against future misunderstandings.

查看更多
何必那么认真
6楼-- · 2020-01-22 10:39

What you're asking is why this:

def func(a=[], b = 2):
    pass

isn't internally equivalent to this:

def func(a=None, b = None):
    a_default = lambda: []
    b_default = lambda: 2
    def actual_func(a=None, b=None):
        if a is None: a = a_default()
        if b is None: b = b_default()
    return actual_func
func = func()

except for the case of explicitly calling func(None, None), which we'll ignore.

In other words, instead of evaluating default parameters, why not store each of them, and evaluate them when the function is called?

One answer is probably right there--it would effectively turn every function with default parameters into a closure. Even if it's all hidden away in the interpreter and not a full-blown closure, the data's got to be stored somewhere. It'd be slower and use more memory.

查看更多
贪生不怕死
7楼-- · 2020-01-22 10:39

It's a performance optimization. As a result of this functionality, which of these two function calls do you think is faster?

def print_tuple(some_tuple=(1,2,3)):
    print some_tuple

print_tuple()        #1
print_tuple((1,2,3)) #2

I'll give you a hint. Here's the disassembly (see http://docs.python.org/library/dis.html):

#1

0 LOAD_GLOBAL              0 (print_tuple)
3 CALL_FUNCTION            0
6 POP_TOP
7 LOAD_CONST               0 (None)
10 RETURN_VALUE

#2

 0 LOAD_GLOBAL              0 (print_tuple)
 3 LOAD_CONST               4 ((1, 2, 3))
 6 CALL_FUNCTION            1
 9 POP_TOP
10 LOAD_CONST               0 (None)
13 RETURN_VALUE

I doubt the experienced behavior has a practical use (who really used static variables in C, without breeding bugs ?)

As you can see, there is a performance benefit when using immutable default arguments. This can make a difference if it's a frequently called function or the default argument takes a long time to construct. Also, bear in mind that Python isn't C. In C you have constants that are pretty much free. In Python you don't have this benefit.

查看更多
登录 后发表回答