“Least Astonishment” and the Mutable Default Argum

Anyone tinkering with Python long enough has been bitten (or torn to pieces) by the following issue:

def foo(a=[]):
    a.append(5)
    return a

Python novices would expect this function to always return a list with only one element: [5]. The result is instead very different, and very astonishing (for a novice):

>>> foo()
[5]
>>> foo()
[5, 5]
>>> foo()
[5, 5, 5]
>>> foo()
[5, 5, 5, 5]
>>> foo()

A manager of mine once had his first encounter with this feature, and called it "a dramatic design flaw" of the language. I replied that the behavior had an underlying explanation, and it is indeed very puzzling and unexpected if you don't understand the internals. However, I was not able to answer (to myself) the following question: what is the reason for binding the default argument at function definition, and not at function execution? I doubt the experienced behavior has a practical use (who really used static variables in C, without breeding bugs?)

Edit:

Baczek made an interesting example. Together with most of your comments and Utaal's in particular, I elaborated further:

>>> def a():
...     print("a executed")
...     return []
... 
>>>            
>>> def b(x=a()):
...     x.append(5)
...     print(x)
... 
a executed
>>> b()
[5]
>>> b()
[5, 5]

To me, it seems that the design decision was relative to where to put the scope of parameters: inside the function or "together" with it?

Doing the binding inside the function would mean that x is effectively bound to the specified default when the function is called, not defined, something that would present a deep flaw: the def line would be "hybrid" in the sense that part of the binding (of the function object) would happen at definition, and part (assignment of default parameters) at function invocation time.

The actual behavior is more consistent: everything of that line gets evaluated when that line is executed, meaning at function definition.

标签： python language-design default-parameters least-astonishment

30条回答

ら.Afraid

2楼-- · 2019-09-11 04:52

This "bug" gave me a lot of overtime work hours! But I'm beginning to see a potential use of it (but I would have liked it to be at the execution time, still)

I'm gonna give you what I see as a useful example.

def example(errors=[]):
    # statements
    # Something went wrong
    mistake = True
    if mistake:
        tryToFixIt(errors)
        # Didn't work.. let's try again
        tryToFixItAnotherway(errors)
        # This time it worked
    return errors

def tryToFixIt(err):
    err.append('Attempt to fix it')

def tryToFixItAnotherway(err):
    err.append('Attempt to fix it by another way')

def main():
    for item in range(2):
        errors = example()
    print '\n'.join(errors)

main()

prints the following

Attempt to fix it
Attempt to fix it by another way
Attempt to fix it
Attempt to fix it by another way

0人赞添加讨论(0) 举报

SAY GOODBYE

3楼-- · 2019-09-11 04:56

I sometimes exploit this behavior as an alternative to the following pattern:

singleton = None

def use_singleton():
    global singleton

    if singleton is None:
        singleton = _make_singleton()

    return singleton.use_me()

If singleton is only used by use_singleton, I like the following pattern as a replacement:

# _make_singleton() is called only once when the def is executed
def use_singleton(singleton=_make_singleton()):
    return singleton.use_me()

I've used this for instantiating client classes that access external resources, and also for creating dicts or lists for memoization.

Since I don't think this pattern is well known, I do put a short comment in to guard against future misunderstandings.

0人赞添加讨论(0) 举报

放荡不羁爱自由

4楼-- · 2019-09-11 04:58

Suppose you have the following code

fruits = ("apples", "bananas", "loganberries")

def eat(food=fruits):
    ...

When I see the declaration of eat, the least astonishing thing is to think that if the first parameter is not given, that it will be equal to the tuple ("apples", "bananas", "loganberries")

However, supposed later on in the code, I do something like

def some_random_function():
    global fruits
    fruits = ("blueberries", "mangos")

then if default parameters were bound at function execution rather than function declaration then I would be astonished (in a very bad way) to discover that fruits had been changed. This would be more astonishing IMO than discovering that your foo function above was mutating the list.

The real problem lies with mutable variables, and all languages have this problem to some extent. Here's a question: suppose in Java I have the following code:

StringBuffer s = new StringBuffer("Hello World!");
Map<StringBuffer,Integer> counts = new HashMap<StringBuffer,Integer>();
counts.put(s, 5);
s.append("!!!!");
System.out.println( counts.get(s) );  // does this work?

Now, does my map use the value of the StringBuffer key when it was placed into the map, or does it store the key by reference? Either way, someone is astonished; either the person who tried to get the object out of the Map using a value identical to the one they put it in with, or the person who can't seem to retrieve their object even though the key they're using is literally the same object that was used to put it into the map (this is actually why Python doesn't allow its mutable built-in data types to be used as dictionary keys).

Your example is a good one of a case where Python newcomers will be surprised and bitten. But I'd argue that if we "fixed" this, then that would only create a different situation where they'd be bitten instead, and that one would be even less intuitive. Moreover, this is always the case when dealing with mutable variables; you always run into cases where someone could intuitively expect one or the opposite behavior depending on what code they're writing.

I personally like Python's current approach: default function arguments are evaluated when the function is defined and that object is always the default. I suppose they could special-case using an empty list, but that kind of special casing would cause even more astonishment, not to mention be backwards incompatible.

0人赞添加讨论(0) 举报

叼着烟拽天下

5楼-- · 2019-09-11 04:59

Well, the reason is quite simply that bindings are done when code is executed, and the function definition is executed, well... when the functions is defined.

Compare this:

class BananaBunch:
    bananas = []

    def addBanana(self, banana):
        self.bananas.append(banana)

This code suffers from the exact same unexpected happenstance. bananas is a class attribute, and hence, when you add things to it, it's added to all instances of that class. The reason is exactly the same.

It's just "How It Works", and making it work differently in the function case would probably be complicated, and in the class case likely impossible, or at least slow down object instantiation a lot, as you would have to keep the class code around and execute it when objects are created.

Yes, it is unexpected. But once the penny drops, it fits in perfectly with how Python works in general. In fact, it's a good teaching aid, and once you understand why this happens, you'll grok python much better.

That said it should feature prominently in any good Python tutorial. Because as you mention, everyone runs into this problem sooner or later.

0人赞添加讨论(0) 举报

Explosion°爆炸

6楼-- · 2019-09-11 04:59

The solutions here are:

Use None as your default value (or a nonce object), and switch on that to create your values at runtime; or
Use a lambda as your default parameter, and call it within a try block to get the default value (this is the sort of thing that lambda abstraction is for).

The second option is nice because users of the function can pass in a callable, which may be already existing (such as a type)

0人赞添加讨论(0) 举报

在下西门庆

7楼-- · 2019-09-11 05:01

This is not a design flaw. Anyone who trips over this is doing something wrong.

There are 3 cases I see where you might run into this problem:

You intend to modify the argument as a side effect of the function. In this case it never makes sense to have a default argument. The only exception is when you're abusing the argument list to have function attributes, e.g. cache={}, and you wouldn't be expected to call the function with an actual argument at all.
You intend to leave the argument unmodified, but you accidentally did modify it. That's a bug, fix it.
You intend to modify the argument for use inside the function, but didn't expect the modification to be viewable outside of the function. In that case you need to make a copy of the argument, whether it was the default or not! Python is not a call-by-value language so it doesn't make the copy for you, you need to be explicit about it.

The example in the question could fall into category 1 or 3. It's odd that it both modifies the passed list and returns it; you should pick one or the other.

0人赞添加讨论(0) 举报

1 2 3 4 5 下一页

“Least Astonishment” and the Mutable Default Argum

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间