Anyone tinkering with Python long enough has been bitten (or torn to pieces) by the following issue:
def foo(a=[]):
a.append(5)
return a
Python novices would expect this function to always return a list with only one element: [5]
. The result is instead very different, and very astonishing (for a novice):
>>> foo()
[5]
>>> foo()
[5, 5]
>>> foo()
[5, 5, 5]
>>> foo()
[5, 5, 5, 5]
>>> foo()
A manager of mine once had his first encounter with this feature, and called it "a dramatic design flaw" of the language. I replied that the behavior had an underlying explanation, and it is indeed very puzzling and unexpected if you don't understand the internals. However, I was not able to answer (to myself) the following question: what is the reason for binding the default argument at function definition, and not at function execution? I doubt the experienced behavior has a practical use (who really used static variables in C, without breeding bugs?)
Edit:
Baczek made an interesting example. Together with most of your comments and Utaal's in particular, I elaborated further:
>>> def a():
... print("a executed")
... return []
...
>>>
>>> def b(x=a()):
... x.append(5)
... print(x)
...
a executed
>>> b()
[5]
>>> b()
[5, 5]
To me, it seems that the design decision was relative to where to put the scope of parameters: inside the function or "together" with it?
Doing the binding inside the function would mean that x
is effectively bound to the specified default when the function is called, not defined, something that would present a deep flaw: the def
line would be "hybrid" in the sense that part of the binding (of the function object) would happen at definition, and part (assignment of default parameters) at function invocation time.
The actual behavior is more consistent: everything of that line gets evaluated when that line is executed, meaning at function definition.
This actually has nothing to do with default values, other than that it often comes up as an unexpected behaviour when you write functions with mutable default values.
No default values in sight in this code, but you get exactly the same problem.
The problem is that
foo
is modifying a mutable variable passed in from the caller, when the caller doesn't expect this. Code like this would be fine if the function was called something likeappend_5
; then the caller would be calling the function in order to modify the value they pass in, and the behaviour would be expected. But such a function would be very unlikely to take a default argument, and probably wouldn't return the list (since the caller already has a reference to that list; the one it just passed in).Your original
foo
, with a default argument, shouldn't be modifyinga
whether it was explicitly passed in or got the default value. Your code should leave mutable arguments alone unless it is clear from the context/name/documentation that the arguments are supposed to be modified. Using mutable values passed in as arguments as local temporaries is an extremely bad idea, whether we're in Python or not and whether there are default arguments involved or not.If you need to destructively manipulate a local temporary in the course of computing something, and you need to start your manipulation from an argument value, you need to make a copy.
I am going to demonstrate an alternative structure to pass a default list value to a function (it works equally well with dictionaries).
As others have extensively commented, the list parameter is bound to the function when it is defined as opposed to when it is executed. Because lists and dictionaries are mutable, any alteration to this parameter will affect other calls to this function. As a result, subsequent calls to the function will receive this shared list which may have been altered by any other calls to the function. Worse yet, two parameters are using this function's shared parameter at the same time oblivious to the changes made by the other.
Wrong Method (probably...):
You can verify that they are one and the same object by using
id
:Per Brett Slatkin's "Effective Python: 59 Specific Ways to Write Better Python", Item 20: Use
None
and Docstrings to specify dynamic default arguments (p. 48)This implementation ensures that each call to the function either receives the default list or else the list passed to the function.
Preferred Method:
There may be legitimate use cases for the 'Wrong Method' whereby the programmer intended the default list parameter to be shared, but this is more likely the exception than the rule.
Just change the function to be:
The solutions here are:
None
as your default value (or a nonceobject
), and switch on that to create your values at runtime; orlambda
as your default parameter, and call it within a try block to get the default value (this is the sort of thing that lambda abstraction is for).The second option is nice because users of the function can pass in a callable, which may be already existing (such as a
type
)When we do this:
... we assign the argument
a
to an unnamed list, if the caller does not pass the value of a.To make things simpler for this discussion, let's temporarily give the unnamed list a name. How about
pavlo
?At any time, if the caller doesn't tell us what
a
is, we reusepavlo
.If
pavlo
is mutable (modifiable), andfoo
ends up modifying it, an effect we notice the next timefoo
is called without specifyinga
.So this is what you see (Remember,
pavlo
is initialized to []):Now,
pavlo
is [5].Calling
foo()
again modifiespavlo
again:Specifying
a
when callingfoo()
ensurespavlo
is not touched.So,
pavlo
is still[5, 5]
.It may be true that:
it is entirely consistent to hold to both of the features above and still make another point:
The other answers, or at least some of them either make points 1 and 2 but not 3, or make point 3 and downplay points 1 and 2. But all three are true.
It may be true that switching horses in midstream here would be asking for significant breakage, and that there could be more problems created by changing Python to intuitively handle Stefano's opening snippet. And it may be true that someone who knew Python internals well could explain a minefield of consequences. However,
The existing behavior is not Pythonic, and Python is successful because very little about the language violates the principle of least astonishment anywhere near this badly. It is a real problem, whether or not it would be wise to uproot it. It is a design flaw. If you understand the language much better by trying to trace out the behavior, I can say that C++ does all of this and more; you learn a lot by navigating, for instance, subtle pointer errors. But this is not Pythonic: people who care about Python enough to persevere in the face of this behavior are people who are drawn to the language because Python has far fewer surprises than other language. Dabblers and the curious become Pythonistas when they are astonished at how little time it takes to get something working--not because of a design fl--I mean, hidden logic puzzle--that cuts against the intuitions of programmers who are drawn to Python because it Just Works.