
Python Mixed Metaphors

July 20 2016


Python violates "The Principle of Least Astonishment" when using a mutable default value as a function parameter.

Yeah, it's a bold statement. But hear me out and then decide for yourself.

Principle of Least Astonishment

If you're not familiar with it, Wikipedia has a pretty good explanation of the Principle of Least Astonishment.

"if a necessary feature has a high astonishment factor, it may be necessary to redesign the feature." In general engineering design contexts, the principle may be taken to mean that a component of a system should behave in a manner consistent with how users of that component are likely to expect it to behave, i.e., users should not be astonished at the way it behaves.

... the principle aims to exploit users' pre-existing knowledge to minimize the learning curve, for instance by designing interfaces that borrow heavily from "functionally similar or analogous programs with which your users are likely to be familiar." User expectations in this respect may be closely related to a particular computing platform or tradition.

The Violation

In Python, when you write this code:

my_list = []

it allocates an instance of list and assigns it to the my_list variable each time that line of code is executed.

Essentially, [] is shorthand for list() and anytime you see [] in code, you would expect that a new list object would be created. And it is perfectly fair to have such an expectation.
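That expectation is easy to check. Here is a minimal sketch; the helper name make_list is made up for illustration and is not part of the original example:

```python
def make_list():
    # The [] literal sits in the function body, so it is evaluated on every call.
    return []

first = make_list()
second = make_list()
first.append('Vijay')

print(first is second)  # False: two distinct list objects
print(second)           # []: appending to first did not touch second
```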

But that's not the behavior when you use [] as the default value for a function parameter.

def list_append(value, my_list=[]):
    my_list.append(value)
    return my_list

So, if you wrote the following code to invoke list_append:

first_name_list = list_append('Vijay')
print(first_name_list)
last_name_list = list_append('Varadan')
print(last_name_list)

the output you'd see is:

['Vijay']
['Vijay', 'Varadan']

and not:

['Vijay']
['Varadan']

This result would definitely surprise me. I would expect to see 2 distinct lists with Vijay in first_name_list and Varadan in last_name_list.
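An identity check makes the sharing explicit. This sketch repeats the list_append definition from above so that it runs on its own:

```python
def list_append(value, my_list=[]):
    my_list.append(value)
    return my_list

first_name_list = list_append('Vijay')
last_name_list = list_append('Varadan')

# Both names refer to the single list object stored as the default value.
print(first_name_list is last_name_list)  # True
print(first_name_list)                    # ['Vijay', 'Varadan']
```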

Note that the example I used yields even more confusing results when you first invoke it with a pre-allocated list instance and subsequently with just one parameter.

first_names = ['Jeff']
print(list_append('Vijay', first_names)) # output is ['Jeff', 'Vijay']

last_names = list_append('Lu')
print(list_append('Varadan', last_names)) # output is ['Lu', 'Varadan']

phone_numbers = list_append('+1-206-548-6565')
print(phone_numbers) # output is ['Lu', 'Varadan', '+1-206-548-6565']

As you can see, this would be very, very confusing, especially since the 2nd call to list_append appeared to work just fine and to hand back a fresh list when you didn't pass one in. The 3rd invocation doesn't create a new list for the phone number; instead it appends the number to the same list that the earlier one-parameter call used as the default value for the second parameter.

Explanation of the Behavior

When you come across the code fragments above, you'd expect that Python would allocate a new instance of my_list each time you called the list_append function with just one parameter. And your expectations would be dashed on the rocks of Python's inconsistency.

Why? Because Python allocates the default list only once, when the def statement itself is executed, and not on each call. Every invocation of list_append that omits my_list uses that same list instance.

This behavior is explained best by Jim Dennis' comment for the answer about "mutable default arguments", which I'll quote here:

As the def statement is executed its arguments are evaluated by the interpreter. This creates (or rebinds) a name to a code object (the suite of the function). However, the default arguments are instantiated as objects at the time of definition. This is true of any time of defaulted object, but only significant (exposing visible semantics) when the object is mutable. There's no way of re-binding that default argument name in the function's closure although it can obviously be over-ridden for any call or the whole function can be re-defined)
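The point about defaults being instantiated at definition time can be checked with a default expression that has a visible side effect. This is a sketch only; make_default and evaluations are hypothetical names introduced for the demonstration:

```python
evaluations = []

def make_default():
    # Hypothetical helper: records every evaluation of the default expression.
    evaluations.append('evaluated')
    return []

def list_append(value, my_list=make_default()):
    my_list.append(value)
    return my_list

print(len(evaluations))  # 1: the default was evaluated when def ran, before any call

list_append('Vijay')
list_append('Varadan')
print(len(evaluations))  # 1: calling the function did not evaluate it again
```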

To clarify further, using Robert Rossney's comment as reference: default arguments to a function are stored in a tuple, and this tuple is an attribute of the function (__defaults__ in Python 3). Tuples are immutable, so the tuple keeps referring to the same list object for as long as the function exists, while the list itself, being mutable, can still change. So the same list instance is reused in every invocation that relies on the default.
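The tuple can be inspected directly. The sketch below assumes Python 3, where the attribute is named __defaults__, and repeats the list_append definition so that it is self-contained:

```python
def list_append(value, my_list=[]):
    my_list.append(value)
    return my_list

print(list_append.__defaults__)  # ([],)

returned = list_append('Vijay')
print(list_append.__defaults__)  # (['Vijay'],)

# The tuple's only slot still refers to the very list the call returned.
print(list_append.__defaults__[0] is returned)  # True
```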

The Fix

In order to get the right behavior, you have to set the default value of my_list to a sentinel value, check for the sentinel in the body of the function and, if the check succeeds, allocate a new list and assign it to my_list. The code would look like this:

def list_append_fixed(value, my_list=None):
    # note that you must explicitly compare against None and
    # can't check for "not my_list" in the if clause because the
    # "not my_list" check will evaluate to True when an empty list
    # is passed in, which does not equate to the sentinel check
    if my_list is None:
        my_list = []
    my_list.append(value)
    return my_list
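Rerunning the earlier calls against the fixed function gives the results I originally expected. The definition is repeated here so that the sketch runs on its own:

```python
def list_append_fixed(value, my_list=None):
    if my_list is None:
        my_list = []  # a fresh list for every call that omits my_list
    my_list.append(value)
    return my_list

first_name_list = list_append_fixed('Vijay')
last_name_list = list_append_fixed('Varadan')
print(first_name_list)  # ['Vijay']
print(last_name_list)   # ['Varadan']

# A caller-supplied list is still appended to in place.
first_names = ['Jeff']
list_append_fixed('Vijay', first_names)
print(first_names)      # ['Jeff', 'Vijay']
```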

Some Thoughts

The new code fragment defining list_append_fixed feels unnatural.

The behavior for allocation in case of default parameters for functions should be the same as for any other code that is executed and should not appear as though it's special-cased for first time function execution versus subsequent executions of the same function.

Additional Thoughts (rant?)

There exists a Python idiom to check that a variable is not None (see PEP-8) and yes, it applies to default parameter values. But when checking collections (like lists or dictionaries), the recommended idiom, which covers both the None value as well as the empty collection cases, is:

if not my_list:
    # ...
    # ...

The recommended idiom for checking a collection can't be used in this scenario. Why? Well, if an empty list is passed in, the if not my_list check would pass and a new list would be needlessly allocated. The only correct check is to make an explicit check against None.
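To illustrate what goes wrong, here is a sketch of a variant that uses the truthiness check. The name list_append_falsy_check is hypothetical and exists only for this comparison:

```python
def list_append_falsy_check(value, my_list=None):
    # Hypothetical variant: "not my_list" is also True for an empty list,
    # so a caller-supplied empty list gets replaced by a new one.
    if not my_list:
        my_list = []
    my_list.append(value)
    return my_list

names = []
result = list_append_falsy_check('Vijay', names)

print(result)           # ['Vijay']
print(names)            # []: the caller's list was never appended to
print(result is names)  # False
```

Beyond the wasted allocation, the caller's empty list silently stays empty, which is arguably the bigger problem.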

It feels like Python mixed up its metaphors.


Last modified on Feb 19, 2017, 9:12:34 PM