| 1 | = Python Mixed Metaphors = |
| 2 | July 20 2016 |
| 3 | |
| 4 | [[Image(htdocs:images/python/python-logo.png, align=center, nolink)]] |
| 5 | |
| 6 | Python violates "The Principle of Least Astonishment" when using a mutable default value as a function parameter. |
| 7 | |
| 8 | Yeah, it's a bold statement. But hear me out and then decide for yourself. |
| 9 | |
| 10 | == Principle of Least Astonishment == |
| 11 | If you're not familiar with it, Wikipedia has a pretty good explanation of the [https://en.wikipedia.org/wiki/Principle_of_least_astonishment Principle of Least Astonishment]. |
| 12 | >> "if a necessary feature has a high astonishment factor, it may be necessary to redesign the feature." In general engineering design contexts, the principle may be taken to mean that a component of a system should behave in a manner consistent with how users of that component are likely to expect it to behave, i.e., users should not be astonished at the way it behaves. |
| 13 | |
| 14 | >> ... the principle aims to exploit users' pre-existing knowledge to minimize the learning curve, for instance by designing interfaces that borrow heavily from "functionally similar or analogous programs with which your users are likely to be familiar." User expectations in this respect may be closely related to a particular computing platform or tradition. |
| 15 | |
| 16 | == The Violation == |
| 17 | In Python, when you write this code: |
| 18 | |
| 19 | {{{#!python |
| 20 | my_list = [] |
| 21 | }}} |
| 22 | |
| 23 | it allocates an instance of ```list``` and assigns it to the ```my_list``` variable each time that line of code is executed. |
| 24 | |
| 25 | Essentially, ```[]``` is shorthand for ```list()``` and anytime you see ```[]``` in code, you would expect that a new list object would be created. And it is perfectly fair to have such an expectation. |
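A quick way to check that expectation is to compare object identities. In this minimal sketch, the helper name ```make_list``` is hypothetical and exists only for illustration:

```python
def make_list():
    # The [] literal is evaluated every time this line runs,
    # so each call builds a brand-new list object.
    my_list = []
    return my_list

a = make_list()
b = make_list()
a.append('Vijay')

print(a is b)  # False: two separate list objects
print(b)       # []: appending to a did not touch b
```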
| 26 | |
| 27 | But that's not the behavior when you use ```[]``` as the default value for a function parameter. |
| 28 | |
| 29 | {{{#!python |
| 30 | def list_append(value, my_list=[]): |
| 31 |     my_list.append(value) |
| 32 |     return my_list |
| 33 | }}} |
| 34 | |
| 35 | So, if you wrote the following code to invoke ```list_append```: |
| 36 | |
| 37 | {{{#!python |
| 38 | first_name_list = list_append('Vijay') |
| 39 | print(first_name_list) |
| 40 | last_name_list = list_append('Varadan') |
| 41 | print(last_name_list) |
| 42 | }}} |
| 43 | |
| 44 | the output you'd see is: |
| 45 | |
| 46 | {{{ |
| 47 | ['Vijay'] |
| 48 | ['Vijay', 'Varadan'] |
| 49 | }}} |
| 50 | |
| 51 | and not: |
| 52 | |
| 53 | {{{ |
| 54 | ['Vijay'] |
| 55 | ['Varadan'] |
| 56 | }}} |
| 57 | |
| 58 | This result would definitely surprise me. I would expect to see 2 distinct lists with ```Vijay``` in ```first_name_list``` and ```Varadan``` in ```last_name_list```. |
| 59 | |
| 60 | Note that the example I used yields even more confusing results when you first invoke it with a pre-allocated list instance, and subsequently with just one parameter. |
| 61 | |
| 62 | {{{#!python |
| 63 | first_names = ['Jeff'] |
| 64 | print(list_append('Vijay', first_names)) # output is ['Jeff', 'Vijay'] |
| 65 | |
| 66 | last_names = list_append('Lu') |
| 67 | print(list_append('Varadan', last_names)) # output is ['Lu', 'Varadan'] |
| 68 | |
| 69 | phone_numbers = list_append('+1-206-548-6565') |
| 70 | print(phone_numbers) # output is ['Lu', 'Varadan', '+1-206-548-6565'] |
| 71 | }}} |
| 72 | |
| 73 | As you can see, this would be very, very confusing, especially since the 2nd call to ```list_append``` appeared to work just fine and seemed to create a new list when you didn't pass one in. The 3rd invocation doesn't create a new list and append to it; instead it appends to the same default list that the 2nd call used, the one that every call without a second parameter shares. |
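Comparing identities makes the sharing visible. This sketch redefines ```list_append``` so that it starts from a fresh default, then repeats the one-parameter calls from the example:

```python
def list_append(value, my_list=[]):
    my_list.append(value)
    return my_list

last_names = list_append('Lu')
list_append('Varadan', last_names)
phone_numbers = list_append('+1-206-548-6565')

# Both names refer to the single default list stored on the function.
print(phone_numbers is last_names)  # True
print(last_names)  # ['Lu', 'Varadan', '+1-206-548-6565']
```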
| 74 | |
| 75 | == Explanation of the Behavior == |
| 76 | When you come across the code fragments above, you'd expect that Python would allocate a new list for ```my_list``` each time you called the ```list_append``` function with just one parameter. And your expectations would be dashed on the rocks of Python's inconsistency. |
| 77 | |
| 78 | **Why?** Because Python allocates the list for the default value only **once**, when the ```def``` statement for ```list_append``` is executed, not each time the function is invoked. Every invocation that omits the second parameter uses that same list instance. |
| 79 | |
| 80 | This behavior is explained best by [http://stackoverflow.com/questions/101268/hidden-features-of-python#comment3187590_113198 Jim Dennis' comment] for the answer about "mutable default arguments", which I'll quote here: |
| 81 | |
| 82 | >> As the def statement is executed its arguments are evaluated by the interpreter. This creates (or rebinds) a name to a code object (the suite of the function). However, the default arguments are instantiated as objects at the time of definition. This is true of any time of defaulted object, but only significant (exposing visible semantics) when the object is mutable. There's no way of re-binding that default argument name in the function's closure although it can obviously be over-ridden for any call or the whole function can be re-defined) |
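The timing described in the quote can be observed by using a default expression with a visible side effect. This is only a sketch; ```make_default``` and ```append_to``` are hypothetical names invented for the illustration:

```python
calls = []

def make_default():
    # Record every evaluation of the default expression.
    calls.append('evaluated')
    return []

def append_to(value, target=make_default()):
    target.append(value)
    return target

# The default was already evaluated when the def statement ran.
print(len(calls))  # 1

append_to('a')
append_to('b')

# Calling the function did not evaluate the default again.
print(len(calls))  # 1
print(append_to('c'))  # ['a', 'b', 'c']
```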
| 83 | |
| 84 | To clarify further, [http://stackoverflow.com/questions/101268/hidden-features-of-python#comment1557739_113198 using Robert Rossney's comment as reference], default arguments to a function are stored in a tuple. This tuple is an attribute of the function. Tuples are immutable, so the tuple keeps referring to the list that was allocated when the function was defined, even though the list itself is mutable and its contents can change. So, the same list instance is re-used in all function invocations that rely on the default. |
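In Python 3 that tuple is exposed as the function's ```__defaults__``` attribute, so the shared list can be inspected directly. A minimal sketch, using a fresh definition of ```list_append``` so the default starts out empty:

```python
def list_append(value, my_list=[]):
    my_list.append(value)
    return my_list

print(list_append.__defaults__)  # ([],)

result = list_append('Vijay')

# The tuple still holds the same list object, which has now been mutated.
print(list_append.__defaults__)  # (['Vijay'],)
print(result is list_append.__defaults__[0])  # True

# Items of the tuple cannot be reassigned in place.
try:
    list_append.__defaults__[0] = []
except TypeError as error:
    print('TypeError:', error)
```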
| 85 | |
| 86 | == The Fix == |
| 87 | |
| 88 | In order to get the expected behavior, you have to set the default value of ```my_list``` to a sentinel value, check for the sentinel in the body of the function and, if the check succeeds, allocate a new list and assign it to ```my_list```. The code would look like this: |
| 89 | |
| 90 | {{{#!python |
| 91 | def list_append_fixed(value, my_list=None): |
| 92 |     # note that you must explicitly compare against None and |
| 93 |     # can't check for "not my_list" in the if clause because the |
| 94 |     # "not my_list" check will evaluate to True when an empty list |
| 95 |     # is passed in, which does not equate to the sentinel check |
| 96 |     if my_list is None: |
| 97 |         my_list = [] |
| 98 |     my_list.append(value) |
| 99 |     return my_list |
| 100 | }}} |
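As a quick check, repeating the earlier calls against the fixed function should now produce two independent lists. The sketch below restates ```list_append_fixed``` so that it runs on its own:

```python
def list_append_fixed(value, my_list=None):
    if my_list is None:
        my_list = []
    my_list.append(value)
    return my_list

first_name_list = list_append_fixed('Vijay')
last_name_list = list_append_fixed('Varadan')

print(first_name_list)  # ['Vijay']
print(last_name_list)   # ['Varadan']

# A list passed in explicitly is still appended to in place.
names = ['Jeff']
list_append_fixed('Lu', names)
print(names)  # ['Jeff', 'Lu']
```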
| 101 | |
| 102 | == Some Thoughts == |
| 103 | The new code fragment defining ```list_append_fixed``` feels unnatural. |
| 104 | |
| 105 | The behavior for allocation in case of default parameters for functions should be the same as for any other code that is executed and should not appear as though it's special-cased for first time function execution versus subsequent executions of the same function. |
| 106 | |
| 107 | === Additional Thoughts (rant?) === |
| 108 | There exists a Python idiom to check that a variable is not ```None``` (see [https://www.python.org/dev/peps/pep-0008/ PEP-8]) and yes, it applies to default parameter values. But when checking collections (like lists or dictionaries), the recommended idiom, which covers both the ```None``` value as well as the empty collection cases, is: |
| 109 | |
| 110 | {{{#!python |
| 111 | if not my_list: |
| 112 |     # ... |
| 113 |     # ... |
| 114 | }}} |
| 115 | |
| 116 | The recommended idiom for checking a collection can't be used in this scenario. Why? Well, if an empty list is passed in, the ```if not my_list``` check would pass, a new list would be allocated, and the value would be appended to that new list rather than to the list the caller passed in. The only correct check is an explicit comparison against ```None```. |
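To make the failure concrete, here is a sketch of a variant that uses the truthiness check. The name ```list_append_truthy``` is hypothetical and exists only to show the problem:

```python
def list_append_truthy(value, my_list=None):
    # Hypothetical variant that uses the truthiness check
    # instead of comparing against None.
    if not my_list:
        my_list = []
    my_list.append(value)
    return my_list

names = []
result = list_append_truthy('Vijay', names)

print(result)           # ['Vijay']
print(names)            # []: the caller's list was never updated
print(result is names)  # False
```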
| 117 | |
| 118 | It feels like Python mixed up its metaphors. |
| 119 | |
| 120 | == Other References == |
| 121 | * [http://stackoverflow.com/questions/101268/hidden-features-of-python#113198 This answer] to a Stack Overflow question about [http://stackoverflow.com/questions/101268/hidden-features-of-python "Hidden Features of Python"] |
| 122 | |
| 123 | * There's even an [http://docs.quantifiedcode.com/python-code-patterns/correctness/mutable_default_value_as_argument.html anti-pattern] for this exact behavior. |
| 124 | |
| 125 | * This behavior seems to bite enough developers that it's part of [https://docs.python.org/3.5/faq/programming.html#why-are-default-values-shared-between-objects the Python programming FAQ]. |